Monday, April 4, 2011

Finding entities, characters, glyphs

My latest "mini-project", a few hours in, solves the annoying problem of trying to remember entity numbers or names. For example, our last project used the entity &raquo; (»), and it seemed like it took me two years to remember it. Now, with this tool, I can just type ">>" and the character, entity name and number appear. I tried to make it "mobile friendly", and may experiment with packaging it as an "app".

I'm intrigued by "mini-projects"... something I can put together in a few minutes or hours that provides value to someone. It's a long-time obsession: in the 90s when I got my first laptop, I tested myself to see what I could put together on my 22-minute BART ride from Oakland to 24th Street. It was fun, but I never created anything of general interest.

Now, these little tools (and hacks) can be quicker to build out and easier to share. In fact, there are thousands of "apps" out there, many of which look like they could be built in hours. Anyway, I have a few on my web site.

What makes things easier is that there is so much data out there to build upon. As you'd expect, the tool starts from the W3C entity list. But I supplemented this with interesting Unicode characters, and a nice set from Remy Sharp.
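To give a feel for the idea, here is a minimal sketch in Python. It is not the code behind the tool. It assumes Python's built-in HTML 4 entity table and Unicode name database as stand-ins for the data sources mentioned above, and the `ALIASES` table, `describe` and `search` are hypothetical names made up for this illustration.

```python
import unicodedata
from html.entities import codepoint2name

# Hypothetical hand-written aliases (an assumption for this sketch);
# they let a typed shortcut such as ">>" find the matching character.
ALIASES = {">>": "\u00bb", "<<": "\u00ab"}


def describe(ch):
    """Return the character with its named and numeric entity forms."""
    cp = ord(ch)
    name = codepoint2name.get(cp)  # None if HTML 4 has no named entity
    return {
        "char": ch,
        "named": f"&{name};" if name else None,
        "numeric": f"&#{cp};",
        "unicode_name": unicodedata.name(ch, ""),
    }


def search(query, limit=10):
    """Find characters by alias, entity name, or Unicode name substring."""
    results = []
    if query in ALIASES:
        results.append(describe(ALIASES[query]))
    q = query.lower()
    for cp, name in sorted(codepoint2name.items()):
        if len(results) >= limit:
            break
        ch = chr(cp)
        if q in name.lower() or q in unicodedata.name(ch, "").lower():
            entry = describe(ch)
            if entry not in results:
                results.append(entry)
    return results


if __name__ == "__main__":
    for hit in search(">>"):
        print(hit["char"], hit["named"], hit["numeric"], hit["unicode_name"])
```

Only characters that have an HTML 4 named entity are searched here, so this covers far less ground than a curated list with extra Unicode characters would.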

Check it out, bookmark it, and please send me feedback.

7 comments:

Victoryperfect said...

What an exciting experience!/Hilarious! Delightful! True!/wonderful stuff! thank you!
Scrum Process

Vegcar.net said...

Sounds cool Andy. I am fascinated by "mini-apps" too! I have a couple ideas I want to discuss with you - and to catch up! I'll call you.

Lowell Kirsh said...

I like it!

Claire C said...

Great tool, thanks so much. Just discovered and bookmarked. Fab idea!

Kim said...

This is a nice web application! I wonder why some characters aren't represented. For instance, I tried searching for "monkey" or "1f435" or "1F435" and nothing came up. For reference, see http://www.fileformat.info/info/unicode/char/1f435/index.htm

Is there a subset of unicode/html entities not in your project?

Thanks for the cool tool!

ndp said...

@Kim Thanks for the compliment. Yes, this is just a subset-- I somewhat unscientifically went through the code pages and chose ones that would be most interesting to explore. The goal was to make something fast and useful and engaging.

I'm in the process of putting together another version that has "all" the non-unihan characters, and I'll make that available as soon as it's solid. It works, but at 1.8M it is a bit too large.

That being said, as I use the complete set, I quickly see that many characters are missing from fonts, so it brings up more problems...

ndp said...

@Kim... finally responding to your comment with a new version. I've added many symbols, including the monkeys http://amp-what.com/#q=monkey

This is still a subset, but if you want to use a "full" set of 35K characters, find the link towards the bottom of the page that says "Click to replace curated list of characters..."