Wednesday, December 29, 2010

Color Factory: Color scheme generators

While working on a spin-off from Bedsider, I created Csster. Alex @ C5 encouraged me (and coached me a bit) on getting through the color math functions, and as far as I know, the functions in Csster are among the only Javascript implementations. I find them invaluable as I build out Javascript functionality, and I am working on separating them out from Csster itself.


Building my own visualizations, I've needed to bone-up on color theory. I thought I'd share some of what I've learned.

I've always been interested in color-- mixing paints, collecting "color" items as a child, and design class in college. I've delved in deeply in the past and have a good theoretical understanding of it. Most recently I caught up on the computational side, writing a Javascript library to help manage colors. To be honest, I still lack the practical experience necessary to have a great intuitive feel for colors, but I'm getting there.

Schemes & Tools
There are numerous software tools out on the web, and available for download, for creating color schemes. Most are general free-for-alls-- paste in some hex codes, please-- and offer no assistance beyond live preview, tagging, named schemes, or Google ads. The best of the bunch is probably the flashy Kuler.

The ones I like better (like Color Scheme Designer) bring to the forefront categories stemming from color theory and a color wheel: selecting a set of colors from a single hue (monochrome), from opposite sides of the wheel (complementary), or from nearby hues (analogic, or analogous). There are also triadic schemes (three evenly spaced hues) and a host of variations and combinations of these. It's this type of color theory that designers study, and it provides the necessary grounding to understand a design and the "scheme" that supports it.


As I poked around building more visualizations, I quickly found that I needed "color schemes"-- not so much color manipulation as coherent sets of colors. I started building some of my own tools, then stopped and stepped back to understand some of the theory behind color schemes. One of the easiest explanations comes from Cynthia Brewer here. Ms. Brewer explains a few different types of color schemes specific to data visualization: binary, qualitative, sequential, and diverging.
  • Sequential schemes are suited to ordered data that progress from low to high. Lightness steps dominate the look of these schemes, with light colors for low data values to dark colors for high data values.
  • Diverging schemes put equal emphasis on mid-range critical values and extremes at both ends of the data range. The critical class or break in the middle of the legend is emphasized with light colors, and low and high extremes are emphasized with dark colors that have contrasting hues.
  • Qualitative schemes do not imply magnitude differences between legend classes, and hues are used to create the primary visual differences between classes. Qualitative schemes are best suited to representing nominal or categorical data.
Using these schemes could bring sanity to lots of the charting libraries out there-- and it's a shame that so few of them support such techniques.

I'm building a "color factory" to provide functions for these. Give the functions one (or two) colors from your palette and get back a cohesive color scheme that works with them-- and, even better, is appropriate to the data. It's nascent technology, and I'm interested in feedback.
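The sequential case is easy to sketch in Javascript: hold the hue constant and step the lightness from light (low values) to dark (high values). This is a sketch of the idea only-- hslToHex and sequentialScheme are hypothetical names, not the color factory's actual API.

```javascript
// Sketch: generate a sequential color scheme by stepping lightness.
// hslToHex and sequentialScheme are illustrative names, not Csster's API.
function hslToHex(h, s, l) {
  // Standard HSL -> RGB conversion; h in [0,360), s and l in [0,1].
  var c = (1 - Math.abs(2 * l - 1)) * s;
  var x = c * (1 - Math.abs(((h / 60) % 2) - 1));
  var m = l - c / 2;
  var rgb = h < 60 ? [c, x, 0] : h < 120 ? [x, c, 0] :
            h < 180 ? [0, c, x] : h < 240 ? [0, x, c] :
            h < 300 ? [x, 0, c] : [c, 0, x];
  return '#' + rgb.map(function(v) {
    var hex = Math.round((v + m) * 255).toString(16);
    return hex.length < 2 ? '0' + hex : hex;
  }).join('');
}

// A sequential scheme: one hue, light colors for low data values
// down to dark colors for high data values.
function sequentialScheme(hue, steps) {
  var colors = [];
  for (var i = 0; i < steps; i++) {
    var lightness = 0.9 - 0.7 * (i / (steps - 1)); // 0.9 down to 0.2
    colors.push(hslToHex(hue, 0.6, lightness));
  }
  return colors;
}
```

For example, sequentialScheme(210, 5) yields five blues from pale to dark, suitable for mapping low-to-high data values.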

In the next post, I'll dig into the Javascript functions that facilitate this. (All this is part of a generic Javascript sketchbook.)

Thursday, December 23, 2010

jQuery support in Csster

I finally got around to adding a little jQuery plugin for my Csster tool, and released it as version 0.9.2.
$('.selector').csster({ width: 100 });
This looks a lot like the "css" method:
$('.selector').css({ width: 100 });
The difference is that Csster creates a "rule" and inserts it into the head, whereas jQuery will attach styles directly to the nodes. Sometimes you want one, and sometimes the other.

It's convenient to use it in the midst of jQuery work, such as:
$('.sidebar').wrapAll('<div>').addClass('ready').csster({backgroundColor: '#ffeedd'});
It also allows the "nesting" of regular Csster:
$('.hd').csster({ ul: { margin: 10, li: { margin: 0, padding: 5 }}});
Although it warrants its own post, I added a little note about how to implement clean browser-compatible patches. In the example, Csster supports the "opacity" property name in IE via a simple Csster plugin that runs only within the IE environment and applies the patch. Much nicer than subtler raw CSS solutions... more to come.
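The gist of such a patch-- sketched here as a plain function, since Csster's actual plugin hook may look different-- is rewriting each rule's declarations, replacing "opacity" with IE's proprietary filter equivalent:

```javascript
// Sketch of an IE opacity patch: given a map of CSS property declarations,
// replace "opacity" with IE's proprietary filter property. Csster's real
// plugin API may differ; this shows only the transformation itself.
function ieOpacityPatch(declarations) {
  var patched = {};
  for (var prop in declarations) {
    if (prop === 'opacity') {
      // IE6-8 express opacity 0..1 as filter: alpha(opacity=0..100)
      patched.filter = 'alpha(opacity=' + Math.round(declarations[prop] * 100) + ')';
    } else {
      patched[prop] = declarations[prop];
    }
  }
  return patched;
}
```

Registered to run only when the browser is IE, { opacity: 0.5 } becomes { filter: 'alpha(opacity=50)' }, and other browsers keep the standard property.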

Check it out: http://github.com/ndp/csster

Friday, November 26, 2010

Learning Git?

I just revised my previous visualization about git with an eye towards better visual design and usability.

Here's a little history: As I dove into learning git, I was initially confused about where my code was. I felt pretty confident that git hadn't lost anything, but less confident that I could get it back readily. Sure, it's distributed, so I expect my code to be in more places. But there was also this "index" and this "stash"-- how do those relate? It's a little complex coming from Subversion or CVS.

Once I figured out the basic locations where things could be, understanding the commands was a second challenge. The commands tend to work on one or two targets, moving code from one to the other. But they aren't named in any obvious way, except for the "stash" commands. To make sense of these, I mapped them onto the locations. In the visualization, just click on "remote repository" to see all the commands that affect it.

Out of these two frustrations comes my visualization.

Friday, November 19, 2010

Intermittent Selenium Failures

Selenium testing is always a little flakey, but I've* found a good treatment for this on my last two projects. It's pretty simple, really:

If you are using external Javascript services, turn them off.

This includes Google Analytics, Kiss Metrics, Share This, etc. The number of these services has exploded in the last couple years, and it's hard to build a site that doesn't use at least a couple. These tools do what they can to not interfere, but in the fast-paced world of Selenium, they don't always survive. Just remove them for these tests and you'll see marked improvement.
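One simple way to do this is to guard the third-party snippets behind a flag that your test environment sets. This is a sketch only: window.SELENIUM_TEST is a made-up convention your Selenium setup would need to define, not a standard flag.

```javascript
// Sketch: skip third-party trackers (GA, Kiss Metrics, Share This, ...)
// when a test flag is set. window.SELENIUM_TEST is a hypothetical flag
// your Selenium test setup would define before the page loads.
function shouldLoadExternalServices(win) {
  return !win.SELENIUM_TEST;
}

function loadExternalScript(win, doc, src) {
  if (!shouldLoadExternalServices(win)) return false; // skip in tests
  var script = doc.createElement('script');
  script.src = src;
  script.async = true;
  doc.getElementsByTagName('head')[0].appendChild(script);
  return true;
}
```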

Actually, that reminds me of a good talk I heard the other night, by Marcus Westin and Martin Hunt of Meebo, about developing the "Meebo Bar". They figured out some really cool tricks to load asynchronously and not interfere with the host website-- but even better, they support security contexts client side (which is pretty nifty if you think about it). I actually think you could build a pretty clever SSO (single sign-on) solution using these patterns, but I haven't tackled that one yet. Check out the slides and presentation. A must-read if you're developing your own widget.

* Actually, credit where credit's due: it was Justin and Jonah (different companies, different projects, not brothers) who identified this problem, not me.

Saturday, September 25, 2010

Introducing Csster

So I'm a bit of a CSS nerd. For years, I've been complaining that there aren't enough "engineer" cycles given to CSS. I've written endless blog posts about how to organize your CSS. I get blank stares when I ask interview candidates, "How do you structure your CSS?" Well, now we can write CSS in Javascript with Csster, and maybe-- just maybe-- the world has been set right.

Sunday, September 19, 2010

Most resumes are pretty boring: text descriptions of each job, reverse chronological. Every once in a while you find a programmer who "gets creative", rendering their whole resume in C++ syntax or a mindmap. (We are hiring and got one of the creative ones yesterday-- well, I'm not sure what it was, but it managed to communicate I'd never want that person working on a user interface.)

I started to wonder what visualizations would be useful to a reader of a resume.

Monday, April 19, 2010

jQuery Conf SF

After traveling around the Bay Area talking about Javascript unit testing, I scored a shot at the San Francisco jQuery Conference. I'll be there Sunday afternoon, talking about "Organizing Your Code with Testable jQuery Plugins". Stop by and say "Hi".

Saturday, March 6, 2010

Rails Fixtures with Integrity & Validity

A new developer on the project changed the symbolic name of one fixture record and broke a whole bunch of tests in unexpected ways. Pairing, we discovered some interesting stuff.

First, if you've never dug into them, it's critical to understand how symbolically named fixtures work. We rely on them heavily, but only yesterday read the code. If you have a fixture like:
bill:
  full_name: ...

And another fixture:
socks:
  owner: bill

Rails magically inserts bill's ID into socks' record. (Before this feature, developers had to manage their IDs manually, and keeping fixtures working well was less fun.)


Nifty. I assumed (incorrectly) that there was some sort of lookup of records involved. So if I change the name of the "bill" fixture to something else-- let's say "william"-- I expect Rails to complain. It doesn't. There's no data integrity to the fixture system-- at all.

We traced through the code and now understand why: when Rails comes across the symbol "bill", it creates an integer hash of it and sticks it into the id column. It'll be a big number like 39384022 or something. Well, if you change the name to "william", it generates a different hash. That's it. At no time does it go back to verify that such a hash exists. It's really just a nice name for a number! Unless your database has constraints that enforce this (which will make loading fixtures more difficult and generally isn't done), you won't see a problem until a test fails.
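To illustrate the idea in Javascript: the id is a pure function of the label, with no lookup anywhere. (Rails' actual implementation hashes the label with Zlib.crc32; this simple hash is just to show the mechanism.)

```javascript
// Illustration of how fixture labels become ids: a deterministic hash of
// the label string, not a lookup. Rails really uses Zlib.crc32 on the
// label, bounded to fit in an integer column; this hash is illustrative.
function fixtureIdentify(label) {
  var hash = 0;
  for (var i = 0; i < label.length; i++) {
    hash = ((hash * 31) + label.charCodeAt(i)) % 1073741823; // 2^30 - 1
  }
  return hash;
}
```

fixtureIdentify('bill') and fixtureIdentify('william') produce different numbers, and nothing ever checks that a row with the referenced id actually exists-- which is exactly why renaming a fixture breaks its references silently.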


Once we discovered that, we asked, "wouldn't it be nice if there were a test that checked the integrity of the fixtures?" With a little work, we had just such a test, which relies on the validation system:
describe "fixture integrity" do
  ActiveRecord::Base.send(:subclasses).each do |cls|
    it "each fixture for class #{cls} should be valid" do
      cls.find_each do |record|
        record.valid?.should(equal(true),"Invalid fixture for #{cls}:\n  #{record.errors.full_messages.join("\n")}\n#{record.inspect}")
      end
    end
  end
end

Unfortunately, that test passed until we added the appropriate validator:
class Child < ActiveRecord::Base
  validates_presence_of :parent
  ...

(Make sure you validate "parent", not "parent_id". "parent_id" will be set-- it just won't point to anything.)

That's it. A simple test and you won't risk your fixtures floating too far from the data structure you expect. Thanks to Lowell Kirsh and Jonah for pairing on this.

Saturday, February 6, 2010

Implement Most Popular the Easy Way (hint: use Google Analytics, garb and Rails)

Over the last few years I've implemented "most popular" posts, questions, lists, companies, users, pages, searches, cities, and who knows what else. It's not difficult. I've always implemented it myself-- using a few columns in an SQL database-- but something didn't smell right. We already have this free tool-- Google Analytics (GA)-- collecting usage data on the site. Why would I want to store this data redundantly?

In this post, I'll walk you through what we did on my most recent project for the National Campaign. There are three steps: collecting the raw data, processing it into statistics, and displaying it to the user.

Using GA you can completely outsource the data collection. For the statistical analysis, you gain flexibility-- more on that later. Finally, I won't talk at all about displaying the results to users-- that's up to you.

Step 1: Collecting the Data

If you are already using GA, you're done-- you're collecting data. If not, you simply need to create an account and start using it.

Fortunately, with RESTful conventions, most user actions end up being "page views" of some sort. But there might be other steps you want to take. For example, we had pages that served up content via Ajax, and I hadn't bothered to instrument them with GA yet. I added one line of Javascript to the Ajax callback:  pageTracker._trackPageview(questionLink);  And it can get more complicated: if your definition of popularity involves something beyond your pages, you'll have to dive into GA's event tracking or custom variables (which I haven't done).
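In context, that one line sits inside the Ajax success callback. Here's a sketch-- the questionLink variable and the surrounding jQuery call are illustrative, while pageTracker._trackPageview is the legacy GA tracker API-- with a guard so pages without GA (or tests) don't blow up:

```javascript
// Sketch: record an Ajax-loaded view as a GA page view.
// The "path" argument is the virtual page path to record.
function trackAjaxView(tracker, path) {
  // Guard so pages without the GA tracker loaded don't throw.
  if (tracker && typeof tracker._trackPageview === 'function') {
    tracker._trackPageview(path);
    return true;
  }
  return false;
}

// e.g. inside the Ajax success callback (illustrative):
// $.get(questionLink, function(html) {
//   $('#answer').html(html);
//   trackAjaxView(window.pageTracker, questionLink);
// });
```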

It's worth pointing out that if you don't use GA on a project, you need to figure out what data to collect and how to store it. This involves the business owners expressing their requirements, and the developers debating which database table to use and how general a solution to build. You can imagine this can be a small sink-hole if you're not careful.


Step 2: Processing the Statistics

The hard problem to solve here is to create a function that calls out to GA, collects the data, and saves it into your database.

Understanding the API
Although you really don't need a deep understanding of the API, you should at least skim it. The salient points are to understand the differences between accounts, profiles and sites. Run off and read it now.

Install the Rails Gem
You'll need to get "garb"-- not as in trash, but as in Google Analytics for Ruby. Install it:
  gem install garb
or
  config.gem 'garb' # in environment.rb
  rake gems:install

Profile Access

  require 'garb'
  def self.create_profile(acct, username, password)
    Garb::Session.login(username, password)
    Garb::Profile.first(acct)
  end

Retrieving the Data

To retrieve the data, you generate a report. For this, you'll need:
  • dimensions: I just asked for page_path. If you ask for just one dimension, you can think of it as the row headers in the table you get back.
  • metrics: I just asked for pageviews-- the values that go in the row cells returned. You can also ask for bounces, visits, entrances, exits, etc. You can envision a more complex definition of popularity that starts with pageviews but subtracts off bounces and exits (or any of the fields).
  • sort: one of the metrics
  • filter: since I was only looking for the question's show action, I filtered out all other pages.
The final code for us looks like:
  def self.report_results(profile)
    report = Garb::Report.new(profile)
    report.metrics :pageviews
    report.dimensions :page_path
    report.sort :pageviews
    report.filters do
      contains(:page_path, 'questions')
      does_not_contain(:page_path, 'edit')
      does_not_contain(:page_path, 'new')
      does_not_match(:page_path, '/questions')
      does_not_match(:page_path, '/questions/')
    end
    report.results
  end
You can also add date ranges on this... by default it follows the GA conventions of returning the last month's worth, which is what we wanted.

This function returns an array of Structs, with two properties in each struct: pageviews and page_path.

If you ask for a long report, you'll need to page the results. Refer to the garb documentation to see how to do this.

Saving/Updating the data

  def self.update_page_views(report_results)
    report_results.each do |row|
      if /\/questions\/(\d+)/.match(row.page_path)
        q = Question.find_by_id(Regexp.last_match(1).to_i)
        q.update_attribute(:pageviews,row.pageviews.to_i) unless q.nil?
      end
    end
  end

This logic can be tested using a simple MockRow struct:

class MockRow < Struct.new(:pageviews, :page_path);end
... 
Question.update_page_views [MockRow.new('50',"/questions/333")]

Getting this scheduled and run with the correct credentials is the last piece of the puzzle, which I'm not covering here; see cron, DelayedJob and your host.

Benefits

As you can see, there's nothing that tricky about this. A pair of us had this up and going in a couple hours (although we did spend time getting the DelayedJob running). There are a few important benefits:
  • It's a cleaner architecture with less server load than doing it yourself. You don't need to pollute fundamentally read-only operations with database writes.
  • Better metrics. GA can give you more sophisticated metrics than you could easily compute yourself. For example, it's easy to collect raw page views, but collecting unique page views or sessions is a bit more work.
  • You can change which metrics you use on an ad-hoc basis. For example, you can decide to count only posts from the last week in "most popular", and it's a simple code change.
  • More interesting, you can eliminate, or at least reduce, developer and tester involvement in metrics discussions. You won't have to be there to answer "can we make the most popular pages the ones that people spend the most time on?" If it's in GA, you can use it. If the product owner understands GA, she can figure out how to define "most popular" to produce the results she wants.
  • If you have already been running your site for a while, but are adding most popular support, you may already have a rich set of data on hand. This wouldn't be possible if you rolled your own.
I think this will be a core part of any new site I develop. It's so convenient to have access to this rich data set without any of the burden of collecting it. I hope it works out as well for you.

-- Andy


Other Versions
There are WordPress (Drupal, etc.) plugins that do the same thing, but nothing that would work directly for us. For example, http://www.myoutsourcedbrain.com/2009/11/blogger-most-popular-posts-widget.html

Tuesday, January 26, 2010

Assert Changes and Fixture Test Helpers

About a year ago I posted some test helpers for checking pre- and post-conditions during a test. I called them "assert_changes" and "assert_no_changes".  They took a ruby expression to evaluate, a block, and did what you expected:
    o.answer = 'yes'
    assert_changes 'o.answer' => ['yes','no'] do
      o.answer = 'no'
    end

Since then, I have discovered similar functionality in shoulda and rspec. But if you're using Test::Unit and you don't want to take the leap to shoulda (which is painless), I packaged my test helpers into a gem that's easy to install. Plus, mine are a little easier to read, if you don't mind evaling strings within your tests.

It's called "ayudante". To install,
    gem install ndp-ayudante --source=http://gems.github.com

Also included in the gem are "fixture helpers". These make it easy to test for sets and lists of your fixture model objects, without the hassle of building up your own lists and sets.

On goBalto, Ingar had created something like it for testing Sphinx search results. We had sets of fixtures, and we wanted to make sure a search returned certain results-- sometimes a containment check, sometimes an exact match. Sometimes we didn't care about the order, so we wanted set comparisons. Or we didn't care about extra items.

Here's how it works. If you have a model object and fixtures for CandyBars, it adds test helper methods for assert_list_of_candy_bars (using #method_missing). Like all xUnit asserts, you pass the expected values (a list of symbols identifying the fixture objects), and the value you are testing.

    result = CandyBar.find(:all, ... )
    assert_list_of_candy_bars [:mars, :eminem], result

It also supports assert_set_of_candy_bars, so you can ignore the order of comparisons, and assert_contains_candy_bars, so you can make sure the results contain a subset. Enjoy!

Recipe for 5 Whys with an XP Software Team

5 Whys is a great way to get at the root of quality problems. On my last three projects, when I felt like code quality was dropping, I ran a "5 Whys" session. I have found it adds variety, solves a very specific problem, and plugs right in as an alternative to an agile reflection.

It's not in every agile software team's bag of tricks. Asking around our fairly savvy office, I discovered it's far from universal. In the "State of Agile" report from Version One, which includes survey results from 2,500 software developers, it isn't mentioned. Since I haven't seen it show up much in other agile writings, I thought I'd share my experiences here.

Saturday, January 9, 2010

Pairing with Designers

I've worked on software with designers for 15 years, ever since software had a visual design. Usually this involves being "handed off" designs, or providing "feedback", via email. Only occasionally have I worked side-by-side to solve visual design and interaction problems.

Reflecting back, this seems sad, since working together has all the advantages of pair programming-- it's fun and educational,  often much faster, and you can produce a superior result. 

There are many blog posts about the merits of pair programming, but none about pairing a programmer with a designer. Since Carbon Five values pairing and collaboration so heavily, I've been trying to do it on all my projects. On my most recent project, for the National Campaign, I've had the pleasure of "pairing" with several designers.

The first day of the project was refreshing-- I sat side by side with Jef and we broke apart his Illustrator designs, reassembled them in HTML, and fluidly passed ideas and png files back and forth. He was standing next to me, we were sharing a dropbox, and it was very exciting.

Five months in, a recent experience left Suzanna (our current designer) and me in awe of the merits of working closely together.

The task was a quick page redesign-- we have 17 detail pages utilizing the same basic template, each about a different contraception method (it's a site about birth control). From early user testing, we'd learned what wasn't working, and had reached consensus on the set of problems we should try to fix. Suzanna synthesized the feedback and produced a mock-up-- a single PNG screen mock-up representing all 17 pages. Pretty traditional so far. Among other things, the new design called for altering the main photo from its prominent placement in a rectangle taking up half the page, to a smaller photo that is elegantly wrapped by the initial paragraphs of text.

Traditionally, this doesn't go well: Some of the methods are quite small (the size of a match stick), whereas others, like the female condom, take up 6 inches or so (depending on how you measure). They have different shapes and visual weights, so it's going to be hard to get one design that works for all of them.

Suzanna specified a reasonable size for the image, taking this into account as much as possible. She developed a compromise based on lots of factors, including a guess about how a browser might wrap text. Traditionally, the developer would follow the spec as closely as possible, and either wouldn't notice or wouldn't mention visual problems.

Often the process stops there. You have a few pages that look good, and the rest various degrees of bad. Everybody argues that they did their job, and they did.

So the team tries to fix it. The designer might spec different dimensions to see if something else will work-- but this is simply another big compromise. Or the designer will embark on the tedious task of creating a mockup for each page-- making them beautiful with Photoshop's font rendering and text wrapping (which the developer can't really duplicate). In the best case, a compromise will be reached that looks just okay for all the pages.

Things went differently, because the programmer and designer were sitting next to each other:

We are working with a 960 grid, and Suzanna made everything fit in our grid. As I started the implementation, she measured and confirmed. But fitting in a grid for a designer can be different than fitting in a grid for a developer.

I probed, "What about that image-- it's so tall and narrow. You want that to stick up above?"

"Naw... Well, maybe. Why don't we just build the images with 20px "cushion" that the image can bleed into, in case they need to extend?"

"Sure," I said.  I'll use negative margins to break out of the grid (a little):
#photo {
  float: left;
  height: 180px; width: 180px;
  margin: 0 0 -5px -20px;
}
Even if we'd stopped there, our implementation would be an improvement over a traditional outcome without the negative margin bleed. Suzanna could now have shadows extend into the margins, which will be a nice visual effect.

But that's not the end of the story. I walked back to my desk and implemented the page while Suzanna cranked out the images.

She interrupted me five minutes later, "This one's taller, is that okay?"
I was in the midst of placing the image and realized I could add one CSS rule that would handle this. "Sure!"
#photo.the_pill { height: 241px; margin-bottom: -20px; }
Ten minutes later Suzanna came over to my desk, "I finished the images and they're in your drop box. I had to make a couple other ones different sizes."

She'd named them consistently with the old names, so I just copied the files over. We compared my implementation with Suzanna's PNG mockup. Close, but not close enough. Suzanna sat down next to me and we pushed a few pixels around, getting the margins right with the browser's font metrics-- the line heights were slightly different, so she needed an adjustment. A quick check on Windows as well, and it was perfect.

Then we looked at the odd-size images she'd discovered, and we tweaked those sizes and margins too. No problem-- we just flipped back and forth between browser and CSS files, making everything perfect. (She didn't actually need to make any measurements, as we just used our eyes.)

While we were at it, we ran through all 17 pages, just as a quick check. The set of non-exceptional pages should have been fine, but Suzanna wanted more adjustments. Each photo is unique. Since it was so easy, why not? In a matter of minutes, half the pages had custom pixel adjustments, and we were quite happy with all of them.
#photo.diaphragm { height: 195px; margin-bottom: -20px; }
#photo.withdrawal { height: 201px; margin-bottom: -40px;}
#photo.depo_provera { height: 189px; margin-bottom: -20px; }
#photo.iud { margin-top: 5px; height: 226px; margin-bottom: -60px; }
#photo.the_ring { margin-bottom: -20px; }
#photo.cervical_cap { width: 195px; height: 192px; margin-right: -15px; margin-bottom: -20px; }
This probably took less than an hour. This isn't very complicated, but it's hard to imagine how tedious this would have been if we hadn't paired to reach the final result. Plus, each page really looked great-- the image tucked in "just right."

I've certainly collaborated with designers before, poking around in Firebug until things looked good. But this was the first time I saw how clearly working closely produced a superior result. And we did it with no tedious measuring of text or dozens of time-consuming mockups.

What could have taken hours or days took minutes. We just sat down together, and, well, paired.