Thursday, May 5, 2011

Do I really have to write tests?

Over at the Lean Startup Circle mailing list, they're discussing what kind of tests to expect from developers. I enjoy this conversation. People are often looking for a clear guideline ("startups don't need tests") or a "code coverage" figure-- or already have one in mind. Idealistic agilists insist you always write tests. They equate not writing tests with abandoning quality-- the beginning of the end. "Such carelessness will lead to bugs upon bugs and eventually, unmaintainable code." Reality requires the more nuanced and pragmatic approach suggested below.

Getting Real

I learned in Q.A. school (or was it the Q.A. streets?) that you can't test everything. Be pragmatic, because software is just too complex to allow coverage of all cases. Good test design requires thoughtful prioritization-- minimal cost yielding maximum test coverage. Q.A. engineers learn the requirements and the relative importance of each, and let this understanding guide how they design a test suite. They learn to carefully consider the quality requirements, where the critical parts of the software are, where potential problems might appear, and what the costs of failure are. In this cost-benefit and risk analysis, there isn't just one simple answer.

These are the main factors to consider when judging how much testing is needed, with an eye towards developer-written tests:

Cost of bugs. Different software has different quality requirements. One of my projects would cost thousands of dollars per hour if it crashed-- and cause freeway backups, and bring the L.A. news helicopters, etc. Another one let you comment on your latest book purchase. Some applications really are heart-lung machines.

Anticipated life of the code. Some code is for prototypes or other "throwaway" purposes. But veterans bristle at the thought, and will recount how quick hacks became the core of mission-critical systems, and how that code just doesn't go away. Seconding their motion, TDD disciples insist on writing tests for everything. There certainly is risk in creating code too casually, but prototyping is a powerful tool. If you can figure out how to build out your ideas quickly and explore them with your users, you'll be able to out-pace competitors.

Area of the application. Not all code of an app is the same. At a concert, there are soloists and choir members. The audience will be much more critical of the soloists. Singing coaches don't spend as much time with each choir member as they do with soloists. Therefore, identify the most important part of your app.

I'm a big advocate of building from the inside out: identify these critical parts of the system. Validate the model with everyone, and test the heck out of them. Make it unbreakable. Then, using these as building blocks, put together the pieces in various ways. Allow yourself to test different combinations with the knowledge that the blocks won't break from underneath you. Many a team has treated speculative admin interfaces with the same vigilance as the core data model. It's quick to build prototypes with solid building blocks.

Likelihood of bugs. Some areas of an app are like compost piles, and attract bugs.  Good developers should be able to give you an honest appraisal. They should be able to identify difficult problems, as well as error-prone approaches to problems. Some code just evolves to become gnarly to work with. Adjust testing needs appropriately. 

Test-driven design requirements. TDD is an excellent tool for figuring out how the code should be written and structured. I find that using TDD to inform the design saves significant development effort. Don't sacrifice testing when it will benefit the code. If it's faster to write a prototype with TDD, I'll use it.

Team size. A lone developer focusing on one project will have less of a need for tests than a larger team all sharing a codebase. Yes, with luck, the project will grow into something larger, but there's also an important "just-in-time" theme to agile. This is not an excuse for skipping testing, but is a factor to consider.

Other quality checks.  A team auto-deploying to the cloud needs more checks-- and tests-- than a team with a dedicated Q.A. team watching out for them.

There are other factors, but those are the important ones. Let's resist talking in absolutes, and do what is best for our project, customers and users.

Tuesday, May 3, 2011

Interesting Character Entities

I spent another couple hours on my character entity finder, and wanted to share some of the interesting things I've discovered. I made two main improvements:
(1) Lookup happens asynchronously, in small batches, so the page suffers less from "locking up". This allows much more flexible exploration.
(2) Queries can be bookmarked, so you can share interesting discoveries.
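
The batching in (1) can be sketched roughly as follows. This is not the tool's actual code: `searchInBatches`, its parameters and the default batch size are hypothetical, and the scheduler is passed in so the sketch can be exercised without a browser.

```javascript
// Hypothetical sketch: scan a large list in small batches, yielding
// between batches so the page stays responsive.
function searchInBatches(items, matches, onMatch, onDone, batchSize, schedule) {
  batchSize = batchSize || 100; // assumed default, not from the original tool
  // Default scheduler: hand control back to the browser before continuing.
  schedule = schedule || function (fn) { setTimeout(fn, 0); };
  var index = 0;

  function runBatch() {
    var end = Math.min(index + batchSize, items.length);
    for (; index < end; index++) {
      if (matches(items[index])) onMatch(items[index]);
    }
    if (index < items.length) {
      schedule(runBatch); // more work left: queue the next batch
    } else {
      onDone();
    }
  }

  runBatch();
}
```

The first batch runs immediately, so the earliest results can appear right away while the rest of the list is still being scanned.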

To recap, I built this tool to find all the weird quotes:

http://amp-what.com/#q='
http://amp-what.com/#q=quot

Wow, I discovered there's a nice set of chess pieces:

http://amp-what.com/#q=chess

Just a minute ago, I added the ability to bypass the query builder-- if you enter a full regular expression. The regular expression is matched against entity names (and numbers, and nicknames). For example, there are all sorts of icons available:

http://amp-what.com/#q=/97[2-6]\d/

The currency symbols aren't easily found with a single query, but you can build a page with a selection of them:

http://amp-what.com/#q=/currency|euro|dollar|pound/
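
The regular-expression bypass could work along these lines. This is a sketch under assumptions, not the tool's source: `queryToRegExp`, `entityMatches` and the record shape (`name`, `number`, `nicknames`) are hypothetical, and the real query builder for plain text is likely smarter than the literal match used here.

```javascript
// Hypothetical sketch: input wrapped in slashes is used as a raw regular
// expression; anything else is matched as literal text.
function queryToRegExp(query) {
  var raw = /^\/(.+)\/$/.exec(query);
  if (raw) return new RegExp(raw[1], 'i');
  // Escape regex metacharacters so plain text matches literally.
  var escaped = query.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp(escaped, 'i');
}

// An entity matches if its name, number or any nickname matches.
function entityMatches(entity, regexp) {
  var fields = [entity.name, String(entity.number)].concat(entity.nicknames || []);
  for (var i = 0; i < fields.length; i++) {
    if (regexp.test(fields[i])) return true;
  }
  return false;
}
```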

It's surprisingly fun to play around with. Give it a spin.

Monday, April 4, 2011

Finding entities, characters, glyphs

My latest "mini-project", a few hours in, solves the annoying problem of trying to remember entity numbers or names. For example, our last project used the entity &raquo; (»), and it seemed like it took me two years to remember it. Now, with this tool, I can just type ">>" and the character, symbol and number appear. I tried to make it "mobile friendly", and may experiment with packaging it as an "app".

I'm intrigued by "mini-projects"... something I can put together in a few minutes or hours that provides value to someone. It's a long-time obsession: in the 90s when I got my first laptop, I tested myself to see what I could put together on my 22 minute BART ride from Oakland to 24th Street. It was fun, but I never created anything of general interest.

Now, these little tools (and hacks) can be quicker to build out and easier to share. In fact, there are thousands of "apps" out there, many of which look like they could be done in hours. Anyway, I have a few on my web site.

What makes things easier is that there is so much data out there to build upon. As you'd expect, the tool starts from the W3C entity list. But I supplemented this with interesting Unicode characters, and a nice set from Remy Sharp.

Check it out, bookmark it, and please send me feedback.

Monday, March 28, 2011

Friendlier Session Timeouts 2.0

As we discovered today, coding up friendly session time-outs involves more than meets the eye. As you know, a session time-out logs the user out after a period of inactivity. But interactions with web sites, and "inactivity," have changed over the last 10-15 years.

We have a fairly plain Rails app, and we're implementing a series of security fixes in anticipation of an audit. In what seemed like a simple "1-pointer", we were asked to remove the "Remember Me" checkbox from our application, and force the user to be logged out after 20 minutes.

For the first pass, we simply set the session timeout to 20 minutes and removed the checkbox (and the underlying implementation). Easy eh?

Sadly, this solution leaves much to be desired. When the user (or Q.A. engineer) returns after half an hour, the page is unchanged and looks ready for action. But any click redirects, confuses, and potentially loses in-progress work.

Although this might have been acceptable in 1998, it's 2011 and we've got Ajax. Clicking a link on a Web 1.0 site takes you to another page; if you happen to be logged out, oh well, sign back in and continue-- you probably weren't doing something that important anyhow.

But these days, on an Ajax page, if your session expires while you're viewing the page, clicking on any element of the page can reveal a session timeout. This may appear to the user as a server error or, if it's handled correctly (as we did), a page redirect. Clicking on a disclosure triangle redirects to a new page? Now we have a surprised user.


I Googled a bit and found lots of bug reports around this behavior. As this Jira snapshot shows, resolving this might be trickier than anticipated.

I checked out my bank's solution. It looked like they implemented a whole timeout scheme in Javascript. Were they wacko? (No... but we'll get there.)

Of the handful of ideas out there on the web, here's one solution built on the idea of timing the session in Javascript:

<script type="text/javascript">

function confirmLogoff() {
  if (confirm("Your session will end in one minute.\n\nPress OK to continue for another ten minutes.")) {
    location.reload();
  }
}
// Fire one minute before the server-side timeout (sessionTimeout is in milliseconds).
setTimeout(confirmLogoff, <%= sessionTimeout - 1000*60 %>);
</script>


At first blush this looks like they might be on to something. Alas, no. The confirmation will only work during the brief period between the client and the server timeout. I'd argue it's even worse than no solution, since the message promises something it can't deliver on, and loses the user's work-in-progress.

Did I mention we have quite a few pages with AJAX requests on them? These conveniently extend the session timeout on the server when the user interacts with them. But they break any sort of client-side timer that is set up at page load. Any sort of "meta refresh" schemes doing something similar to the above were quickly dismissed.

And why should just AJAX-backed behaviors restart this timer? Opening a hidden panel may or may not go to the server (depending on a developer's whim), but should this whim really affect the user's timeout? So we started considering implementations that ping the server as the user interacts. This was also dismissed as being complicated to implement efficiently (and potentially introducing some sort of security issue).

Finally where we "settled" (as in prom date), is implementing a timeout within Javascript, like my bank. It's a little more sophisticated: it's reset not only by the initial page load, but all sorts of user interactions. The code finally reduced down to:

var clientSessionTimeout = function(timeoutMS, logoutFn) {
  var lastTimeout;

  // (Re)start the countdown, cancelling any countdown already running.
  var startSessionTimeout = function() {
    if (lastTimeout) clearTimeout(lastTimeout);
    lastTimeout = setTimeout(function() {
      logoutFn();
    }, timeoutMS);
  };

  // Watch for activity
  $('body').click(startSessionTimeout).keydown(startSessionTimeout);
  startSessionTimeout();
};


This is called with the 20-minute timeout, and implements it beautifully:

    clientSessionTimeout(20 * 60 * 1000, function() {
      // Encode the current URL so its own query string survives as a parameter.
      document.location = '/timeout?return_to=' + encodeURIComponent(document.location.href);
    });


The server side timeout is for redundancy. Our pages are all pretty focused, so I doubt any user will spend more than a couple minutes on any one of them. We picked a server-side session timeout of 40 minutes. The only way the Javascript timeout won't kick in first is if the user interacts with the page for more than this time with no server side interaction... possible, but not likely.

After completing this compromise solution, I'm ready to spell out some ideal requirements:
* session timeout with no interaction after 20 minutes
* any interaction on the page should reset the timeout
* warn the user (if possible) when the deadline approaches
* this shouldn't open additional security vulnerabilities or server traffic

With some additional work, I'm sure an "ideal" solution can be developed. This compromise should get us most of the way there. Thanks to the rest of my team, and an interview candidate who provided some clear thinking on the matter.

Friday, March 18, 2011

Curing Frequent Selenium File Upload Failures

The symptom was quite simple: do an upload, and on the next request the server reports an "IOError". As our Ruby on Rails app is pretty much a thin workflow around lots of file uploads, this was a problem. We tended not to see it in production, but we frequent users were seeing it enough to know we had to do something about it.

But the real complainer was Selenium. About half the time the tests failed and needed to be coaxed into running again.

JWinky traced the root cause down to a known bug in the temp file class. With a little work (and encouragement by yours truly), he put together a patch that has eliminated the problem. We've been running with it for a couple months and haven't seen the bug once-- or heard a peep from Selenium.

It's found here: http://github.com/jwinky/ruby_tempfile_ioerror

Tuesday, March 1, 2011

Updated Cheat Sheets

Five years ago, I was working at Great Schools, and got interested in SEO. I started running all sorts of experiments on my own site, to understand how I could affect things. I reorganized the URLs, added keywords, and followed all the standard recommendations. I quickly realized that little tweaks to URLs and meta tags, and optimizing keyword density, weren't going to help much. These types of changes really are "optimizations"-- they'll give you a small percentage increase, but they are not game changers. If you've got millions of visitors, a 1% gain may mean real money, but if you're me, it doesn't matter.

So, after working on SEO, I pursued another idea. Why not create something of real value to drive people to my site? I had an idea and created some "cheat sheets" to help me with my own development. I created them, and then posted them where I could to get some inbound links. Shortly thereafter, someone at O'Reilly found my page and linked to it, and all of a sudden I was getting hundreds of page views per day. So that was my lesson: If I provide something of value, people will come. That was five years ago. Even though technology changes fast, I still have a bit of tail from those original cheat sheets.

Last night, I decided that since 80% of the people hitting my site are seeing those pages, I should take a look at them and see what impression they might be making. I really don't have a "goal" of driving traffic anywhere else, but I might as well make them look as good as I can. So I cleaned up the visual design and fixed some of the editing.

Check out the spruced up pages here: Hibernate Mapping and JSPx Cheatsheet.

Friday, January 21, 2011

Color Scheming with Javascript

In this post, I'll share some of my Javascript code for manipulating colors.

I started a few months back with basic color manipulation routines. Other libraries take a strictly object-oriented approach. This can be a little heavyweight, as it requires explicit conversions throughout the calling code. But in the HTML DOM, colors are generally expressed as hex strings, and routines built around these would be simpler to use. Plus, Javascript is a dynamic language, so a String can have color manipulation methods. That's exactly what I did:
  • '#fff'.toHexColor() => '#ffffff'
  • 'black'.toHexColor() => '#000000'
  • '#123456'.toHexColor() => '#123456' (no op)
  • colorString.toRGB() => array of three numbers, each in [0..255]
  • colorString.toHSL() => array of numbers [[0..360], [0..100], [0..100]]
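
For readers who want to see the shape of these conversions, here is a minimal sketch. It is not my library's code: the named-color table is a tiny illustrative sample, the rounding is an assumption, and extending `String.prototype` can clash with other scripts on the page.

```javascript
// Hypothetical sketch of the string conversions listed above.
var NAMED_COLORS = { black: '#000000', white: '#ffffff', red: '#ff0000' }; // sample only

String.prototype.toHexColor = function () {
  var s = String(this).toLowerCase();
  if (NAMED_COLORS.hasOwnProperty(s)) return NAMED_COLORS[s];
  var m = /^#([0-9a-f])([0-9a-f])([0-9a-f])$/.exec(s);
  if (m) return '#' + m[1] + m[1] + m[2] + m[2] + m[3] + m[3]; // expand #rgb
  if (/^#[0-9a-f]{6}$/.test(s)) return s; // already #rrggbb: no op
  throw new Error('Unrecognized color: ' + s);
};

String.prototype.toRGB = function () {
  var hex = this.toHexColor();
  return [1, 3, 5].map(function (i) { return parseInt(hex.slice(i, i + 2), 16); });
};

String.prototype.toHSL = function () {
  var rgb = this.toRGB();
  var r = rgb[0] / 255, g = rgb[1] / 255, b = rgb[2] / 255;
  var max = Math.max(r, g, b), min = Math.min(r, g, b);
  var l = (max + min) / 2, d = max - min, h = 0, s = 0;
  if (d !== 0) { // d === 0 means gray: hue and saturation stay 0
    s = d / (1 - Math.abs(2 * l - 1));
    if (max === r) h = ((g - b) / d) % 6;
    else if (max === g) h = (b - r) / d + 2;
    else h = (r - g) / d + 4;
    h = (h * 60 + 360) % 360;
  }
  return [Math.round(h) % 360, Math.round(s * 100), Math.round(l * 100)];
};
```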
These building blocks aren't that exciting, but are very helpful to build color manipulation functions on:
  • "#ab342c".darken(%) -- make color darker by given percent
  • "#ab342c".lighten(%) -- make color lighter by given percent
  • "#ab342c".saturate(%) -- make color more saturated by given percent. To desaturate, use negative values for the percent. Note that "#ab342c".saturate(-100) renders in grayscale.
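
One way to build such functions is to convert to HSL, nudge a single component, and convert back. The sketch below uses plain functions rather than String methods to stay short; the function names are hypothetical, and treating the percent as absolute points added to the HSL component is an assumption, not necessarily how my routines interpret it.

```javascript
// Hypothetical sketch: adjust a '#rrggbb' color by editing its HSL form.
function hexToHsl(hex) {
  var r = parseInt(hex.slice(1, 3), 16) / 255;
  var g = parseInt(hex.slice(3, 5), 16) / 255;
  var b = parseInt(hex.slice(5, 7), 16) / 255;
  var max = Math.max(r, g, b), min = Math.min(r, g, b);
  var l = (max + min) / 2, d = max - min, h = 0, s = 0;
  if (d !== 0) {
    s = d / (1 - Math.abs(2 * l - 1));
    if (max === r) h = ((g - b) / d) % 6;
    else if (max === g) h = (b - r) / d + 2;
    else h = (r - g) / d + 4;
    h = (h * 60 + 360) % 360;
  }
  return [h, s * 100, l * 100]; // left unrounded to limit round-trip drift
}

function hslToHex(h, s, l) {
  s /= 100; l /= 100;
  var c = (1 - Math.abs(2 * l - 1)) * s; // chroma
  var x = c * (1 - Math.abs((h / 60) % 2 - 1));
  var m = l - c / 2;
  var rgb = h < 60 ? [c, x, 0] : h < 120 ? [x, c, 0] : h < 180 ? [0, c, x]
          : h < 240 ? [0, x, c] : h < 300 ? [x, 0, c] : [c, 0, x];
  return '#' + rgb.map(function (v) {
    var n = Math.round((v + m) * 255);
    return (n < 16 ? '0' : '') + n.toString(16);
  }).join('');
}

function clamp(v) { return Math.max(0, Math.min(100, v)); }

// Assumption: pct is added to the component, then clamped to [0, 100].
function lighten(hex, pct) {
  var c = hexToHsl(hex);
  return hslToHex(c[0], c[1], clamp(c[2] + pct));
}
function darken(hex, pct) { return lighten(hex, -pct); }
function saturate(hex, pct) {
  var c = hexToHsl(hex);
  return hslToHex(c[0], clamp(c[1] + pct), c[2]);
}
```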
Generating new colors
Sometimes, you just need a color to get you started:
  • ColorFactory.random() // a random color, somewhat evenly distributed.
  • ColorFactory.randomGray() // a random gray scale value.
  • ColorFactory.randomHue() // given a saturation and lightness returns a random color.
The factory can also generate primary and secondary color schemes.

Basic color theory includes the concepts of complementary and analogous colors, so both are provided.

And there is a generic "interpolation" between two colors. It works by interpolating hue, saturation or lightness.
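
To make the color-theory terms concrete, here is a sketch that works directly on [h, s, l] triples such as those returned by toHSL(). The function names, the 30-degree default spread for analogous colors, and the straight-line hue interpolation (no shortest-path wraparound) are all assumptions for illustration.

```javascript
// Hypothetical sketch on [h, s, l] triples (hue in degrees).
function rotateHue(hsl, degrees) {
  var h = ((hsl[0] + degrees) % 360 + 360) % 360; // keep hue in [0, 360)
  return [h, hsl[1], hsl[2]];
}

// The complement sits directly across the color wheel.
function complementary(hsl) {
  return rotateHue(hsl, 180);
}

// Analogous colors are the neighbors on either side of the base hue.
function analogous(hsl, spread) {
  spread = spread === undefined ? 30 : spread;
  return [rotateHue(hsl, -spread), hsl, rotateHue(hsl, spread)];
}

// Evenly spaced steps from one color to another, endpoints included.
function interpolate(from, to, steps) {
  var result = [];
  for (var i = 0; i < steps; i++) {
    var t = steps === 1 ? 0 : i / (steps - 1);
    result.push([0, 1, 2].map(function (k) {
      return Math.round(from[k] + (to[k] - from[k]) * t);
    }));
  }
  return result;
}
```

Interpolating only lightness between two colors of the same hue gives a monochromatic ramp; interpolating hue walks around the wheel.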

Generating Schemes for Visualizations
Now that we have all the building blocks, we can start building schemes for specific purposes. (Schemes get applied in specific ways). For example, if colors are used to represent a quantitative range of values, the colors must visually read as such. The most straightforward way to do this is to linearly lay them out in a monochromatic scheme. Your eye can read "that value is more than that value" because of the visual relationship, in either saturation or lightness.

This differs from qualitatively distinct values (such as states on a modern election map), which must be read as distinct, but not quantitatively related. A triadic approach is more appropriate in this case. A viewer should at no point be enticed into imagining a "red" state is more or less of something than a "blue" state-- they are distinct categories.

Sometimes visualizations include binary, yes/no values. And some data visualizations are about how the values diverge, where values need to read as quantitatively diverging from a central value. These are challenging to construct so that they read correctly, and cartographers are experts at this.
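
The first two kinds of scheme can be sketched as follows. The function names, the fixed saturation of 60%, and the lightness range are hypothetical choices for illustration; the colors are returned as CSS `hsl()` strings to keep the sketch short, and diverging schemes are left out.

```javascript
// Hypothetical sketch: generate CSS hsl() colors for two kinds of data.
function css(h, s, l) {
  return 'hsl(' + h + ', ' + s + '%, ' + l + '%)';
}

// Quantitative data: one hue, lightness stepping from light to dark,
// so "more" reads as "darker".
function sequentialScheme(hue, count) {
  var colors = [];
  for (var i = 0; i < count; i++) {
    var t = count === 1 ? 0 : i / (count - 1);
    colors.push(css(hue, 60, Math.round(90 - 60 * t)));
  }
  return colors;
}

// Qualitative data: hues spread evenly around the wheel at equal
// lightness, so no category reads as "more" than another.
// A count of three yields a triadic scheme.
function qualitativeScheme(startHue, count) {
  var colors = [];
  for (var i = 0; i < count; i++) {
    colors.push(css(Math.round(startHue + (360 / count) * i) % 360, 60, 50));
  }
  return colors;
}
```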

I'm in the process of building a set of Javascript functions that support this type of color scheme generation. It's still a work in progress, but I have found it quite valuable. I'm curious if others find this useful.