Monday, January 16, 2012

Getting paid by LOCs?

I don't get paid by the lines of code I write. But I've certainly joked about it: there's a rumor that Trevor, who writes way too many comments, gets paid by the line (including comments). Über-programmer Dave thinks his manhood is measured by the size of his commits. On a recent project the codebase was so large that I joked that every commit should be rejected if it didn't reduce the line count of the project. Haha.

This brings me to the boy scout rule: always leave the code nicer than when you started. "Nicer" can be a little hard to quantify with a number: test coverage, cyclomatic complexity and coupling metrics are all often used. But with a bloated codebase, a simple line count reduction may suffice to improve the codebase, although it is not the main goal. When I casually joked about this with my teammates, they all liked the idea. But how?

How to Count

If you use git log --stat, you can see the number of lines associated with each commit (after all the details), such as:

331 files changed, 0 insertions(+), 25866 deletions(-)

It's pretty easy to write a script that parses this information out, and that's what I've done in the gitlc tool. Beyond the obvious grep, I added support for some necessities:
  • "pairs" set up using the git-pair tool. Commits checked in by multiple people will be credited to all committers. This is necessary with any project where pairing is used.
  • "aliases", so that if the same person commits with different email addresses (or their address changes during the lifecycle of the project), the commit counts can be aggregated. This feature is enabled by providing an optional YAML file.
  • commits can be aggregated by person, month, or simply by commit
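
The counting itself can be sketched in a few lines of Ruby. This is not gitlc's actual source, just a minimal illustration under some assumptions: the log text comes from `git log --shortstat --format=author:%ae`, the ALIASES table and the email addresses in the sample are made up, and pair commits are not handled.

```ruby
# Hypothetical sketch: tally added/deleted lines per author.
# Assumed input: the output of `git log --shortstat --format=author:%ae`.
# ALIASES and the sample addresses are invented for illustration.

ALIASES = {
  "andy@work.example" => "ndp",
  "andy@home.example" => "ndp"
}.freeze

# Matches both "331 files changed, 0 insertions(+), 25866 deletions(-)"
# and the shorter form newer git versions print, e.g. "1 file changed, 3 deletions(-)".
STAT_LINE = /(\d+) files? changed(?:, (\d+) insertions?\(\+\))?(?:, (\d+) deletions?\(-\))?/

# Returns [adds, deletes] for a summary line, or nil for any other line.
def parse_stat(line)
  m = STAT_LINE.match(line)
  m && [m[2].to_i, m[3].to_i]
end

# Returns [[name, {net:, adds:, deletes:}], ...] sorted by net, largest first.
def tally(log_text, aliases = ALIASES)
  totals = Hash.new { |h, k| h[k] = { net: 0, adds: 0, deletes: 0 } }
  author = nil
  log_text.each_line do |line|
    line = line.strip
    if line.start_with?("author:")
      email = line.delete_prefix("author:")
      author = aliases.fetch(email, email) # fold aliases into one name
    elsif author && (stat = parse_stat(line))
      adds, deletes = stat
      counts = totals[author]
      counts[:adds] += adds
      counts[:deletes] += deletes
      counts[:net] += adds - deletes
    end
  end
  totals.sort_by { |_, counts| -counts[:net] }
end

# Made-up sample input, in the shape the assumed git command produces.
SAMPLE_LOG = <<~LOG
  author:andy@work.example

   2 files changed, 10 insertions(+), 4 deletions(-)
  author:andy@home.example

   1 file changed, 3 deletions(-)
  author:sam@example.com

   1 file changed, 5 insertions(+)
LOG

tally(SAMPLE_LOG).each { |name, counts| puts "#{name}: #{counts[:net]}" }
```

Aggregating by month instead of by person would only need a different key: emit the commit date in the format string and key the totals on its year and month.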

Usage is pretty simple. Just download it and point it at a local directory with a .git folder. Here I point it at the gitlc source itself and ask for a summary by person:

$ ./gitlc.rb -r . -p
[["ndp", {:net=>657, :adds=>26579, :deletes=>25922}]]

As you can see, I had about 25K lines of code, which I quickly abandoned (thanks, node). For projects with more people, it's more interesting:

$ ./gitlc.rb -p -r ../ruby-build/
[["sam", {:net=>499, :adds=>639, :deletes=>140}],
 ["jeremy", {:net=>33, :adds=>50, :deletes=>17}],
 ["josh", {:net=>21, :adds=>41, :deletes=>20}],
 ["jesse", {:net=>16, :adds=>21, :deletes=>5}],
 ["guilleig", {:net=>11, :adds=>53, :deletes=>42}],
 ["bensie", {:net=>6, :adds=>6, :deletes=>0}],
 ["chris", {:net=>6, :adds=>29, :deletes=>23}],
 ["sstephenson", {:net=>5, :adds=>5, :deletes=>0}],
 ...

The tool sorts people by their "net" contribution to the project. For ruby-build, there's a long tail of people with one-line changes, omitted here. Please check out gitlc and see what you learn about your project.

I'd love to have contributions to this tool. Feel free to fork and contribute back. It's easy to imagine better visualizations and a tool that could be incorporated in a workflow.

Practicalities

Once you've got these statistics, people start to care about their line count. They ask me: what should you do if you need to write new functionality?

I recommend:


  • Look first to share code. In a relatively large codebase, most features should already be there, but perhaps in a slightly different form. Most codebases bloat because we aren't able to figure out the common patterns. We can blame this on a hazy product definition, but at some point it's our responsibility to organize and structure the code ourselves.
  • Look for dead code. What features and code are no longer used?  Get code coverage tools running, which help you identify branches of code that you can remove. Use web analytics to discover pages that are seldom visited and advocate to cut them out. 

Riffing on the last point, a corollary to this blog post is the one that says each story needs to be accompanied by an "unstory". If we add a link to this page, what link do you want to take away? If the user can now spend 20 minutes cropping their profile picture, what aren't they going to be doing? But that's left for another day...

Thursday, May 5, 2011

Do I really have to write tests?

Over at the Lean Startup Circle mailing list, they're discussing what kind of tests to expect from developers. I enjoy this conversation. People are often looking for a clear guideline ("startups don't need tests") or a "code coverage" figure-- or have one in mind. Idealistic agilists insist you always write tests. They equate not writing tests with abandoning quality-- the beginning of the end. "Such carelessness will lead to bugs upon bugs and eventually, unmaintainable code." Reality requires a more nuanced and pragmatic approach, which I suggest below.

Getting Real

As I learned in Q.A. school (or was it the Q.A. streets?), you can't test everything. Be pragmatic because software is just too complex to allow coverage of all cases. Good test design requires thoughtful prioritization-- minimal money yielding maximum test coverage. Q.A. engineers learn the requirements and the relative importance of each. They let this understanding guide how they design a test suite. They learn to carefully consider the quality requirements, where the critical parts of the software are, where potential problems might appear, and what the costs of failure are. In this cost-benefit and risk analysis, there isn't just one, simple answer.

Here are the main factors to consider when judging how much testing is needed, with an eye towards developer-written tests:

Cost of bugs. Different software has different quality requirements. One of my projects would cost thousands of dollars per hour if it crashed-- and cause freeway backups, and bring the L.A. news helicopters, etc. Another one let you comment on your latest book purchase. Some applications really are heart-lung machines.

Anticipated life of the code. Some code is for prototypes or other "throwaway" purposes. But veterans bristle at the thought, and will recount how quick hacks became the core of mission-critical systems, and how that code just doesn't go away. Seconding their motion, TDD disciples insist on writing tests for everything. There certainly is risk in creating code too casually, but prototyping is a powerful tool. If you can figure out how to build out your ideas quickly and explore them with your users, you'll be able to out-pace competitors.

Area of the application. Not all code of an app is the same. At a concert, there are soloists and choir members. The audience will be much more critical of the soloists. Singing coaches don't spend as much time with each choir member as they do with soloists. Therefore, identify the most important part of your app.

I'm a big advocate of building from the inside out: identify these critical parts of the system. Validate the model with everyone, and test the heck out of those parts. Make them unbreakable. Then, using these as building blocks, put together the pieces in various ways. Allow yourself to test different combinations with the knowledge that the blocks won't break from underneath you. Many a team has treated speculative admin interfaces with the same vigilance as the core data model. It's quick to build prototypes with solid building blocks.

Likelihood of bugs. Some areas of an app are like compost piles, and attract bugs.  Good developers should be able to give you an honest appraisal. They should be able to identify difficult problems, as well as error-prone approaches to problems. Some code just evolves to become gnarly to work with. Adjust testing needs appropriately. 

Test-driven design requirements. TDD is an excellent tool for figuring out how the code should be written and structured. I find that using TDD to inform the design saves significant development effort. Don't sacrifice testing when it will benefit the code. If it's faster to write a prototype with TDD, I'll use it.

Team size. A lone developer focusing on one project will have less of a need for tests than a larger team all sharing a codebase. Yes, with luck, the project will grow into something larger, but there's also an important "just-in-time" theme to agile. This is not an excuse for skipping testing, but is a factor to consider.

Other quality checks.  A team auto-deploying to the cloud needs more checks-- and tests-- than a team with a dedicated Q.A. team watching out for them.

There are other factors, but those are the important ones. Let's resist talking in absolutes, and do what is best for our project, customers and users.

Tuesday, May 3, 2011

Interesting Character Entities

I spent another couple hours on my character entity finder, and wanted to share some of the interesting things I've discovered. I made two main improvements:
(1) lookup happens asynchronously, in small batches, so that it suffers less from "locking up". This allows much more flexible exploration.
(2) you can now bookmark queries, so you can share interesting discoveries.

To recap, I built this tool to find all the weird quotes:

http://ndpsoftware.com/&what/#q='
http://ndpsoftware.com/&what/#q=quot

Wow, I discovered there's a nice set of chess pieces:

http://ndpsoftware.com/&what/#q=chess

Just a minute ago, I added the ability to bypass the query builder-- if you enter a full regular expression. The regular expression is matched against entity names (and numbers, and nicknames). For example, there are all sorts of icons available:

http://ndpsoftware.com/&what/#q=/97[2-6]\d/

The currency symbols aren't easily found with a single query, but you can build a page with a selection of them:

http://ndpsoftware.com/&what/#q=/currency|euro|dollar|pound/
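
The regular expression rule can be sketched roughly as follows. This is only an illustration, not the tool's code (which runs in the browser): the ENTITIES table is a tiny hand-picked sample, the nicknames are my own wording, and treating a plain query as a case-insensitive substring match is an assumption standing in for the real query builder.

```ruby
# Hypothetical sketch of the query rule: a query wrapped in slashes is used
# as a regular expression; anything else is treated as a plain substring
# (an assumption, the real query builder may do more).

Entity = Struct.new(:name, :number, :nickname, :char)

# Small illustrative sample; the nicknames are invented for this example.
ENTITIES = [
  Entity.new("quot",  34,   "quotation mark",           "\""),
  Entity.new("pound", 163,  "pound sign",               "£"),
  Entity.new("raquo", 187,  "right double angle quote", "»"),
  Entity.new("euro",  8364, "euro sign",                "€"),
  Entity.new(nil,     9731, "snowman",                  "☃"),
  Entity.new(nil,     9812, "white chess king",         "♔")
].freeze

# "/97[2-6]\d/" becomes the regex 97[2-6]\d; "chess" becomes a literal match.
def pattern_for(query)
  if (m = query.match(%r{\A/(.+)/\z}))
    Regexp.new(m[1])
  else
    Regexp.new(Regexp.escape(query), Regexp::IGNORECASE)
  end
end

# An entity matches if its name, its number, or its nickname matches.
def search(query, entities = ENTITIES)
  pattern = pattern_for(query)
  entities.select do |e|
    [e.name, e.number.to_s, e.nickname].compact.any? { |field| pattern.match?(field) }
  end
end

puts search("/97[2-6]\\d/").map(&:char).join(" ")
puts search("/currency|euro|dollar|pound/").map(&:char).join(" ")
```

Matching against the number as a string is what lets a pattern like 97[2-6]\d pick out a whole block of neighboring symbols at once.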

It's surprisingly fun to play around with. Give it a spin.

Monday, April 4, 2011

Finding entities, characters, glyphs

My latest "mini-project", a few hours in, solves the annoying problem of trying to remember entity numbers or names. For example, our last project used the entity &raquo; (»), and it seemed like it took me two years to remember it. Now, with this tool, I can just type ">>" and the character, symbol and number appear. I tried to make it "mobile friendly", and may experiment with packaging it as an "app".

I'm intrigued by "mini-projects"... something I can put together in a few minutes or hours that provides value to someone. It's a long-time obsession: in the 90s when I got my first laptop, I tested myself to see what I could put together on my 22 minute BART ride from Oakland to 24th Street. It was fun, but I never created anything of general interest.

Now, these little tools (and hacks) can be quicker to build out and easier to share. In fact, there are 1000s of "apps" out there, many of which look like they can be done in hours. Anyway, I have a few on my web site.

What makes things easier is that there is so much data out there to build upon. As you'd expect, the tool starts from the W3C entity list. But I supplemented this with interesting Unicode characters, and a nice set from Remy Sharp.

Check it out, bookmark it, and please send me feedback.