Documenting Bug Fixes

When you fix a bug, how do you document it in the code? One option, of course, is simply not to say anything at all, but just make the fix. But there are times when you want to ensure that later developers understand why the code looks as it does, and don’t undo it.

Consider the following code from the open-source project glassfish-corba:

  // IMPORTANT: Do not replace 'new String("")' with "", it may result
  // in a Serialization bug (See serialization.zerolengthstring) and
  // bug id: 4728756 for details
  if (len == 0) {
      return new String("");
  }

The comment refers to a bug number, but there is no obvious way to find the bug tracking database to which it refers. That’s largely because the code has been moved from repository to repository, most recently from the now defunct java.net to GitHub. Since I happen to know that this code had previously been forked from the JDK, I have been able to track this particular bug down, but you cannot always count on that. It is not uncommon for very long-lived code to be migrated from source repository to repository, as well as from one bug tracking system to another, and comments that assume otherwise can easily wind up referring to databases that no longer exist.

On the other hand, it seems a bit excessive to copy the bug description and analysis into the code as a comment. Future developers don’t need to know every detail of every mistake made earlier, and long comment blocks can confuse at least as easily as they enlighten. This particular case is important because the code looks wrong, and indeed a normal run of the FindBugs tool will flag it, and recommend  that the code be replaced with a constant – which will just reintroduce the original problem.

There is, of course, a simple solution – if you are using clean code practices: unit tests. If you add one or more unit tests that duplicate the issue, they will remain in the codebase, not only to ensure that the problem isn’t reintroduced, but also to document what issues led an odd code choice to be made. And they do this, not because you have set up comments to associate the code with the tests, but because they were added as part of the shame change. Even if you move from one code repository to another, that kind of history should be preserved.

Thus, unit tests can act, not only as a check against regressions, but as executable documentation of the correct behavior of the system.