Friday, February 05, 2010

Your Code is the Other Team Member ...

One of my big complaints with the code base of Project 2 was the complete lack of a consistent coding style guide.

I'm of the belief that code should read like a book. You should be able to look at a function, class or method and read it from top to bottom like a story. It should have a clear beginning, middle and end. How would you like to a read a book where the formatting was completely inconsistent and you had to struggle with the typesetting to get to the content. That's what happens when source doesn't follow a style guide. Your brain can't move up the semantic ladder because it's mired in syntactic noise.

In order to get to the goal of Collective Code Ownership, everyone needs to see things the same way. Formatting and style is the first step to this goal.

There are a many different code style guides already available. Most are for specific languages: C, C++, C#, Java, Python. While some are variations on these for specific libraries or packages: Gnome, Django, Rails, Linux. And others are company specific style guides like the Google C++ guide and Microsoft Hungarian Notation.

A good coding style should include more than just brace style, comment form and indent conventions, but also things such as:
  • Directory naming
  • File naming
  • Package/Module naming
  • Class, Function and Variable naming (CamelCase? all_lower? AbbrAllowed?)
  • Member variable referencing (m_ vs. _ vs. this.?)
  • Member variable ordering (public, protected, private? Top of class vs. End of class?)
  • Method ordering (alphabetical? public, protected, private?)
  • Import rules and organization (Should all external references be fully qualified? Are blanket includes allowed?)
  • Variable shadowing rules?
  • Versioning standards
  • Copyright and IP ownership notifications
These all need to be addressed! It's very hard to find one standard that will address all of these points for your specific project, so you can expect to roll up your sleeves. Put it on a wiki and have all the developers get notified whenever it changes. Try and pick one that is as close as possible to an existing, widely enforced standard.

With so many definitions it can be a little daunting to pick one. There is one rule you must follow: BE CONSISTENT. While we all have preferences on whether braces should go at the end of the line or the next line, it's far more important to pick one and stick with it.

I saw a great tweet the other day from @sbastn: "I am never going to use the word team unless I really mean it. The word group seems my best replacement"

Your code is your other group member. I don't think you can even think about transitioning from a "group" to a "team" until your code is playing the game too. So, how can you get your group to start considering your code as the other team member?

Many places I've worked have won't permit code reviews to start until the code meets the style guide. "Fix it and I'll come back." Personally I like that rule, but it can frustrate some developers. Enforcing code style can be a daunting, laborious, thankless task ... trust me. Your group is on its way to becoming a team when there is no enforcement required ... everyone just does it because they see the value in it. So, following the mantra of "Automate Everything You Can" (and knowing you can't automate everything) ... there are things you can do with respect to code.

Automated Reformatting
Some groups perform nightly automated reformat of the code base using pretty printers likePyIndent, Jalopy and AStyle. I don't like this approach. If the code changes after you have checked it in, it will look foreign to you when you next look at it. These tools are best used initially to get your legacy code to a good base state.

No Commit Until Compliant
With most revision control systems you can install pre-commit hooks that can verify code compliance first before letting the code into production. I prefer this approach because it makes the developer think about the code style as part of the check-in checklist ... right up there with making sure the tests are written.

Try to Convert the Legacy Codebase in One Fell Swoop
Picking away at a code base to make it match your new style can take forever. You may need to write some custom scripts to help with this (especially when renaming files), but formatting changes are one of the few modifications you can make to a legacy code base where behavior should not change. You have to get into the new code style mindset as quickly as possible and doing this piecemeal only makes it harder.

In summary, establishing and enforcing a coding style on your project should be one of your first steps to fixing a legacy code base. Use automated tools or scripts to help with this project. If you can't automate the effort ask developers not to make the problem worse and to help clean up the mess when they're in fixing something else.

Another thing your programming style guide should address is the documentation style. We'll go over this next time, but in preparation you should read this great series of articles on the topic by Jacob Kaplan-Moss: What to Write, Technical Style and, for large doc efforts, You Need an Editor.


3 comments:

Adam Burry said...

Some of our builds have Checkstyle (http://checkstyle.sourceforge.net/) integrated, but it is not perfect, and I don't know if the jury has decided if it is worth it or not. What sometimes happens is the style rules get in the way of a good job, but I think that will always be a risk with tools like that.

I'm curious about your experiences with automated whitespace layout tools. I've long thought that the issues around religious placement of parenthesis could be dealt with my a filter between developers and source control. In theory, everyone could have their own filter and see things how they like, although that could be an impediment for code review. But I don't buy your "unfamiliar" argument. Were there other issues?

Steve Dinn said...

As far as Microsoft development goes, my team is using StyleCop. Not everyone agrees on all the settings, but that's their problem. They don't call it a cop for nothing :) You don't have an appropriate comment on that method? BROKEN BUILD BITCHES!

You'll learn to love it. Everybody's coding styles will eventually converge to the point where you can't tell who wrote what. It's a beautiful thing.

MyDarkSecret said...

I've yet to see a really good tool for this. Lots of custom scripts. Thanks for the links.

@adam: That's a really interesting idea. Everyone can check out the code in their preferred formatting style. I assume the "unfamiliar" argument you mention is about changing formatting on commit? If so, I wouldn't want to code something in one style and have it change on commit. I mentally need to be involved in the change process.

@steve: I don't know about enforcing comments in code (that's my next post). StyleCop looks good, but it sounds like it's being applied a little too late. I'm suggesting you can't commit to the RCS if you aren't compliant. No broken build since it never makes it to the server. But yes, you don't know who wrote the code, that's the goal!