CCD Red Degree Principle - Staying DRY

The first Clean Code Developer Red Grade principle is DRY. I first saw the DRY principle stated in the book The Pragmatic Programmer by Andrew Hunt and David Thomas. DRY stands for Don't Repeat Yourself and refers to a personal philosophy of avoiding duplication not just in code but also in data, algorithms and processes. The XP community phrases the same idea as Once and Only Once.

The purpose behind the DRY principle is to ensure that every piece of system knowledge has one authoritative, unambiguous representation. Having one place for everything (and everything in its place) makes life easier for maintenance programmers who need to locate a specific piece of code in order to change it. System knowledge extends beyond code though. Dave Thomas says that it extends to "database schemas, test plans, the build system, even documentation." - A Conversation with Andy Hunt and Dave Thomas, Part II - Orthogonality and the DRY Principle.

When you first start out, you may find that staying DRY is a subtle and difficult thing to do. It is far easier to locate existing violations of the DRY principle in your code and make an effort to remove them. Martin Fowler suggests that this process is almost an exercise:

You can almost do this as an exercise. Look at some program and see if there's some duplication. Then, without really thinking about what it is you're trying to achieve, just pigheadedly try to remove that duplication. Time and again, I've found that by simply removing duplication I accidentally stumble onto a really nice elegant pattern. It's quite remarkable how often that is the case. I often find that a nice design can come from just being really anal about getting rid of duplicated code.
- A Conversation with Martin Fowler, Part II - Design Principles and Code Ownership

Although copying and pasting code is one easy way of violating the DRY principle, there are other, more subtle ways. I once worked on a system that was used to manage Service Level Agreements. Each SLA had an effective start date and an effective end date. At various places in the system we needed to discover whether one SLA came after another SLA, and we used the following piece of code:

if (sla1.StartDate > sla2.EndDate)
// ...

This calculation was duplicated all over the application. It seemed innocuous enough at the time. After all, it's just a simple comparison, so how could it be made simpler? Then the client told us that their new SLAs tended to start on the same day as the old SLAs ended. That meant we needed >= rather than >. We had to hunt for every place in the application where that calculation had been made and add an = symbol to it. It took hours, if not days.

The issue was that we had violated the DRY principle. We had duplicated a piece of system knowledge (i.e. what makes one SLA "after" another?) all over the application, and when it turned out that we had encoded that piece of knowledge incorrectly, we needed to hunt down every place we had encoded it and fix it.

There are a couple of routes we could have taken to avoid this issue from the beginning. We could have deferred the calculation to a helper class of some kind. That is not very OO, but it would have worked. It would have looked like this:

if (DateHelper.IsAfter(sla1.StartDate, sla2.EndDate))
// ...

Actually, I hate that code for several reasons, not least of which is that I can't tell at a glance which parameter is supposed to be "after" the other. Instead, we could have added a method to the SLA class to internalise what it means for one SLA to be "after" another. That looks like this:

if (sla1.IsAfter(sla2))
// ...

That is very clean and it works. Although it isn't shown, the SLA class still holds its own start and end dates. A more effective solution is to encapsulate those concepts in a DateRange class which has an IsAfter(DateRange other) method on it. As we tracked down every comparison in the system, we actually made this conversion. Our application was better for it.
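As a rough illustration of that conversion, here is a minimal sketch. The member names (Start, End, Effective) and the Sla class shape are my assumptions for the example, not the code from the original system. The point is only that the >= rule now lives in exactly one method.

```csharp
using System;

// Hypothetical sketch: DateRange, Sla and their members are illustrative names.
public sealed class DateRange
{
    public DateTime Start { get; private set; }
    public DateTime End { get; private set; }

    public DateRange(DateTime start, DateTime end)
    {
        if (end < start)
            throw new ArgumentException("End must not be earlier than start.", "end");
        Start = start;
        End = end;
    }

    // The single place that knows what "after" means.
    // Using >= lets a range start on the same day the other one ends.
    public bool IsAfter(DateRange other)
    {
        if (other == null) throw new ArgumentNullException("other");
        return Start >= other.End;
    }
}

public sealed class Sla
{
    public DateRange Effective { get; private set; }

    public Sla(DateTime startDate, DateTime endDate)
    {
        Effective = new DateRange(startDate, endDate);
    }

    // Delegates to DateRange so the rule is not repeated here.
    public bool IsAfter(Sla other)
    {
        if (other == null) throw new ArgumentNullException("other");
        return Effective.IsAfter(other.Effective);
    }
}
```

With something like this in place, a change from > to >= (or back) is a one-line edit, and every caller of sla1.IsAfter(sla2) picks it up.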

The same issue can be seen time and again with concepts like Money. I have seen a few applications which treat money as plain decimal values. They add them, subtract them and take percentages like mad. And then one day someone says something innocuous like "We need to be able to specify some projects in US Dollars and some projects in Euros. That won't be a problem, right?". Simple requirement, weeks of work. Had the original application used some kind of Money class, it would have been far less of an issue. In fact, there is a free Java library called Time & Money that provides one (along with many other useful abstractions).
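To make the idea concrete, here is a minimal sketch of what such a Money class might look like. It is a hypothetical example and not the API of Time & Money or any other library. The method names and the choice to reject mixed-currency arithmetic are assumptions made for illustration.

```csharp
using System;

// Hypothetical sketch of a Money type; not taken from any real library.
public sealed class Money
{
    private readonly decimal amount;
    private readonly string currency;

    public Money(decimal amount, string currency)
    {
        if (string.IsNullOrEmpty(currency))
            throw new ArgumentException("A currency code is required.", "currency");
        this.amount = amount;
        this.currency = currency;
    }

    public decimal Amount { get { return amount; } }
    public string Currency { get { return currency; } }

    public Money Add(Money other)
    {
        RequireSameCurrency(other);
        return new Money(amount + other.amount, currency);
    }

    public Money Subtract(Money other)
    {
        RequireSameCurrency(other);
        return new Money(amount - other.amount, currency);
    }

    // percent is expressed in whole percentage points, e.g. 10 means 10%.
    public Money Percentage(decimal percent)
    {
        return new Money(amount * percent / 100m, currency);
    }

    // The one place that decides how mixed currencies are handled.
    private void RequireSameCurrency(Money other)
    {
        if (other == null) throw new ArgumentNullException("other");
        if (other.currency != currency)
            throw new InvalidOperationException(
                "Cannot combine " + currency + " with " + other.currency + ".");
    }
}
```

Because all arithmetic goes through one type, a later requirement such as currency conversion has a single, obvious home instead of being scattered across every decimal calculation.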

One more that I see a lot is reading configuration information. Too often on projects I have worked on, I (and other people) have read values directly out of configuration files in our code using the APIs in the System.Configuration namespace. The issue here is that when the client says something like "We want to consolidate all of the configuration for all of our applications behind this web service", everything gets hard again. If there had been a single place in the application that we controlled, we could easily have changed the way that our configuration data gets into the application. We could probably also have cached that config somewhere instead of hitting a web service 4 times for every screen. Of course, we eventually had to go back in and make these changes anyway.
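One possible shape for that single place is sketched below. All of the names here (ISettingsSource, AppSettings, InMemorySettingsSource) are hypothetical and are not part of System.Configuration. The in-memory source is only a stand-in for a real file-backed or web-service-backed implementation. The cache is deliberately simple and is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over wherever settings actually live.
public interface ISettingsSource
{
    string Read(string key);
}

// The single place the rest of the application asks for configuration.
public sealed class AppSettings
{
    private readonly ISettingsSource source;
    private readonly Dictionary<string, string> cache =
        new Dictionary<string, string>();

    public AppSettings(ISettingsSource source)
    {
        if (source == null) throw new ArgumentNullException("source");
        this.source = source;
    }

    public string Get(string key)
    {
        string value;
        if (!cache.TryGetValue(key, out value))
        {
            value = source.Read(key); // hit the underlying source once per key
            cache[key] = value;
        }
        return value;
    }
}

// Stand-in source for the example; counts reads so the caching is visible.
public sealed class InMemorySettingsSource : ISettingsSource
{
    private readonly Dictionary<string, string> values;

    public int ReadCount { get; private set; }

    public InMemorySettingsSource(Dictionary<string, string> values)
    {
        this.values = values;
    }

    public string Read(string key)
    {
        ReadCount++;
        return values[key];
    }
}
```

Swapping configuration files for a web service then means writing one new ISettingsSource implementation, while every caller of AppSettings.Get stays untouched.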

One final place that I see people (including me) violating DRY is in their build process. If you need to copy connection strings or other settings by hand from one configuration file to another in order to switch databases or change some other piece of config, then you are probably violating DRY. One way to get around this is to use build events to copy config files around. All of that starts to get into automation though, which we'll get to later.

Be aware of how DRY you are being when you write new code. On a personal level, that means being aware of problems you have already solved. Look at every line of code you write and think "I should never have to write that line of code again." If you aren't thinking that, you are probably not remaining DRY enough.

To remain DRY as a team, you need to be aware of each problem that your colleagues have already solved, and make them aware of the ones you have solved. You can do that through simple communication channels. Talk about what you have done and what challenges you have overcome. Some kind of internal project blog can help with this. So can mailing lists, frequent stand-up meetings, code reviews and pair programming.

Posted by: Mike Minutillo
Last revised: 27 May, 2011 02:42 PM

