Sep 15, 2014

My Own Four Laws of Computer Programming

Introduction


The other day, while I was in the car with my sleeping baby daughter, I started philosophizing about a forum discussion on whether or not to comment your code. That escalated into more than just thoughts about commenting: I came up with my own set of four laws of programming.

Parenthesis: I read an article the other day whose author explained that he called his laws "laws" rather than "principles" because, unlike principles, breaking a law has consequences. Therefore, I'll present the principles below as "laws", along with the consequences of not obeying them.

First Law: Thy Code Shall Be Understandable


Anyone with the appropriate skill set should be able to understand what your code is doing, otherwise only the computer will do so. If the computer is the only entity that can understand your code you are in a terrible place. Although it seems obvious, writing code that is so simple that even someone else can understand is one of the hardest tasks for a programmer.

Code is the output of a complex thought process that begins with understanding a problem, devising a solution, generalizing it to a class of problems and coming up with a description of that solution that is simple enough to be translated to a programming language. By the time he gets to that last stage, the programmer has already accumulated so much knowledge about a domain that it's easy to assume anyone will have the same knowledge, and that's a big mistake. Even the programmer himself is likely to lose all that knowledge a few months after the problem was first solved.

The rules to be followed here are all well-known and all well-forgotten: good design, good naming conventions and good documentation (i.e. document the why's and the what's, not the how's).
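To illustrate "document the why's and the what's, not the how's", here is a minimal sketch. The function, the grace-period rule and the constant are all invented for this example; the point is only where the explanation goes.

```python
from datetime import date, timedelta

# Hypothetical business rule, invented for this illustration.
GRACE_PERIOD_DAYS = 3


def is_overdue(due_date: date, today: date) -> bool:
    """Return True if an invoice should be flagged as overdue.

    Why: payments made on the due date can take a few days to clear,
    so flagging an invoice immediately would produce false alarms.
    What: an invoice is overdue only once the grace period has elapsed.
    """
    # A "how" comment such as "add 3 days to due_date and compare" would
    # only repeat the line below; the reasoning lives in the docstring.
    return today > due_date + timedelta(days=GRACE_PERIOD_DAYS)
```

Someone reading this a few months later can still tell why the three days are there, which is exactly the knowledge that tends to get lost.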

Consequences for not following the first law are obvious: code that can't be easily understood will never be easily modified. Which brings us to the...

Second Law: Thy Code Shall Be Modifiable


After making sure you follow the first law so that anyone can understand your code, comes the phase where we find out that the code has to change. And writing code that is meant to be rewritten is also hard.

Here comes into play the next dimension of the thought process: design. Several techniques can be used to achieve that, and there is no need to go into the details here. Just keep in mind that if the code isn't flexible enough, the only option is to throw it away. We should be able to readily recognize that the code is throw-away, because in order to reach the second law you must have passed the first law to begin with.

Making the code modifiable accounts for two important aspects of software development: requirements will change and your code will be wrong several times until it correctly implements the requirements. And the first step to be able to modify the code without breaking it is in the first law: you have to understand what you are modifying. Then you have to verify that it works, which brings us to the...

Third Law: Thy Code Shall Be Testable


Once you have understandable and modifiable code, the time comes to ensure that it does what it is supposed to do.

Once more, tools and techniques to test software are abundant and out of scope here. However, the important thing that comes into play is that, in order for your software to be testable, it has to be unit-testable above all other things. You must have unit tests for all the positive cases and for most of the negative cases, ideally all (but we all know that 100% coverage is impossible to reach at a reasonable cost). These tests should be executed during development (several times, if needed) and before submitting the change to a smoke test prior to check-in.
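As a rough sketch of what "positive cases and negative cases" can look like, here is a small example using Python's standard unittest module. The parse_percentage function is invented for the illustration; it is not taken from any real code base.

```python
import unittest


def parse_percentage(text: str) -> float:
    """Convert a string such as "42%" into a fraction between 0 and 1."""
    if not text.endswith("%"):
        raise ValueError(f"missing percent sign: {text!r}")
    value = float(text[:-1])  # raises ValueError for non-numeric input
    if not 0 <= value <= 100:
        raise ValueError(f"out of range: {text!r}")
    return value / 100


class ParsePercentageTest(unittest.TestCase):
    # Positive cases: valid input produces the expected value.
    def test_typical_value(self):
        self.assertAlmostEqual(parse_percentage("42%"), 0.42)

    def test_boundaries(self):
        self.assertEqual(parse_percentage("0%"), 0.0)
        self.assertEqual(parse_percentage("100%"), 1.0)

    # Negative cases: invalid input is rejected instead of accepted silently.
    def test_missing_percent_sign(self):
        with self.assertRaises(ValueError):
            parse_percentage("42")

    def test_not_a_number(self):
        with self.assertRaises(ValueError):
            parse_percentage("abc%")

    def test_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_percentage("101%")
```

Saved in a file, these tests can be run with "python -m unittest" as often as needed during development.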

Integration tests should be kept to a minimum and should be end-to-end and scenario-oriented. Ideally, all the important end-user scenarios should be automated, but at this point they should run in a single-box configuration. Any external systems that your software interacts with should still be simulated/emulated/mocked. That's because the cost of deploying complex configurations grows exponentially. Furthermore, the reliability of full-blown configurations is far from 100%. Running tests in such environments is painful, as most of the time you'll see your tests failing due to environmental issues. These tests should run before each check-in.
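One possible way to keep the external system simulated is sketched below with unittest.mock from the Python standard library. OrderService and its payment gateway are hypothetical names made up for this example; the only assumption is that the external dependency is passed in, so a test can hand over a mock instead.

```python
import unittest
from unittest.mock import Mock


class OrderService:
    """Places orders through an external payment gateway (hypothetical)."""

    def __init__(self, gateway):
        self._gateway = gateway  # injected, so tests can substitute a mock

    def place_order(self, customer_id: str, amount: float) -> str:
        if amount <= 0:
            raise ValueError("amount must be positive")
        if not self._gateway.charge(customer_id, amount):
            return "declined"
        return "confirmed"


class PlaceOrderScenarioTest(unittest.TestCase):
    def test_order_confirmed_when_payment_succeeds(self):
        gateway = Mock()
        gateway.charge.return_value = True
        service = OrderService(gateway)
        self.assertEqual(service.place_order("c-1", 25.0), "confirmed")
        gateway.charge.assert_called_once_with("c-1", 25.0)

    def test_order_declined_when_payment_fails(self):
        gateway = Mock()
        gateway.charge.return_value = False
        service = OrderService(gateway)
        self.assertEqual(service.place_order("c-1", 25.0), "declined")

    def test_gateway_not_called_for_invalid_amount(self):
        gateway = Mock()
        service = OrderService(gateway)
        with self.assertRaises(ValueError):
            service.place_order("c-1", 0)
        gateway.charge.assert_not_called()
```

Because nothing here touches a network or a second machine, a failure points at the code under test rather than at the environment.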

Finally, a subset of the integration tests should be executed against a real complex configuration, replacing the simulators/emulators/mocks with the real deal. These tests can run on a more relaxed schedule, daily or so, and will only cover the very basic usage scenarios that simply can't break.
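A minimal sketch of how the same scenario could run against either a simulator or the real system is shown below. Everything in it is an assumption made for illustration: the inventory classes, the INVENTORY_URL variable name and the scenario itself. The real client is left as a placeholder on purpose.

```python
import os


class FakeInventory:
    """In-memory stand-in used for the single-box runs."""

    def __init__(self):
        self._stock = {"widget": 5}

    def available(self, item: str) -> int:
        return self._stock.get(item, 0)


class RemoteInventory:
    """Placeholder for a client of the real system (hypothetical)."""

    def __init__(self, url: str):
        self._url = url

    def available(self, item: str) -> int:
        raise NotImplementedError("would query the real service at self._url")


def make_inventory(env=None):
    """Pick the implementation from the environment the tests run in."""
    env = os.environ if env is None else env
    url = env.get("INVENTORY_URL")  # hypothetical variable name
    return RemoteInventory(url) if url else FakeInventory()


def check_basic_scenario(inventory) -> bool:
    """A basic scenario that must hold against either implementation."""
    return inventory.available("widget") >= 0
```

With this shape, the pre-check-in run leaves the variable unset and gets the fake, while the daily run sets it and exercises the real configuration with the same scenario code.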

Once your tests attest that the system is healthy, and only then, you should worry about the...

Fourth Law: Thy Code Shall Be Scalable


Donald Knuth once wrote that "premature optimization is the root of all evil (or at least most of it) in programming". That's so true...

I spent 2 years of my career executing performance tests on enormously complex software, and working with other people who did the same for different parts of the same system. I can guarantee you that most of the time we hit functional issues before we could even start analyzing the performance of the software.

Some people may argue that we need to think about scalability/performance from the beginning, and I don't disagree with them. Where I might differ from some of them is this: while we must think about scalability early, we should not start implementing it before we can attest to the functional quality of the system.

Only when we know that the system is understandable, modifiable and testable can we safely make the changes that will allow it to scale with increasing demand. That's because we know exactly how it works, we know exactly what needs to be changed to achieve a higher degree of scale, we know that we can change it, and we know that if we break something, the breakage will show up before we get to the point of verifying that the system scales better than before.

Once more, I'm not going into the details of how to execute the performance/load/scalability validations. I'll just say that they can only be executed successfully, and be beneficial, when we can guarantee that the system is still doing what it is supposed to do.
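As a small sketch of that ordering, the helper below refuses to measure anything until the optimized version is shown to behave like the original. The two dedupe functions are invented for the example, and timing with timeit is just one simple way to measure.

```python
import timeit


def dedupe_baseline(items):
    """Original implementation: preserves order, quadratic time."""
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def dedupe_candidate(items):
    """Candidate optimization: preserves order using dict keys."""
    return list(dict.fromkeys(items))


def measure_if_correct(baseline, candidate, data, runs=5):
    """Time both versions, but only after they agree on the output."""
    if candidate(data) != baseline(data):
        raise AssertionError("candidate changes behavior; fix that first")
    return (
        timeit.timeit(lambda: baseline(data), number=runs),
        timeit.timeit(lambda: candidate(data), number=runs),
    )
```

A call such as measure_if_correct(dedupe_baseline, dedupe_candidate, data) returns the two timings in seconds, and raises before measuring if the candidate has broken the behavior.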

Conclusions


Following these "laws" requires discipline and patience. It takes experiencing the consequences of not following them to be convinced that they work.