Donald E. Knuth is credited with stating: “The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming” [1]. Much has been said both for and against this statement in the 40-odd years since it was coined. I'm no developer, so I can't say much about the relative merits or weaknesses of the quote, nor about how much of its original context still applies today. However, I have worked in software development for some time and believe that the sentiment of the quote should be applied to a much wider set of problems, such as system design, which I will cover in the rest of this article.
In my view, “premature optimisation” should be defined as the act of “improving” something without understanding the full implications of the change being made. Scott discussed an example of this in his article “Give it a Break - Part II”. In that particular case, a massive performance hit was introduced by a naive attempt to save some processing time on a busy database (DB) server. The thought pattern is easy to follow: if no update is required for a record, don't send a reply to the origin server. No reply == less work, therefore less CPU time taken up on the DB host. In reality the saving achieved was much, much less than anticipated. After all, the DB server still has to check each record to see whether it needs updating before it can decide whether to send a response. That leaves a saving of a single network packet per non-updated record, while throughput drops by two orders of magnitude because the origin server has to wait for a time-out before sending the next record to update.
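To make the trade-off concrete, here is a minimal back-of-the-envelope model in Python. It is only a sketch: the 5 ms round trip, the 500 ms time-out and the share of unchanged records are hypothetical figures chosen for illustration, not measurements from the system Scott described, and the function name is my own.

```python
def records_per_second(round_trip_s, timeout_s, unchanged_fraction, suppress_replies):
    """Estimate throughput when the origin server sends one record at a time.

    If replies are suppressed for unchanged records, the origin server only
    learns that nothing happened by waiting out the full time-out.
    """
    if suppress_replies:
        average_wait_s = (unchanged_fraction * timeout_s
                          + (1.0 - unchanged_fraction) * round_trip_s)
    else:
        # Every record gets a reply, so every record costs one round trip.
        average_wait_s = round_trip_s
    return 1.0 / average_wait_s


# Hypothetical figures, for illustration only.
ROUND_TRIP_S = 0.005       # assumed 5 ms request/reply round trip
TIMEOUT_S = 0.5            # assumed 500 ms time-out on the origin server
UNCHANGED_FRACTION = 0.99  # assumed share of records needing no update

with_replies = records_per_second(ROUND_TRIP_S, TIMEOUT_S, UNCHANGED_FRACTION, False)
without_replies = records_per_second(ROUND_TRIP_S, TIMEOUT_S, UNCHANGED_FRACTION, True)

print(f"always reply:      {with_replies:.1f} records/s")
print(f"suppress replies:  {without_replies:.1f} records/s")
print(f"slowdown factor:   {with_replies / without_replies:.0f}x")
```

With a time-out around a hundred times longer than a round trip, and most records unchanged, the slowdown approaches that same factor of a hundred. The single packet saved per record never appears in the model because it is negligible next to the wait.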
Another example, where I would argue that “premature optimisation” was applied at the design stage, comes from a project I was involved with a couple of years ago. It involved moving some complicated business rules from a mainframe into an external rules engine. The idea was sound: move the rules buried deep within the mainframe onto a dedicated rules engine exposed as a middleware service. This would promote reuse of the rules and make maintenance much easier.
The design detail in question was the interface between the mainframe and the rules engine. By design, the rules engine was presented as a middleware service using XML over HTTP. However, the mainframe didn't natively speak XML, and the existing mainframe to middleware interface that did was considered very slow and not up to the task. To solve this dilemma it was decreed that a new interface would be built; it would be low level and not use XML (and be yet one more service that would need maintaining over time). The project then spent ~3 person-years specifying, building and testing this new interface. Then came time for performance testing …
As it turns out, the new interface was faster than the middleware interface, but only slightly. Average round trip time for a calculation dropped from ~35ms via middleware to ~30ms via the new interface, a reasonable saving. Start of year processing is expected to take less than an hour; the existing processing took just over that on average. Using the rules engine via middleware, this dropped to ~47 minutes, and to ~42 minutes with the new interface. And here we have the crux of the optimisation: 3 person-years spent adding more architectural debt to the system, while only improving processing time by a few minutes for a process that is run annually and already had a solution that was performant enough (just not one that had been tested). In this particular case, the time would have been much better spent on profiling and tuning the existing mainframe to middleware interface. It is likely that enough performance gains could have been found to bring it in line with the new interface, with the added benefit that any application using the mainframe to middleware interface would gain from the tuning.
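The arithmetic behind that judgement can be laid out in a few lines of Python. The run times and the effort are the approximate figures quoted above; the 1,800 working hours per person-year is my own assumption, and comparing person-time with machine time is only a rough way to get a sense of scale.

```python
# Approximate figures quoted in the article.
MIDDLEWARE_RUN_MIN = 47       # start of year run via the middleware interface
NEW_INTERFACE_RUN_MIN = 42    # start of year run via the new interface
EFFORT_PERSON_YEARS = 3       # effort spent on the new interface

# Assumption for illustration: working hours in one person-year.
HOURS_PER_PERSON_YEAR = 1800

saved_per_run_min = MIDDLEWARE_RUN_MIN - NEW_INTERFACE_RUN_MIN
relative_saving = saved_per_run_min / MIDDLEWARE_RUN_MIN

effort_min = EFFORT_PERSON_YEARS * HOURS_PER_PERSON_YEAR * 60
runs_to_break_even = effort_min / saved_per_run_min

print(f"saved per annual run: {saved_per_run_min} minutes ({relative_saving:.1%})")
print(f"effort invested:      {effort_min:,} person-minutes")
print(f"annual runs needed to 'repay' the effort: {runs_to_break_even:,.0f}")
```

Under these assumptions the saving is roughly a tenth of a run that happens once a year, set against an investment several orders of magnitude larger.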
Both of these are good examples of where “premature optimisations” during design have cost projects time and effort. The first could have been avoided with some investigation into the assumption that the DB server was “too busy” and thus had to do less work. The second similarly rested on assumptions about interface latency and throughput that should have been tested before spending so much time re-inventing a slightly faster, although much more complex to maintain, wheel. It should be noted that these proving activities are not necessarily free. Proving assumptions takes time, effort and, generally, support from people outside the project; however, it is my view that it is always better to do this proving. Much will be discovered along the way, to the benefit of both the project and the business.
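Testing a latency assumption does not need elaborate tooling to get started. The following Python sketch times repeated calls and summarises the results; `measure_latency` and `fake_request` are hypothetical names, and the 1 ms sleep merely stands in for a real call across whichever interface is under suspicion.

```python
import statistics
import time


def measure_latency(call, samples=100):
    """Time `call()` repeatedly and summarise round trip times in milliseconds."""
    timings_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        call()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    timings_ms.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(timings_ms) - 1, int(0.95 * len(timings_ms)))
    return {
        "median_ms": statistics.median(timings_ms),
        "p95_ms": timings_ms[p95_index],
        "max_ms": timings_ms[-1],
    }


def fake_request():
    """Hypothetical stand-in for one request over the interface being tested."""
    time.sleep(0.001)


summary = measure_latency(fake_request, samples=20)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Numbers gathered this way, under a realistic load, are what a design decision such as “the interface is too slow” should rest on.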
The simple conclusion to draw is to be wary of, as Knuth said, “worrying about efficiency in the wrong places and at the wrong times”. Test your assumptions and look at the whole picture before embarking on what could be a costly, low-return effort.
[1] Knuth, D. E., “Computer Programming as an Art”, Communications of the ACM, December 1974.