Matt Grover's blog: October 2011

Saturday, 29 October 2011

Code Complete Review

This post is a review of the book Code Complete: A Practical Handbook of Software Construction by Steve McConnell.

Code Complete is an immensely practical book focused on software construction. This means that after reading the 800+ pages, you'll have dozens of tips and new ideas that you can put straight into use in your day-to-day programming.

Code Complete is the most important book on programming I have ever read. If a budding programmer asked me to recommend a single book to them, this would be it. I was fortunate enough to read Code Complete just a few months into my professional career and it vastly improved my code. My degree is in Computer Science and whilst this made me aware of many tools and techniques in programming, and why these techniques are how they are, there was little discussion of how to use the tools well, which is what Code Complete covers perfectly. Whilst reading through the book for a second time, many of the points seemed obvious to me, but then things are only obvious if you know them, and I still found plenty of new and interesting points that I hadn't remembered from the first read through.

Interesting comparisons can be drawn between Code Complete and Clean Code as they are both focused on writing code (though Code Complete is broader). Firstly, Code Complete cites a wealth of research and other books to provide evidence for the points made, whereas Clean Code's arguments are more along the lines of logical persuasion. This difference matters most in places where the two books disagree. For example, in Code Complete, Steve McConnell cites research that says comprehension of a routine doesn't drop until it gets over a couple of hundred lines, whereas Clean Code recommends making routines shorter and then shorter still, and most of the examples given in the book are less than 10 lines long (despite the evidence saying comprehensibility isn't reduced in routines 200 lines long, I'm currently more in favour of the Clean Code very short routine style). Commenting is another area where the two books disagree, with Code Complete devoting an entire chapter to comments and recommending leaving the comments in from the Pseudocode Programming Process (more on this later), whereas Clean Code encourages removing as many comments as possible in favour of shorter routines.

One of the main contributions of Code Complete is the Pseudocode Programming Process whereby when writing a routine, Steve McConnell recommends starting with comments describing the intention of that routine and then refining the comments iteratively until it would be easier to write the actual code. Then for each comment left, you write what should be a few relatively simple lines of code and leave the comments in to describe, on a higher level of abstraction, what each section of code does. The authors of Clean Code would change each comment into a routine containing the lines of code associated with it and the routine name would convey the same information that the comment previously did. This would work well as the comments at the end of the PPP should be a single level of abstraction above the code that is written.
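To make the contrast concrete, here is a minimal sketch in Python. It is not taken from either book, and the routine names and the order fields ("paid", "total") are hypothetical. The first routine is written PPP-style with the pseudocode left in as comments; the second version turns each comment into a small routine whose name carries the same information.

```python
# PPP style: the pseudocode stays behind as comments, one level of
# abstraction above the code it describes.
def average_order_value_ppp(orders):
    # Keep only the orders that were actually paid
    paid = [order for order in orders if order["paid"]]

    # If nothing was paid, there is no meaningful average
    if not paid:
        return 0.0

    # Average the totals of the paid orders
    return sum(order["total"] for order in paid) / len(paid)


# Clean Code style: each comment becomes a routine with a descriptive name.
def paid_orders(orders):
    return [order for order in orders if order["paid"]]


def average_total(orders):
    return sum(order["total"] for order in orders) / len(orders)


def average_order_value(orders):
    paid = paid_orders(orders)
    if not paid:
        return 0.0
    return average_total(paid)
```

Both versions behave the same; the difference is only in whether the higher-level description lives in comments or in routine names.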

Here are the things that I found new or particularly interesting:

Languages (including natural ones) and what you can express in them, may limit your ability to think certain thoughts
Programming "in" a language - limit thoughts to constructs of language vs programming "into" a language - decide what you want to express and then work out how to express those thoughts via the language
Final design should be neat and tidy but the path to tidiness isn't tidy (similar to writing unclean code and then cleaning it as expressed in Clean Code)
To manage complexity - reduce accidental complexity and minimize how much essential complexity you need to remember at any one time
Abstraction - you can look at an object at a high level. Encapsulation - you can only look at it at a high level
The easier it is to call a module (in terms of setting up the right arguments) the looser the coupling
Don't think of Abstract Data Types as mathematical objects, but as a way of letting you work in the problem domain rather than low-level implementation
Class = ADT + inheritance + polymorphism
final for functions in Java is equivalent to non-virtual in C++
inheritance tends to contradict reducing complexity
Routines 100-200 lines long are no more error prone than shorter routines
Correctness - never returning an invalid result, better no result at all vs Robustness - try to keep the software operating
Consider creating a project specific base exception class
Original coding (and review!) in pseudocode and leave pseudocode in as comments once code written
Avoid "Just One More Compile" syndrome
If you have to "figure out" a piece of code, refactor it
Use positive boolean variable names to avoid double negatives
Abbreviate consistently - not only pick just one of num/no. but also don't mix Number and Num
Create names you can pronounce
Avoid misspelled words in names as you have to remember the misspelling
Avoid often misspelled words
Centralizing control over things that might change is good
Random accessing into arrays (i.e. not sequentially) is similar to random gotos
Put the normal case in the if not the else
Prefer for loops to while loops as all the loop control is at the top in one place
Make each loop perform only one function - Only combine loops if measured performance shows you should (and you need the extra performance)
Don't change the index of a for loop inside the loop
Simulate (in head) loop for 1st, last and random middle case to check it
Consider table based approaches instead of complicated ifs/switch statements
Prefer < to > as it orders arguments like a number line
Code reviews more effective than testing because they find cause and symptoms whereas tests only find symptoms
For data-flow testing, have a test case for each DEFINED-USED pair
When debugging, consider how long brute-force techniques such as rewriting the routine would take
"being wrong about a change should leave you astonished"
Larger projects will have lower productivity and higher error density
Write a core then code and integrate one class at a time
When doing incremental integration, you need to plan construction earlier
Leave a dyadic operator on the end of a line to indicate the expression carries on
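As an illustration of one point from the list, the table based approach instead of complicated ifs, here is a minimal sketch in Python. The regions and rates are made up purely for the example and the function names are hypothetical.

```python
# Hypothetical shipping rates, purely for illustration.
def shipping_cost_with_ifs(region):
    if region == "uk":
        return 3.0
    elif region == "eu":
        return 7.5
    elif region == "us":
        return 12.0
    else:
        raise ValueError(f"unknown region: {region}")


# Table based version: the data lives in one place and the logic is a lookup.
SHIPPING_COST_BY_REGION = {
    "uk": 3.0,
    "eu": 7.5,
    "us": 12.0,
}


def shipping_cost(region):
    try:
        return SHIPPING_COST_BY_REGION[region]
    except KeyError:
        raise ValueError(f"unknown region: {region}") from None
```

Adding a new region to the second version means adding a row to the table rather than another branch, which also ties in with the point about centralizing control over things that might change.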

Overall, I would recommend Code Complete to anyone with an interest in programming and software development. Newcomers to the field will have their code improved immeasurably by reading the book and experienced practitioners will still be able to learn something and also know that the good habits they have picked up are backed by evidence.

Link to buy the book on Amazon.com

Thursday, 20 October 2011

Clean Code Review

This post is a review of the book Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin (and others).

Clean Code opens with the premise that it is not a "feel good", easy reading book: reading the book is not a passive activity but one that involves understanding and critiquing lots of code and thinking about the problems within it yourself before being presented with the authors' opinions of the problems. Whilst I agree with the justification given, I did "feel good" after reading the book, mainly because I had learnt so much and had been given the enthusiasm to get straight to my laptop and start refactoring my code.

For most people, Clean Code will tell you that you're doing something, or many things, wrong when you're programming. It provides well thought out arguments about why it's wrong to do things in certain ways and offers "cleaner" alternatives. Even if you disagree with the authors on a particular point, the reasoning given by the authors will help you realise when to adapt your approach. Many of the things pointed out as being wrong may well seem obvious, yet after reading the book I started noticing that these problems are everywhere: in my own code, in open-source projects and in advice on forums.

Here are some of the points that I found most interesting or were new to me:

Try to check-in code a little cleaner than when you checked it out
Don't encode the container type into the name (in case you change container type...)
Don't slightly alter a variable name just to satisfy the compiler
Don't add useless additional words to variable names, eg amount in moneyAmount, info in customerInfo
Make functions small and then smaller (though more about this in my upcoming review of Code Complete)
Blocks within if, else and while statements should be small, preferably just a function call
"If a function does only those steps that are one level below the stated name of the function, then the function is doing one thing" (functions should only do "one thing"!)
The fewer arguments a function has, the better; be suspicious of functions with 3 or more arguments
Encode the names of arguments into the function name, e.g. assertExpectedEqualsActual
If functions are small, then multiple exit points are OK.
Use good naming to make comments redundant and then get rid of the comments
Hierarchically structure functions below the function they are called from
Don't return null; this removes the need for the clutter of null checking statements
Create interfaces over 3rd party code to protect yourself from changes
Write tests to discover and document how 3rd party APIs work
3 Laws of TDD:
1. You may not write production code until you've written a failing unit test
2. You may not write more of a unit test than is sufficient to fail (not compiling is failing)
3. You may not write more production code than is sufficient to pass the currently failing test
"Test code is just as important as production code". The only difference between the two should be that test code can be inefficient
Try to minimise the number of asserts in a test (can be achieved by writing custom asserts as long as you can give the custom assert a sensible name)
Classes should have one responsibility - only one thing changing should cause them to change
Can separate classes from a large class by grouping by which member variables they act on
"To write clean code, you must first write dirty code and then clean it"
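As an illustration of one point from the list, not returning null, here is a minimal sketch in Python, where None plays the role of null. The data and function names are hypothetical and not from the book.

```python
# Hypothetical in-memory data, purely for illustration.
ORDERS_BY_CUSTOMER = {
    "alice": [12.5, 30.0],
}


def orders_or_none(customer):
    # Returns None for unknown customers, so every caller must check for it.
    return ORDERS_BY_CUSTOMER.get(customer)


def orders_for(customer):
    # Returns an empty list instead, so callers can treat every result alike.
    return ORDERS_BY_CUSTOMER.get(customer, [])


def total_spent(customer):
    # No None check needed: summing an empty list gives 0.
    return sum(orders_for(customer))
```

With orders_or_none, total_spent would need an extra "is None" branch; with orders_for the unknown customer case falls out of the normal code path.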

The last point there is, I think, particularly important: the authors give the OK to writing code that gets the job done and the tests passing first, and then cleaning it up piece by piece until you have clean code.

The book is somewhat focused on Java, which manifests itself in various ways: there is a chapter on JUnit, and some of the smells in the book depend on having the extensive IDE support that is available for Java. For example, the authors discourage the use of naming conventions to make member variables distinct from local and argument variables, saying that this distinction should be handled by the IDE, which may not be possible if you're using a newer, less supported language. You certainly shouldn't let this put you off: the amount of general information in the book far outweighs any Java specific information and you can still gain insight from the reasoning behind any Java specific thought.

Possibly the most important part of the book is the collection of code "smells" and heuristics at the end of the book. These are things to look out for in code with explanations of why each one is bad and options on how to make it cleaner. This collection of smells with explanations makes the book invaluable as a reference. Clean Code has made me a better programmer and with repeated checking of its list of smells, it will continue to do so. It has also driven up my enthusiasm for writing really good code, so I'd definitely recommend it to anyone who spends much time programming. If you're a programmer by profession, then your colleagues will definitely also be glad that you've read it.

Link to buy the book on Amazon.com

Tuesday, 11 October 2011

Object-Oriented Software Construction Review

This post is a review of the book Object-Oriented Software Construction by Bertrand Meyer.

Where to start with this massive tome? Weighing in at over 1200 pages and with some pretty deep content, OOSC isn't a quick and easy read. The author bio describes Bertrand Meyer as "equally at ease in the software industry and in the world of academic computer science" and his book also straddles the 2 disciplines. There's plenty of interesting material for academics and those interested in the theory, yet the book also contains much practical, useful information which software engineers can use in their day-to-day work.

Usefully the book uses a new language written by Meyer and others to demonstrate the concepts explained in the book, which makes them much clearer and allows Meyer to show the concepts in their purest form. The book comes with a CD with an environment for using this language (and an electronic version of the book), so you can play around with it to check your understanding of the concepts.

There are some points of controversy in the book regarding which direction is more object-oriented and Meyer acknowledges opposing choices to his and backs up his choices with well thought out arguments. By choosing a strict object-oriented viewpoint in the main part of the book, you miss out on details that would be usable in a more common language than the one presented. Meyer addresses this with a section towards the end of the book about object-orientation in different languages, namely Ada, Simula, Smalltalk, C++ and Java as well as extensions for LISP and C. Though as you can see from the list of languages, the book suffers a little here from being originally published in 1994 with the second edition (the version reviewed) published in 2000.

Here are some of the points that I found most interesting or were new to me, and there are a lot! (I didn't start taking notes until the 9th [of 36] chapters so there's nothing here from the first 8 chapters):

You want 3 things from a garbage collector:
1. Soundness - all collected objects are unreachable
2. Completeness - all unreachable objects are collected
3. Timeliness - known average and upper bound on time from unreachable to collected
Problems in garbage collection occur when passing arguments to functions in other languages.
Inheritance and type parametrisation (genericity) are 2 orthogonal ways to be more generic. Reliability (type safety) and reusability (a single language element covering different variants) are conflicting goals, but the conflict can be resolved by type parametrisation. Static typing gives you errors at compile time, as opposed to at run time with dynamic typing, and because the earlier an error is detected the cheaper it is to fix, this is an advantage of static typing (though does this rule really apply here?). Because they are unspecified, generic parameters in functions can only be used in assignments, equality tests and calls to other functions that take generic parameters.
For a component to be considered reliable, it must perform to its specification and also cleanly handle cases outside its specification.
There are only 2 legitimate responses to an exception: retrying or reporting a failure to the caller. If failing, the catch block or equivalent must first restore the object to a stable state.
An overlooked aspect of reusability is that a language should allow you to access code written in a different language. OO is more about the modular organisation of a system than the line-by-line coding.
From the perspective of types, inheritance is specialisation, from the perspective of a module, inheritance is extension.
Types can be useful when the code is being read by a human for showing intent (though in languages where you do not declare types, this information could be encoded into the variable name).
When architecting a system, classes can be broadly classified into 3 types:
1. Analysis classes - from the problem/domain space
2. Design classes - architectural choice, in solution space
3. Implementation classes - low level, in solution space
Use cases can lead to a sequentially biased analysis and also model existing behaviour rather than coming up with new behaviour. (To counter this, Mastering the Requirements Process recommends deciding where the boundaries of the work lie before working out use cases).
Functions should be split into commands and queries: commands should not return a result and queries should not make changes that are visible to clients of the class.
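The last point, splitting functions into commands and queries, can be sketched as follows. This is a minimal illustration in Python rather than the language used in the book, and the class and method names are hypothetical. The usual "pop", which both changes the stack and returns a value, is split into a query and a command.

```python
class Stack:
    """A hypothetical stack used only to illustrate the command/query split."""

    def __init__(self):
        self._items = []

    # Commands: change the object's state and return nothing.
    def push(self, item):
        self._items.append(item)

    def remove_top(self):
        # The popped value is deliberately discarded; use top() to read it.
        self._items.pop()

    # Queries: return a result and make no change visible to clients.
    def top(self):
        return self._items[-1]

    def is_empty(self):
        return not self._items
```

Because top() has no visible side effects, a client can call it as many times as it likes and always get the same answer until a command is issued.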

Towards the end of the book in the chapters covering concurrency and databases, the amount of general information starts to tail off as the text becomes quite specific to the solution used in the language that accompanies the book.

Overall, this is a very interesting book for fans of the theoretical side of software engineering. Readers who are solely looking for practical tips to improve their code will find them, but they may lose patience due to the massive amount of other information in the book.

Link to buy the book on Amazon.com

Tuesday, 4 October 2011

Mastering the Requirements Process Review

This post is a review of the book "Mastering the Requirements Process" by Suzanne and James Robertson.

The book describes a process for gathering and validating requirements, based on the similar processes used and observed by the authors on many software projects. There are many forms and templates in the book to aid the process. The process described does seem a thorough way of gathering and recording requirements, though I don't have any experience of other well-documented requirements processes to compare it to. On a first flick through the book, some agile enthusiasts may be put off by the large number of forms and the amount of process description. However, the authors very sensibly describe in each chapter how they would adapt the chapter's contents depending on the agility of the team using the process and, as the authors repeatedly state, regardless of how agile you are, if you don't find the correct requirements you won't be making the right product.

Here are some of the points that I found most interesting or were more abstract than the majority of the book:

The book defines 3 types of requirements: functional - things the software must do; non-functional - qualities it must have; and constraints - global issues that shape what can and can't be done. It's a useful mini checklist to use for inspiration, as the initial excitement of a project is normally about the functional requirements, so a reminder that you need to plan for other things is welcome.
Another repeated point of the book is that all requirements should be testable. Obviously some requirements are going to be easier to test than others but the authors aren't afraid to suggest such ideas as testing whether 90% of a panel of potential users can complete a simple task in 10 minutes, as a way of quantifying usability requirements, which are normally the hardest requirements to turn into measurable goals.
Requirements can be reused between projects. As with code, it is suggested that there will have to be some adaptation but you can imagine that reuse of usability requirements would be very possible in products that target the same group of users for example.
You need to strip away the current technology to get to the essence of the work. This often leads to products which are much simpler to use than their predecessors and can lead to innovative products which do the work in a very different way.
Some useful suggestions are made for finding potentially missing requirements. Firstly, add requirements to cover what happens when expected external events do not happen. Secondly, for every data class (obviously, I mean class in a high level sense) in the application, if there are requirements that describe reading, writing or updating it, then check that there is a requirement covering the creation of instances of that class.
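The second of those checks for missing requirements is mechanical enough to sketch in code. This is a minimal illustration in Python, assuming requirements are recorded as (data class, action) pairs; that representation and the function name are hypothetical and not from the book.

```python
# Actions that only make sense if instances of the data class already exist.
USES = {"read", "write", "update"}


def classes_missing_create(requirements):
    """Return the data classes that are used but never created.

    `requirements` is an iterable of (data_class, action) pairs.
    """
    actions_by_class = {}
    for data_class, action in requirements:
        actions_by_class.setdefault(data_class, set()).add(action)

    return sorted(
        data_class
        for data_class, actions in actions_by_class.items()
        if actions & USES and "create" not in actions
    )
```

For example, given requirements that read and update an Invoice but never create one, the function would report Invoice as a data class with a potentially missing requirement.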

In conclusion, I would say that the book is most useful to have as a reference rather than for explicitly learning too many high level concepts. The checklists and forms in the book provide a very thorough framework for capturing requirements. I can see the book being very useful for people working on their own projects, where it will help make you consider the product from other perspectives, and at the other extreme, also useful for people working in large organisations who need a strict framework and process for gathering requirements for large products. I am planning on using the process described in the book for a personal project, so I'll update this review when I have some feedback on how that worked.

Link to buy the book on Amazon.com