Literate programming is a technique pioneered by Donald Knuth in which a program is written primarily as an explanation for human readers. Knuth accomplished this using macros, as can be seen in TeX: The Program, to abstract away a lot of the code and interweave it with a narrative describing the program.
This style of programming has heavily influenced many prominent coders. For example, aspects of literate programming inform Bob Martin’s Clean Code, in which the author advocates a headline/body copy approach to functions/methods.
Taking another line, Behaviour-Driven Development (BDD) uses human-readable formats such as Gherkin to describe behaviours; developers then write code to implement tests for these descriptions. This approach appears to have been influenced by Knuth’s literate programming.
The technique can be seen at a language level too; CoffeeScript has a literate mode that allows you to write code as Markdown documents.
However, despite the clear influence of literate programming on today’s coding techniques, it has not been widely adopted in its fully-fledged form.
Docco is a Node.js tool advertised as a “quick-and-dirty documentation generator”. I decided to try it out to see how the tool relates to literate programming and whether it can help encourage better quality code.
To explore the potential benefits of this approach, I decided to see if I could update a small tool I wrote to use literate documentation.
The tool I created, called “jScene”, is a sandbox environment that Box UK uses when going into schools to teach our coding Masterclasses for Education. The idea is that it sets the learners a problem to solve, suggesting some code that nearly fixes the issue but that leaves them needing to tweak the parameters and see what happens to resolve it entirely. It’s intended as a gentle introduction to programming where developers can rotate around helping the learners and, as a fairly simple system with a relatively small amount of code, I thought it would be a good case study to try Docco out on.
Docco produces a side-by-side view of the code alongside its documentation. This has the benefit of keeping the code clean and also providing a linear description of every stage of the program. This is different to API-style documentation, such as JavaDoc, which produces more “exploratory” documentation rather than a linear narrative.
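As a rough illustration of the kind of source Docco works from, here is a sketch of an annotated function. The name `stepTowards` and its parameters are hypothetical and not taken from jScene. The point is only that the single-line comments carry the narrative, which Docco would display beside the code that follows each one.

```javascript
// Hypothetical example: this function is NOT taken from jScene.
//
// Learners tweak `speed` until the sprite reaches the target. We move the
// sprite one step per tick, but never overshoot the target position.
function stepTowards(position, target, speed) {
  // Work out how far we still have to travel, and in which direction.
  const remaining = target - position;
  const direction = Math.sign(remaining);

  // Take a full step if there is room, otherwise land exactly on the target.
  const distance = Math.min(Math.abs(speed), Math.abs(remaining));
  return position + direction * distance;
}
```

Read top to bottom, the comments form the linear story of the function, while the code itself stays free of long inline explanations.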
If you have docblock-style comments, Docco retains them in the source code. The docblock is considered part of the code and isn’t automatically included in the literate documentation.
At first I wasn’t satisfied with that, but on reflection the two serve different purposes: API documentation is for autocomplete and “IntelliSense”-style tools. Because I wanted to retain API documentation for autocompletion in NetBeans (see figure 3), I kept the API-style comments, which I wrote using the YUIDoc style. While this can make the Docco documents perhaps less beautiful, and duplicates documentation to an extent, it does create a clear distinction as to what each documentation type is for and, as in this case, the two can be complementary. The literate approach seems to work well for describing the flow of a program, while API-style documentation is more useful for autocompletion and lookup.
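To make the distinction between the two comment types concrete, here is a deliberately simplified sketch of the prose/code split. It is not Docco’s actual implementation, and `splitSections`, `sample` and `clamp` are hypothetical names invented for this illustration. It assumes that lines starting with `//` are narrative and that everything else, including a YUIDoc-style docblock, counts as code.

```javascript
// A simplified sketch of the prose/code split; NOT Docco's real implementation.
function splitSections(source) {
  const sections = [];
  let docs = [];
  let code = [];

  // Store the section collected so far (if any) and start a fresh one.
  const flush = () => {
    if (docs.length || code.length) {
      sections.push({ docs: docs.join("\n"), code: code.join("\n") });
    }
    docs = [];
    code = [];
  };

  for (const line of source.split("\n")) {
    // Single-line comments are treated as narrative.
    const match = line.match(/^\s*\/\/\s?(.*)$/);
    if (match) {
      // A comment line that follows some code starts a new section.
      if (code.length) flush();
      docs.push(match[1]);
    } else {
      // Everything else, docblocks included, stays in the code column.
      code.push(line);
    }
  }
  flush();
  return sections;
}

// Hypothetical input: one narrative comment plus a YUIDoc-style docblock.
const sample = [
  "// Clamp the learner's value so the sprite stays on screen.",
  "/**",
  " * @method clamp",
  " * @param {Number} value",
  " * @return {Number}",
  " */",
  "function clamp(value) {",
  "  return Math.max(0, Math.min(100, value));",
  "}",
].join("\n");

const sections = splitSections(sample);
```

With this input, the narrative comment ends up in `sections[0].docs`, while the docblock travels with the function in `sections[0].code`. That mirrors the behaviour described above, and also shows where the duplication between the two kinds of documentation comes from.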
I do not consider that code is complete without documentation. I consider a runbook (usually a README.md) to be absolutely essential and I generally use docblock-style annotations to mark up my code. I have always tried to write readable code, but should I take another step and work in a literate manner?
I found that the type of literate programming encouraged by Docco does lend itself to a different way of thinking. I was considering more how people would read the code, rather than simply how the computer would deal with it and how I could communicate the API. While I always consider readability, I don’t think I’ve ever gone this far before. I think that jScene could indeed be refactored to “read better” and make a better “story”, although this would take significant time.
I believe that this is generally understood of literate programming – that it does take longer. It is a significant overhead to add this level of detail in documentation and as such I am unlikely to use it on everything I do, but for a tool that a lot of people might want to understand, particularly a learning tool like jScene, this was a very strong way of documenting the code.
I think that for “flowing” code that is fairly linear, literate documentation is the perfect style. For object-oriented programs, however, which rely largely on message passing, or systems with large amounts of emergent behaviour, it may simply be too confusing to try to impose any kind of narrative. Nevertheless, even if you don’t go the whole hog with a narrative code structure, having documentation alongside the code in this manner can add a lot of clarity, and also prevent the docs from obscuring the actual code.
Whatever its strengths, though, there is no escaping the fact that literate programming is a big maintenance headache. There is simply more to maintain!
Another problem, specific to Docco, is the packaging of Node.js (which is a dependency for Docco). Ubuntu’s packaged version of Node.js is badly out of date at the time of writing, so on our #dev IRC channel, I asked about the best way to install Node.js:
13:33 <matt> gavd node.js installation is a problem. Either that ppa or build from source. For dev I’d build from source.
13:38 <gavd> matt: thanks. As it’s local-only and I’m only using some tools it’s not an issue right now, but clearly if we wanted to CI/deploy it would be a pain.
This is not a fault with Docco but a limitation of how Node.js is currently packaged. Until stable packages for Node.js start coming through on reliable repositories, I’d be hesitant to rely too much on it.
Overall, I felt that this worked very well for jScene, as it’s something that enthusiastic beginner programmers might wish to look at in detail. The literate style lends itself incredibly well to this type of project. Docco is a nice, simple tool, but it’s important to remember that ultimately literate programming is far more reliant on the coder than it is on the tool.