Why Self-Documenting Code Still Needs Comments

This article was originally published on Medium.

Comments in your code provide context. Repeated research shows that context supports learning, and learning is the pathway to understanding. That context is critical to helping other developers learn and understand your code, no matter how “self-documenting” you think it is.

There’s a common argument in programming circles: does self-documenting code mean no comments?

You’ll hear those on the “Yes” side of the argument cite Clean Code, Pragmatic Programmer, Clean Architecture, or Clean Craftsmanship. I’ve even recently seen someone so fervently on the no-comment side of the argument say to read those titles, “cover to cover. Then go back and read them again.”

Now, I have to be honest when I say that I haven’t read all of those titles — and I certainly haven’t read them twice. I do, however, have real-world experience with large corporate clients, and I do not agree with the idea that comments should be rejected outright.

Let’s step back and look at the value of self-documenting code. What if I were to write the following pseudocode:

float velFinal(float v0, float a, float t) {
    return v0 + (a * t);
}

Pretend you have a simple (and admittedly terrible) function like the one above. Its intent is to calculate the final velocity of an object. Written this way, though, it gives you almost no context, and you would have to go looking elsewhere to work out what the formula is doing.

Self-documenting code matters because it addresses exactly these issues. For example, what if I changed the above to something like:

float computeFinalVelocity(float initialVelocity, float acceleration, float timeInSeconds) {
    return initialVelocity + (acceleration * timeInSeconds);
}

Now someone who sees this function will know that it computes the final velocity, that it takes an initial velocity, an acceleration, and a time, and that the time argument is in seconds. This is context that helps us understand. We can establish communication standards that are agnostic of the language in which the function was written. Neat!
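
To see how the same names carry over to another language, here is a minimal sketch of the function in Python. The input values (0.0, 9.81, and 2.0) are illustrative assumptions, not figures from the original example, and the units for velocity and acceleration are not stated anywhere in the original, only that time is in seconds.

```python
def compute_final_velocity(
    initial_velocity: float, acceleration: float, time_in_seconds: float
) -> float:
    """Return the final velocity under constant acceleration: v = v0 + a * t."""
    return initial_velocity + acceleration * time_in_seconds


# Keyword arguments keep the context visible at the place the function is called.
# The numbers below are made up for illustration.
final_velocity = compute_final_velocity(
    initial_velocity=0.0,
    acceleration=9.81,
    time_in_seconds=2.0,
)
print(final_velocity)
```

Notice that the names tell you the time is in seconds but say nothing about the units of the other two arguments. That gap is the kind of thing a name alone may not cover.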

I emphasized the word “context” for a reason. This was a painfully simple example, but context and humans go together like spaghetti and meatballs. I really love spaghetti and meatballs.

But what if you have a completely different function? In this scenario, imagine that the client uses ambiguous terminology in their organization. Maybe they have a product called “Acceleration” that is nowhere near the definition from physics. You could, and should, name the function something clear and make changes similar to the ones above so the code is self-documenting. But is that an answer for every situation that may come up with a client?

Imagine that the following happens in a similar scenario:

function computeSomethingImportant(ProductData productData) {
    …
    subOptimalAlgorithm(productData)
    …
}

Now, what if you knew there was a better way to perform that algorithm, but you also knew it would take a team of three at least four weeks to build? Meanwhile, you need to put in your change by tomorrow at lunch, or the project manager is going to get so upset about another red light on their progress board that they accidentally throw coffee at their dog on camera, again.

The point is that no amount of self-documenting code in this function can explain why you just put up a PR with such a naïve algorithm in your beautiful and perfect code.

What if you took that same example and changed it to:

function computeSomethingImportant(ProductData productData) {
    …
    // TODO: suboptimal algorithm used due
    // to some constraint. Replace with much
    // better shimalamablam algorithm
    // ASAP to avoid total PM meltdown in September
    subOptimalAlgorithm(productData)
    …
}

Now you’ve got a nice note that says why you did what you did, what someone can do to fix it, and why it will be good to fix. That’s a tall order to fill with just good variable names.
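
For a version you can actually run, here is a small sketch in Python (3.9 or later). Everything in it is hypothetical: the function name `find_duplicate_ids`, the deadline, and the suggested replacement are stand-ins for the elided example above, not the real code. The comment tries to cover the same three things: why the shortcut was taken, what to replace it with, and why that is worth doing.

```python
def find_duplicate_ids(product_ids: list[str]) -> list[str]:
    """Return each ID that occurs more than once, ordered by its first repeat."""
    # TODO: This nested scan is O(n^2). It was used because it was quick to
    # write under a deadline (a hypothetical constraint for this sketch).
    # Replace it with a single pass that tracks seen IDs in a set, which
    # would be O(n), before the input lists grow large.
    duplicates: list[str] = []
    for index, product_id in enumerate(product_ids):
        # Scan everything before this position to see if the ID already appeared.
        already_seen = product_id in product_ids[:index]
        if already_seen and product_id not in duplicates:
            duplicates.append(product_id)
    return duplicates


print(find_duplicate_ids(["a", "b", "a", "c", "b", "a"]))
```

A name like `find_duplicate_ids` says what the function does. Only the comment says that the slow approach was deliberate and what the intended fix is.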

Of course, this information could be covered in the docs, but we all know that the documentation could stretch from Baltimore to Cincinnati if laid out on the ground.

The comments themselves will help other developers understand the situation by providing extra information. The reason I think you need this extra information in most cases is simple. The person looking at the code is often not the person who wrote it. That’s it.

This elementary idea opens a doorway to a new perspective filled with another person’s background, assumptions, biases, and experience. In other words, it forces us to write more empathetic code. The way we learn can be accelerated by the empathy of others through context.

Context supports learning, and learning supports understanding. Without context, then, learning and understanding are unnecessarily slow. That can waste thousands of dollars once you account for the cost of a developer’s time.

To help solidify this idea, let’s assume there is a new developer who starts working on the super important function from before. Because of the comment, they can fairly quickly speak about where the optimization will go and roughly what to do. They went from lost to informed in a matter of hours or days instead of weeks or months.

Comments are the catalyst for context, but what is context, and how do we know it supports learning and therefore understanding? Peer-reviewed research papers and books, cited by hundreds of other researchers, support the claim that understanding the conditions around an environment reinforces learning. One such book, “Context and Learning,” edited by P. Balsam and A. Tomie (originally published in 1985 and reissued in 2014 by Psychology Press), says this about learning and context in the first chapter:

“Learning occurs in a cognitive or associative context of what has been learned before and in an environmental context that is defined by the location, time, and specific features of the task at hand.”

The book focuses on the “specific features of the task at hand.” In other words, research has shown that context around a specific task is key to facilitating the learning process.

Well-placed, well-written comments teach developers while they are in the middle of learning.

Am I saying that we shouldn’t write code that is self-documenting and easily understood? Absolutely not! We should always be striving to write cleaner code. But to simply dismiss comments because they look messy or get out of date is a dangerous road to go down. It is a road of missed opportunities and a dismissal of how we learn as humans.
