The Single Responsibility Principle
Finally a definition for mere mortals - Radical Object-Orientation 101
There is no question about how highly the Single Responsibility Principle (SRP) is valued among clean code developers. It’s the first of the SOLID principles for a reason.
However… I have yet to meet a developer who’s able to succinctly explain what the SRP actually means. Especially hard to define seems its core term responsibility.
There is not lack in developers trying hard to apply the SRP. What to what use? Without a crystal clear definition the results will be at least mixed.
Unfortunately Robert C. Martin is not really a help here. I’m sure he knows exactly what he meant when he wrote in the same blog post:
The Single Responsibility Principle (SRP) states that each software module should have one and only one reason to change.
[…]
Gather together the things that change for the same reasons. Separate those things that change for different reasons.
[…]
[R]emember that the reasons for change are people. It is people who request changes.
And four years later on Twitter:
In order for a function to do “one thing” it must be so small that no meaningful function can be extracted from it. Any function from which another can be extracted clearly does more than “one thing”.
I’m really willing to understand. But “one thing” is still eludes me. Likewise “the one reason to change”.
It’s the decisions, stupid!
Only after banging my head to the wall for days and then switching to week long mediation retreats in remote Tibetan caves I found enlightenment! It’s so, so simple. And Robert C. Martin was so close in his blog post.
Indeed, the SRP is properly or at least succinctly named. Because it’s really about responsibilities. But responsibilities for what?
What the principle is asking for is that code is taking up responsibility to implement decisions. The people Robert C. Martin is referring to have made decisions. And these decisions can be found in the requirements. Requirements documents are collections of decisions some people — stakeholders — want to see executed by code.
Take these requirements from Robert C. Martin for example:
[Write a] single static function named wrap() that takes two arguments, a string, and a column number. The function returns the string, but with line breaks inserted at just the right places to make sure that no line is longer than the column number. You try to break lines at word boundaries.
How many decisions can you spot in this short paragraph?
Let me try…
The whole software is about changing the layout of a text — instead of, for example, counting the number of lines.
The target text layout revolves around lines of text — instead of, for example, words or letters.
The lines in the target text are separated by “line breaks” (presumable \n characters) — instead, for example, dollar signs or dashes.
The lines in the target text must not be longer than a certain number of characters.
The target lines must not end with a word “ripped apart”. If that would happen the word is not included in the line so that the line will be shorter than the maximum line length.
But what if a word itself is longer than the max. line length? This is not spelled out but can be glean from “You try” in the last sentences: If a word cannot be taken to the next line in full, then the word itself is indeed “ripped apart” to that the first part (head) does not exceed the max. line length. The remaining part (tail) is taken to the next line.
There is no mention of whitespace in the requirements. Does the input text contain spaces? Sure. It’s what separates words. Does it contain \t or \n? That’s not stated. The developer needs to fill in this decision (or go back to the client and ask).
It’s not stated how to deal with the whitespace in the input text. Should all the whitespace between words be retained in the target text? The developer needs to fill in this decision (or go back to the client and ask).
It’s not mentioned, but the target text will be left aligned. Since there are other ways to align text (e.g. right, center, block), this is a decision, albeit an implicit one.
Maybe there are even more decisions hidden in the requirements. But at least these can be easily uncovered — and each needs to be implemented with code. Some piece of code is required to take up the responsibility for each and every decision, be it large or small.
And since people tend to change their minds it’s obvious that each and every decision might be changed in the future. That’s what Robert C. Martin meant.
A new definition for the SRP
The SRP is about decisions made by stakeholders, first and foremost the customer, but also the developer. But since it’s taking the view of the code, it’s not talking about decisions, but about responsibilities.
What does that mean for developers? How does the principle guide coding and lead to clean(er) code? A fresh definition is needed, I think. And I’m venturing to provide one:
SRP 2023:
The implementation of every decision found in requirements shall be assigned an identifier.
Or even shorter:
Every responsibility shall be labeled with an identifier.
I find that very simple, very straightforward. All terms are well known. Still, let me break that down for you:
Decision: A statement in the requirements defining a property a software should have. It’s a decision because if the property is X, then someone deliberately decided it to be so, and not Z or Y.
Responsibility: It’s the developers task to write code which implements the decisions found in the requirements. This code then has the responsibility to comply with the decision.
Identifier: An identifier is a name given to a value, a data type, a function or a module (e.g. class, library etc.). The more intention revealing a name, the better.
This definition solves three problems:
It uses clearly defined terms.
It covers/explains what was meant originally:
“one reason to change” always pertained to decisions. If a piece of code is only responsible for the implementation of one decision then there is only one reason for it to change, namely when this decision is revised.
“the reason for change are people” is obvious now because people are making decisions.
Originally the SRP referred to “modules” as in “A module should do one thing, do it well, and do it only.” or “A module should have one and only one reason to change.” But what is a “module” in the first place? And then, why only modules? Because if only whole modules would need to focus on a single responsibility, then it would be hard to encode hierarchical responsibilities.
But with the above new definition that’s not a problem anymore. Because identifiers exist on different levels of abstraction. They can refer to small responsibilities as well as large ones, depending on which code structure is used and named by them.
Example
Even though I find the new definition very crisp it might be even more convincing when applied to an example. So, why not do this with the kata mentioned? There is a list of decisions already. What’s missing are identifiers for them.
Even though it’s a small problem, it’s a standalone problem. That means I’d wrap it into a module. The module level of choice, for me, is a class (decision (1)):
The whole responsibility to implement the decision about a software to change the layout of an input text has now a home. It’s labeled with an identifier. It’s the biggest decision made within the requirements. In fact it’s the decision from which all other decisions spring; the mother of all decisions.
A class is a great home, but cannot be executed. To actually run code which does the job of changing a text’s layout a function is needed. The one function responsible to carry out all decisions made regarding the new layout was already named in the requirements (decision (1)).
And not only the execution responsibility has now a name. Also the main data values have an identifier (e.g. decisions (1) and (4)). The SRP is not limited to logic! Decisions can pertain to logic (functional requirements, even non-functional requirements) or data.
Speaking of data: the requirements mention words (decision (5)). That decision needs to be represented; and also the production of words since there are several decisions involved with that ((7), (8)) and words are not part of the data already present.
Also, since now the original input has been disintegrated, all the decisions regarding the new layout need to be represented by an identifier related to the max. line length as well as the newly extracted words.
So far, no lines are visible (2). Some piece of code needs to take up the responsibility of forming them (assembleLinesFromWords()) and separating them in the target text by \n (3, fuse()). That’s a task within the larger scope of the reformatting decision:
Have all decisions been taken care of? Since the decisions are forming a hierarchy — bigger, wider, more fundamental decisions containing smaller, narrower, more specific decisions — they are naturally implemented with a hierarchy of identifiers given to code elements which can form a mirroring hierarchy.
Wrapper{} - representing decisions (1)..(9)
wrap() - (1)..(9)
splitIntoWords() - (7), (8)
reformat() - (1), (2), (3), (4), (5), (6), (9)
assembleLinesFromWords() - (1), (2), (4), (5), (6), (9)
fuse() - (3)
Identifiers for implementations of decisions on a higher level of the hierarchy of course are associated with more decisions than those on a lower level. But how many decisions are ok? The SRP states: just a single decision per identifier is the goal. That means, all leaf functions in the above hierarchy with more than one decision attached to them need further refinement.
Obviously splitIntoWords() and assembleLinesFromWords() hence need a closer look. Especially the second. But let’s start with the first. How can the whitespace decisions about the input text be represented?
In this case I chose to not wrap them in yet another function, but just represent them with a variable name. It’s enough to make clear what’s happening in the respective lines.
Yes, the logic is not testable. But it’s so small that I trust it’s ok to just test the encompassing function.
Traditionally splitIntoWords() would not be deemed to follow the SRP. But to me it’s ok to integrate several decisions in that manner if each data identifier represents only a single decision — which they do in this case.
The first one is about decision (7) and clearly states whitespace in the input will henceforth be ignored. The second one is about decision (8) and states that all consecutive non-whitespace characters are considered words. If one of these decisions should change, it’s clear where to adapt the code.
Finally what about creating the lines for the new layout? There are still a lot of decisions to untangle.
The number of decisions and their complexity are too large to represent them just as variable identifiers with some logic attached. Functions are required.
splitLongWords() - (5), (6): This function takes care of words which are too long to fit into a line by themselves. They are split into “syllables”. From then on all parts from which the new lines are be build are less or equal to the max. line length.
assembleSyllablesForLines() - (2), (4): Here the long list of syllables is split into chunks each containing only as many syllables as will fit into a line of the given maximum length (plus at least a single blank between them).
align() - (9): The final function creates text lines from the syllables that should go into them. As long as just left alignment is required the max. line length does not even need to be passed in. But for right/center/block alignment it’s necessary. align() is responsible for putting the right number of blanks between the syllables.
All functions are now very focused. Even though some still represent two decisions I would not further split them into smaller functions. At least not without actually implementing the logic — which I’ll leave to you as an exercise. I’m pretty sure each function won’t exceed just a couple of lines of straightforward logic and will be easy to test if need be.
But where is the implementation of decisions (1) and (2)? assembleLinesFromWords() lists them as its responsibilities and none of the called functions are taking them up.
It’s assembleLinesFromWords() who’s fulfilling them. It does so by integrating the other functions. Not only logic, but integration, too, is a form of composition in order to fulfill a responsibility. See this article for details on that:
Summary
That wasn’t so hard, was it? The SRP is a very, very simple principle — and a very powerful one. At least once you uncover the crucial term decision.
What it then does for you is guide you through requirements analysis; it sends you on a hunt for decisions of all sizes.
Once you have them in a long list, organize them in a hierarchy from large to small.
Then label each one with an intention revealing identifier and assign it a code structure mapping. The responsibility for implementing a decision can be encoded in modules, functions, and even expressions (or data types). All these structures are at your disposal. For me the SRP is a great help for Radical Object-Orientation, i.e. object-orientation how it was envisioned by Alan Kay. It goes nicely hand in hand with the IOSP.
If you’re faced with existing code ask yourself: “Which decisions are encoded by this piece of code?” If it’s more than one than it better be not the leaf of a code structure hierarchy.
The SRP is a great tool for analysis, design, and refactoring.