Cohesion - Part 2 Refactoring Towards Cohesion
In the second part of this series about code cohesion we’ll look at how we can refactor an existing class to be more cohesive and look at the benefits of doing so.
Previously we were examining a class,
DiscountCalculator, and we determined it to have
a Lack of Cohesion of Methods (LCOM) score of 0.53. We want this score to be as close to zero as possible
so need to refactor this class to improve its cohesiveness. High cohesion is achieved by keeping methods and
the fields they operate upon together in the same class; therefore to improve cohesiveness we have two main
refactoring strategies - extracting methods and extracting fields.
Let’s quickly just review the
We can see that there are three fields and five methods. Doing a quick count we can quickly see which methods are least cohesive.
It’s fairly obvious that the
FormatValue() methods are the least cohesive as they use no
fields at all. In fact, there’s a pretty big warning sign right there in the code; both these methods are marked
static meaning they belong to the class and not to instances of the class, therefore they are unlikely to
access instance fields. Additionally, look at
FormatTaxAmount(), this method really has nothing to do with calculating
discounts and is not called by any of the other methods; it’s probably only here so that it can use the formatting
provided by the
So, what to do?
Well, the most straightforward refactoring we can make is extract those methods into another class and reference
them from the
DiscountCalculator; like so:
And then we can re-calculate our LCOM. We removed two methods, so M is now 3; the other variables remain the same:
1 - (sum(MF) / M * F)
1 - ((2 + 2 + 3) / 3 * 3)
1 - (7 / 9)
1 - 0.22
Much better, we’re much closer to zero. But hang on a second, not so fast, look at the
we just created, it’s entirely static so I’d guess not very cohesive. Let’s do a quick LCOM on that class
M = 2, F = 0, SUM(MF) = 0
1 - (0 / 2 * 0)
MoneyFormatter class is completely uncohesive with the worst possible LCOM score of 1.
Even worse if we calculate the average cohesion of these two classes together we get a score of 0.61 which
is even worse than our original score of 0.53.
Let us see if we can improve
MoneyFormatter. If we look at the two methods they both operate on a single
decimal so maybe we change this to an instance field and improve the cohesion. I am also going to change the
name of the class to
Money to describe its new responsibility.
However, when I try to integrate the
Money class into
DiscountCalculator I find that there are some issues around
calculating the discount. Therefore I need to add two new methods to
DiscountCalculator now looks like this:
Money class is fully cohesive, scoring 0 for LCOM. Our average cohesion is 0.11; this is pretty good and may be good
enough in many scenarios - but in this case I think we can do better.
The second technique we’ll look at to improve cohesion is to extract fields into another class. When we do this we want to move two or more fields and replace them with a single field. Look at your class and try to identify related fields that should be grouped together - such groups of fields are often referred to as Data Clumps.
If we examine the fields in
DiscountCalculator we have
customerName. It seems fairly obvious that
_customerName belong together, probably in some sort of
Customer class. Let’s do that:
And integrate it into
And yay, we’ve finally managed to get an LCOM score of zero for
DiscountCalculator - we have two fields and they are both
used in every method. The LCOM for
Customer is 0.34 and, disappointingly, the average LCOM remains at 0.11. To fix this we
are going to have to move some of the behaviour into the
Customer class. But to start with we’ll look at
which is currently defined as an
The first move is convert this into a real class and move the responsibility for calculating the discount into it:
And update the
GetDiscount() method in
I don’t like the Demeter violation in that method so I add a passthru method
DiscountCalculator to use it:
These changes so far have not improved our LCOM score, they were just necessary to get the code is a better state for my final refactoring.
Looking at the
FormattedTotal property in
DiscountCalculator I can see that this behaviour actually belongs to the
So I move it to the
This now gives our
Customer class an LCOM score of zero, so we are done.
Let’s take a final look at our
We’ve pretty much ended up with a class that does nothing and I’d probably look to remove it and get clients to interact
directly with the
Customer class instead.
So, what have we achieved? By concentrating on improving the cohesion in our original
DiscountCalculator class we have
spawned additional classes for
CustomerStatus to replace the original confused class. In doing so
we have the improved the design by putting the correct responsibilities into highly-focused classes. In other words, we have
fulfilled the Single Responsibility Principle.
One of the most common criticism of the Single Responsibility Principle is that it is not clear how big the single responsibility should be. As an extreme example, I could write my entire application in a single class and it would still only have a single responsibility, that being “running my application”. But by concentrating on the cohesion metrics we can correctly size classes to manage just their data and apply the Single Responsibility Principle as it was meant to be.
Another design improvement we’ve made is to remove the reliance on simple-types; we achieved this by replacing them with classes. So if
you look at the original
DiscountCalculator it used
decimal for the values,
string for the customer and an
enum for the customer
status. In the completed version these have been replaced with
In the final part of this series I will look how we should apply cohesion at the architectural level.