The Art of Refactoring I: Evaluating The Code Blob
Have you ever come across some code that you knew seemed…wrong, but weren’t really sure of a strategic way to fix it?
Perhaps you’ve heard of the term refactoring, but where did that get you?
What does refactoring actually mean?
Cleaning up messy code?
Improving organization and architecture?
Perhaps just a vague: making da code better?
As devs, when we hear a term or phrase often enough (e.g., dependency injection, thin controllers, refactoring, and unit testing), we know that it’s probably important, but we don’t always know why or how best to apply it.
Can we formalize refactoring into a clear process that we can apply to messy code, with reasonable assurance that it will come out better on the other end?
The Art of Refactoring Series
This article starts the beginning of a three-part series that walks through a process of refactoring a code mess into something you could be proud to check-in.
We won’t just be looking at how to clean up the code, but also the why.
To become better developers, we need to understand why something could be a problem before we can fix it.
It’s a code-sense muscle that we can strengthen.
And hey, that’s the whole point of this blog and newsletter after all.
In this first part of the series we will look at a theoretical set of business requirements for an imaginary car insurance company and take a peek at the blob of code that came out of its implementation.
Then, before actually fixing said code blob, we will discuss a subset of common pain points you can look for, that will provide us with the tools needed to begin cleaning things up.
In Part 2, we will apply the tools we learned from Part 1 to the code blob to establish the first step in our refactoring process: refactoring for clarity.
We’ll play doctor with it, taking in the entire context of the application, our customer’s needs, the likelihood of change, and think about how we can turn it from a blob into a beauty.
Finally, in Part 3, we will take our freshly clarified code and do a final analysis to determine if we need to go one step further in our refactoring process: refactoring for abstraction.
Some Business Requirements
For our purposes, you can imagine we are working on the software that is running behind a car insurance website like GEICO or Progressive.
In this app, a user will be logging in to make payments on their insurance premium. They can either pay with credit card or with an ACH checking or savings account. They also have the option to pay in person with cash.
As payments are entered into the system, it was decided that the car insurance company would need to charge a convenience fee to cover the cost of using the various credit card and ACH processors.
Below is a list of requirements that were given to the development team over time.
As is what normally happens in the art of software engineering, requirements aren’t always known in full at the start.
Processes change, edge cases are forgotten, solutions in themselves can cause additional problems.
So to simulate that, imagine that each bullet point is a new requirement that was given several months after the previous one.
Requirements List
- All credit card payments should include a flat $3.00 convenience fee.
- All ACH payments should include a fee equal to 10% of the payment amount.
- Cash payments should not include a fee.
- We’d like to have smaller fee amounts for payments that are automatically generated by users who set up the monthly auto-pay feature. We think smaller fees will help to incentivize that feature.
- We would like the ability to override the fee with a different amount for certain users who are NOT using the auto-pay feature to reward them on occasion for different reasons.
- It was found that each U.S. State has a law for what the fee can be for credit card and ACH in that State.
- We would like credit card payments to use a tiered model so that payment amounts below a threshold use a flat fee and above the threshold use a percentage of the amount as the fee.
- We would now like the fee override mechanism to also override the auto-pay fee rather than only manual payments.
A Code Blob Emerges…
As you can see, the feature started out quite simple, with nothing but hardcoded fee values based on the method of payment.
Over time, however, the complexity has grown, as it tends to do.
The developers did not take the time to write code with care.
Because of this, the fee calculation code is increasingly difficult to change.
This is a problem because the company expects continued expansion of the convenience fee requirements.
Let’s look at the code…
But… This wouldn’t happen to me!
As this feature grew in complexity, you may be able to tell that the enhancements were thrown into this method in a very single-pointed way.
The developer knew what the request was, found where it should go, and added it.
They didn’t take much time considering how to best fit the change into the existing code. You can even see comments left from earlier requirements that no longer apply.
In an ideal world, we try to leave code better than we found it.
But in the real world, timelines, differences in developer skill sets, and blatant laziness can easily result in the kind of code you see above.
But it gets the job done, and isn’t that all that really matters?
Let’s answer that question now.
Refactoring Phase 1: Evaluation (is it good enough?)
Now that we have the code in front of us, we can begin to evaluate it. Before we actually tear it apart and create something new, let’s go over a series of questions that will become the basis for how we decide what to change.
This is certainly not an exhaustive list, but it will cover a large portion of common problem points in code.
Ideally, these questions become engrained into the very fabric of your being so you no longer have to think about them. Your code takes them into account before your keystrokes reach the editor.
Is it likely to change?
For this question, you’ll likely need to consult with your users or business analysts. Can they anticipate that additional changes to the convenience fee logic will be needed in the future?
If the answer is a definite no, you could then decide that, based on other priorities, no refactoring of this code is necessary!
Some of you might be saying:
WHAT? BUT THAT CODE IS SO FUGLY, BURN IT TO THE GROUND YOU FOOL!.
While it’s true that there is certainly work to be done here, evaluating how some code should be changed always must include the option to just leave it the hell alone.
That is the business of software.
A fine balancing act, weighing ideals of code so clean it makes Uncle Bob shed tears of beauty vs. just gettin’ that shit out the door so customers can actually use it.
Sometimes, when code is battle-tested through years of usage, and it will likely never change, you can make the decision to leave it alone. No need to risk breaking it by cleaning it up when you may never have to touch it again.
In our case, however, the business tells us that they do anticipate a lot of changes to this calculation in the future. Because that is the case, it then becomes a much bigger priority to make sure this code can handle the changes gracefully.
Knowing that, will a change be easy for another developer who may not have seen this code before?
Further, is the code unit tested, protecting the behavior of the class so that changes can be made with some security?
Start with readability
Once we’ve decided this code will likely change, the next step is to begin looking at how we can make our lives easier when those changes become required.
To make future changes easier, we start with readability.
If code is difficult to read and understand, modifying it will be risky and potentially costly!
What are some practical questions we can ask that reveal readability trouble spots?
Is the intent of classes, methods, and variables clear by their names?
Naming things in code is an art in itself.
Names are the way by which our code can be self-documenting. It can read like fine prose, clearly explaining its intent, or it can be utterly vague, requiring developers to scan it multiple times to figure out what is going on.
Good names allow us to reason about the code so that we can change it without breaking it.
Are method argument and variable names written with nouns that represent their intent?
Looking at the code above, is it obvious what everything does?
When it comes to names, I say no shame in using as many characters as needed to describe what something represents. In Part 2 we’ll make some changes that will show you what I mean.
Do methods use a verb + noun combo to describe the action being performed?
I take the Uncle Bob stance on method names in that they should never be nouns. Methods are by their nature an act of processing. Getting, setting, calculating, sorting, etc.
I’d add to that and say it can often help to put what the action is acting on if it is not clear by the method arguments.
This Calculate
method might be clearer as CalculateFeeForPayment
, or perhaps simply GetConvenienceFeeForPayment
since there is not much calculating going on behind the scenes. In the same vein, the class name itself might need improvement. Using the ForPayment
suffix, fluently allows the reader to assume the method arguments represent payment details.
Are there any magic numbers that could be replaced with enumerations or constants?
A developer reading the code above would have to assume by the comments and behavior that method == 2
represents credit cards. Clients of the class would have to know that to.
By using enumerations or constants we can clarify the meaning of the number to save on the mental processing cycles of our future developers.
Are method arguments easy on the client?
Ideally, methods are easy to understand and use based on their name and parameters (the method’s signature).
Does the method need too much from its clients?
Do the types of the arguments force dependencies on the clients? The client’s of our calculate method are responsible for coming up with a full User
object to pass in. Don’t be a dependency slob!
Is that a reasonable requirement?
What if we want to reuse this method in other areas of the code that can’t easily create User
objects?
Is there some way we can shorten the number of parameters required? The more parameters a method has, the more mental RAM the person reading the code must use to understand and reason about it.
Can you see duplicated logic that could be consolidated?
Code duplication is a problem because any time you need to change some logic in the duplicated area, you have to change it n times, where n is the number of duplications.
Imagine if this application had several different locations in the UI that would allow a user to make a payment. If this logic was written again for each area, you would have to remember to change each area whenever the convenience fee code needed a change.
It happens… Trust me.
Imagine if a different developer wrote it each time!
In this case, we fortunately only have one static class that calculates the fee, but we still have a lot of duplication within it.
Are the methods too big?
The infamous single responsibility principle says that classes and methods ideally should only do one specific thing.
They should only have one reason to change.
Further, they should only handle one level of abstraction. We’ll discuss what that means more in Part 2.
Just by the fact that we need more than one sentence to describe this code, tells us that we are doing more than one thing, so this method is too big.
Breaking it down into smaller methods is valuable because it is easier for a human to understand what it is doing, which will make future changes easier and less likely to break the system.
Looking at this code, can you see how it is comprised of many sub-tasks, all lumped into a single method?
The Art of Refactoring Part 2: Refactoring For Clarity
Now that we’ve covered a set of questions to ask, let’s apply them to the code blob and actually begin cleaning it up.
In Part 2 of the series, we will be refactoring for clarity, the second phase of our three-phase refactoring process.