Refactor Your Codebase as You Go, or Lose it to Early Death
Also, Scrub Your Teeth Twice a Day
Refactoring is badly misunderstood by many software professionals, and that misunderstanding causes software teams of all kinds – traditional and agile – to forgo refactoring, which in turn dooms them to waste millions of dollars. This is because failure to refactor software systems continuously as they evolve really is tantamount to a death sentence for them.
To fail to refactor is to unwittingly allow a system to decay, and unchecked, nearly all non-trivial systems decay to the point where they are no longer extensible or maintainable. This has forced thousands of organizations over the decades to attempt to rewrite their business-critical software systems from scratch.
These rewrites, which have their own chronicles of enormous expense and grave peril, are completely avoidable. Using good automated testing and refactoring practices, it is possible to keep codebases extensible enough throughout their useful lifespans that such complete rewrites are never necessary. But such practices take discipline and skill. And acquiring that discipline and skill requires a strategy, commitment, and courage.
So, First of all: Refactoring – What is It?
The original meaning of the word has been polluted and diluted. Here are some of the “refactoring” definitions floating around:
- Some view it as “gold-plating” – work that adds no business value, and merely serves to stroke the egos of perfectionists who are out of touch with business reality.
- Some view it as “rework” – rewriting things that could, and should, have been written properly in the first place.
- Others look at refactoring as miscellaneous code tidying of the “nice to have” kind – a luxury that should only happen when the team has some slack time, and one we can do without, without any serious consequences. This view compares refactoring to the endless fire-truck polishing and pushups that firefighters do between fires. Busy work, in other words.
- Still others look at refactoring as a vital, precise way of looking at the daily business of code cleanup, code maintenance, and code extension. They would say that refactoring is something that must be done continuously, to avoid disaster.
Of course, not all of these definitions can be right.
The original, and proper, definition of refactoring is that last one. Here I attempt to explain and justify that. But first let’s talk about where refactoring came from as a practice.
What problem does refactoring try to solve?
The Problem: “Code Debt” and the “Cost of Decay” Curve
What is Code Debt?
Warning: Mixed Metaphors Ahead
Veteran programmers will tell you that from day one, every system is trying to run off the rails, to become a monstrous, tangled behemoth that is increasingly difficult to maintain. Though it can be difficult to accept this unless you have seen it repeatedly firsthand, it is in fact true. No matter how thoughtfully we design up front and try to get it entirely right the first time, no matter how carefully we write tests to protect us as we go, no matter how carefully we try to embrace Simple Design, we inevitably create little messes at the end of each hour, or each day, or each week. There is simply no way to anticipate all the little changes, course corrections, and design experiments that complex systems will undergo in any period.
So enough of dental metaphors for a moment. Software decay is like the sawdust that accumulates in a cabinetmaker’s shop, or the dirty dishes and pots that pile up in a commercial kitchen – such accumulating mess is a kind of opportunity cost. It always happens, and it must be accounted for, planned for, and dealt with, in order to avoid disaster.
Programmers increasingly talk about these little software messes as “code debt” (also called “technical debt”) – debt that must be noted, entered into some kind of local ledger, and eventually paid down, because these little messes, if left unchecked, compound and grow out of control, much like real financial debt.
The Software “Cost of Decay” Curve
Years ago it was discovered that the cost of correcting a defect in software increases exponentially over time. Multiple articles, studies, and white papers have documented this “Cost of Change Curve” since the 1970s. This curve describes how the cost of change tends to increase as we proceed from one waterfall phase to another. In other words, correcting a problem is cheapest in requirements, more expensive in design, yet more expensive in “coding,” yet more costly in testing, and most costly in integration and deployment. Scott Ambler discusses this from an agile perspective, noting how some claim that agile methods generally flatten this curve. Ron Jeffries contends, by contrast, that healthy agile methods like XP don’t flatten this curve, but merely insist on correcting problems at the earliest, cheapest part of it. I agree with Ron, but I claim that’s only part of how agility (and refactoring in particular) helps us with software cost of change.
There is a different (but related) exponential curve I dub the “cost of decay curve.” This curve describes the increasing cost of making any sort of change to the code itself, in any development phase, as the codebase grows more complex and less healthy. As it decays, in other words.
Whether you are adding new functionality, or fixing bugs, or optimizing performance, or whatever, the cost of making changes to your system starts out cheap in release 1, and tends to grow along a scary curve during future releases, if decay goes unrepaired. In release 10, any change you plan to make to your BigBallOfMud system is more expensive than it was in release 1. In the graph below, the red line shows how the cost of adding a feature to a system grows from release to release as its decay grows.
The number of releases shown here is arbitrary and illustrative — your mileage will vary. Once more, I am not talking about how, within a project, the cost of detecting and fixing a problem increases inevitably over time, as the Cost of Change curve does. I am saying that we can use the cost of any sort of change (like adding a new feature) to measure how much our increasing decay is costing us. I am using the cost of a change to measure increasing cost of decay.
Back to the dental metaphor. If, in the last few minutes of programming, I just created a tiny inevitable mess by writing 20 lines of code to get a test to pass, and if that mess will inevitably ramify and compound if left uncorrected (as is usually true), then from the organization’s perspective, the cheapest time for the organization to pay me to clean up that mess is immediately – the moment after I created it. I have reduced future change costs by removing the decay. I have scrubbed my teeth, removing the little vermin that tend to eat, multiply, defecate, and die there (I never promised a pleasant return to the metaphor — teeth are, let’s face it, gross).
Again, if a day’s worth of programming, or a week’s worth of programming, caused uncorrected, unrefactored messes to accumulate, the same logic is imposed upon us by the cost of decay curve. The sooner we deal with the messes, the lower the cost of that cleaning effort. It’s really no different than any other “pay a bit now or pay a lot later” practice from our work lives or personal lives. We really ought to scrub our teeth.
Little software messes really are as inevitable as morning breath, from a programmer’s perspective. And nearly all little software messes do ramify, compound, and grow out of control, as the system continues to grow and change. Our need to clean up the mess never vanishes – it just grows larger and larger the longer we put it off, continuously slowing us down and costing us money. But before we talk about how these little messes grow huge, helping to give that cost of decay curve its dramatic shape, let’s talk about the worst-case scenario: the BigBallOfMud, and the Complete System Rewrite.
Worst-Case Scenario: The BigBallOfMud, and the Complete Rewrite
Most veteran programmers, whether working in procedural or object oriented languages, have encountered the so-called BigBallOfMud pattern. The characteristics of this pattern are what make the worst legacy code so difficult or impossible to work with. These are codebases in which decay has made the cost of any change very expensive. At one shop at which I once consulted, morale was very low. Everybody seemed to be in the debugger all the time, wrestling with the local legacy BigBallOfMud. When I asked one of them how low morale had sunk, he said something like “You would need to dig a trench to find it.”
With a bad enough BigBallOfMud, the cost of the decay can be so high that the cost of adding the next handful of features is roughly the same as the cost of rewriting the system from scratch. This is a dreadfully expensive and dangerous outcome for any codebase that still retains significant business value. Total system rewrites often blow budgets, teams and careers – unplanned-for resources must be found somewhere for such huge and risky efforts. Below we revisit the cost of decay curve, adding in a blue line showing how we strive to increase our development capacity from release to release. At best, we can achieve this growth linearly, not exponentially.
At the point where the two lines cross, we have our BigBallOfMud. We are out of luck for this particular system – it is no longer possible to add enough resources to maintain or extend it, nor shall it ever be again. Indeed, the cost of decay, and the cost of making any sort of change, can only continue to increase from there, until it becomes essentially infinite – change cannot be made safely or quickly enough at all.
We are then faced with a total system rewrite, because we have lost all of our refactoring opportunity, along with our ability to make any other forms of change. How many expensive, perilous total system rewrites have you seen or taken part in, in your career? How many “legacy codebases” do you know of that just could not be maintained any longer, and which had to be replaced, at great expense, by a rewrite, perhaps in a new technology or with a new approach, perhaps by a completely new team? I have personally seen several over the years. They have not all gone well.
The Nasty Details
So what makes inextensible code inextensible? What is so nasty about a BigBallOfMud?
A BigBallOfMud fights programmers at every turn. Changes and extensions that were once easy in the codebase have become very, very difficult.
The typical worst-case BigBallOfMud, in the world of Object Oriented languages, has the following kinds of characteristics (for most of this list, I am indebted to “Uncle” Bob Martin, and his book Agile Software Development: Principles, Patterns, and Practices):
- Rampant code duplication (some of it explicit, some implicit), of the kind that requires “shotgun surgery” in order to make changes or extensions. A single change to behavior, instead of being made in one, easy to find location, must be made in several locations that are perhaps not all easy to locate.
- High coupling — lots of modules possessing “hard” dependencies on one another, like a tangled ball of string. This too fights changes and extensions, since a change to one module necessitates changes to many other modules that are unfortunately “along for the ride.”
- Low cohesion, or poor “separation of concerns” — violations of the so-called Single Responsibility Principle. Again, this makes change points hard to find in order to make fixes or extensions.
- Promiscuous sharing of global data — less and less use of the discrete properties and behavior of true objects in the object model, and more and more use of big, static procedural classes with names like “Manager,” “Processor,” “Utility,” “Helper,” “Director.”
- “Opacity” — the code is hard to read, hard to understand, and therefore hard to work with, since we cannot always be sure we are changing the right thing in the right place. This results partly from poor naming standards and practices.
- Very large modules — classes with too many methods, methods with too many blocks of code. Classes with 5000 lines and 100 methods slow you down in the same way a 100-page contract slows you down. It’s not easy enough to find what you need, and once you find it, it’s not easy to change it.
- Increasingly difficult-to-follow flow of control. Cyclomatic complexity is a useful measure here. You just cannot tell very easily, in a complex enough module, how the algorithm works, how the work is getting done, or why it bothers. This leads to lots of wasteful head-scratching.
- Viscosity — this is the overall tendency of the codebase to encourage programmers to expediently make things worse. In other words, it is much cheaper and faster to make things worse in a BigBallOfMud than it is to make things even a little bit better. You follow the path of least resistance, which creates more resistance for the next programmer. (For more info on this, see the Broken Window Theory section below.)
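To make the first smell on that list concrete, here is a minimal, hypothetical Java sketch (all names invented for illustration, not taken from any real codebase) of duplication that forces “shotgun surgery,” alongside the single-point-of-change shape that refactoring aims for:

```java
// Hypothetical "before" code: the discount rule is duplicated, so changing
// it requires shotgun surgery -- edits in every copy, or a silent bug.
class OrderTotalsBefore {
    double reportTotal(double[] prices) {
        double total = 0;
        for (double p : prices) total += p;
        if (total > 100) total *= 0.9; // discount rule, copy #1
        return total;
    }

    double invoiceTotal(double[] prices) {
        double total = 0;
        for (double p : prices) total += p;
        if (total > 100) total *= 0.9; // discount rule, copy #2
        return total;
    }
}

// "After": the rule lives in one easy-to-find place; a change is one edit.
class OrderTotalsAfter {
    double total(double[] prices) {
        double total = 0;
        for (double p : prices) total += p;
        return applyDiscount(total);
    }

    private double applyDiscount(double total) {
        return total > 100 ? total * 0.9 : total;
    }
}
```

The behavior is identical; what shrinks is the number of places a rule change must touch, from two (and growing) down to one.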
Recap: Decay = Bad; Rewrite = Bad
Let’s repeat some key points here:
- Complete system rewrites are enormously expensive and dangerous;
- It takes a while to discover that a system actually requires a complete rewrite, and during that time, the cost of decay curve is working its evil on your system and your team, making every sort of change more expensive;
- Defeating the nasty cost of decay curve takes will, discipline, specific practices, lots of learning, and great skill.
OK, So How and Why Does this Happen?
Aren’t These Messes Avoidable? Why not Just Do it Right The First Time!?
It’s reasonable for non-programmers to ask why programmers don’t just write the code properly in the first place. Why create these messes? Why not just work cleanly and simply in the first place? If you are balancing up-front-design with continuous design, if you are working test-first, if you are avoiding speculative design, then why in the world do these messes happen?
It’s a good question. There are several parts to the answer.
Part One: Extensibility is Where it’s At
In software this inability to anticipate all of a system’s long-term needs is why we choose to work Test First, and to embrace Simple Design in the first place: because we cannot anticipate every small and large demand that will be placed on us and our system during its entire useful life. We use Simple Design to be ready for anything and everything. Our exhaustive unit tests, our simple, clear design – these things protect us from unanticipatable change. They enable us to turn on a dime no matter what business requirements arise. They help us flatten the cost of decay curve, by preventing most of the decay.
And it turns out (as we’ll summarize below) that Continuous Refactoring is a big part of how we do that. Refactoring is the mechanism we use to keep small messes from getting out of hand, and the mechanism we use for introducing extensions and changes in a clean, clear, extensible way.
Part Two: The ToC and Eliminating Bottlenecks
The Theory of Constraints (ToC) talks about finding and eliminating bottlenecks. It’s easy, at first, to think of refactoring itself as a bottleneck – a place where we are “touching the product more than once” where once really should suffice. By this logic, if we eliminate refactoring, we are improving efficiency by eliminating a bottleneck.
But in fact, that’s the reverse of the case. Refactoring is not a bottleneck, but our best means of finding and eliminating bottlenecks!
To see this a bit more clearly, you need to think about software using a slightly better metaphor.
Though many have argued quite well that a mechanistic, manufacturing or engineering metaphor works terribly for software, there is one way in which a factory metaphor works pretty well for us here.
Indeed we should not have to implement a given feature in a software system twice – that would indicate that we had misunderstood something vital in the requirements, or something vital in the design, in the first place. That would indeed count as rework and waste. So using a ToC model, these features are our products for which we want most efficient production flow, and least amount of touch.
But the software system we are working on is not the products (features) themselves, it is our factory for those products.
Suddenly refactoring looks much more sensible. In order to continue to produce many different kinds of products across many different releases, we need to keep our factory as lean and clean as possible. The factory will need continuous tweaks and adjustments. Sometimes, when a requirement comes along for an entire new product line (a completely new kind of feature), our factory will need an entire new production line that will have to be integrated as cleanly as possible into the rest of the factory.
Refactoring, in this ToC factory metaphor, is the process of keeping the factory clean, lean, efficient, and ready to quickly build anything we have already built at least once.
Part Three: The Learning Curve: Cleaning the Messes of Junior Programmers
It takes enormous skill to pursue all of the software practices mentioned along the way here: balancing up-front design with continuous design takes lots of hard-won skill. Working test-first, writing really good tests, takes more skill than that. Simple Design and avoiding speculative design takes even more skill. Here is a hard truth: Not everyone on your team will always have enough of these skills. I mean this non-judgmentally. Many of us have been junior programmers (some of us longer than others!).
Some of your programmers will have these skills, and will help flatten the cost of decay curve for you. But some of your programmers, especially junior staff who do not know agile practices, will unintentionally and unconsciously create small messes as they code.
Second, even your very best, senior-most programmers, if they do not create such little messes as they write new code, will often be required to create such little messes as they change existing code. At the point when we change existing code to handle some unanticipated (and valid!) new business requirement, we often find that we are faced with a fundamental choice:
1. Patch in the new code in an expedient, but ugly way (for example, using a bit of code duplication), in order to meet today’s deadline (recognizing that we are now incurring code debt that we must eventually pay down);
2. Refactor the existing design so that the new code and old code can be interwoven cleanly, clearly, and extensibly. This almost always takes more time to do, and we do not always have that time in the heat of pressing deadlines.
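That choice can be sketched in a few lines of hypothetical Java (invented names, deliberately simplified): choice 1 wedges the new requirement into existing code with a flag, so every future option widens the conditional; choice 2 refactors first, so the next option becomes an isolated addition.

```java
// Choice 1, the expedient patch: a boolean flag wedged into existing code.
// It ships today, but each new shipping option widens this conditional,
// which is exactly the kind of code debt discussed above.
class ShippingExpedient {
    double cost(double weight, boolean express) {
        if (express) {
            return 10.0 + weight * 2.0;
        }
        return 5.0 + weight * 1.0;
    }
}

// Choice 2, refactor first: old and new code interweave cleanly, and the
// *next* option is a new class rather than yet another branch.
interface ShippingPolicy {
    double cost(double weight);
}

class StandardShipping implements ShippingPolicy {
    public double cost(double weight) { return 5.0 + weight * 1.0; }
}

class ExpressShipping implements ShippingPolicy {
    public double cost(double weight) { return 10.0 + weight * 2.0; }
}
```

Both versions compute the same costs today; they differ only in what tomorrow’s change will cost.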
“Hey! Again, Just Plan it and Do it Right the First Time!”
Once more I can hear someone asking, “Well, why not just build time into your estimates to do all of this right? To clean everything as you go? Why is this such a surprise to veteran programmers?”
And within the boundaries of say, a single system release, solid, veteran programmers can anticipate a lot about the requirements they will face. They can indeed bake refactoring time into their work estimates. They can keep the system darned clean, if they can keep up with the messes made by the junior staff.
But a useful system lives for many releases. If you are working on release 2, how much can you know about what will be in release 10? It turns out that statistically, you can know nearly nothing about what release 10 will contain, what business problems it will address, and what kinds of unanticipated architectural and design pressures will then be imposed on your code.
It’s like asking: what will the family who lives in your house 10 owners from now need from it? Will it be big enough for them? Small enough? Energy efficient enough? What changes will they need to make? Will the neighborhood be safe enough? There is no way for you or them (if they exist yet!) to anticipate all of that. They may indeed look at design choices the original builder and owner made and think to themselves “What were they thinking!”, but usually the original owner and builder had needs and a budget and skills that constrained what they could accomplish.
Part Four: The Broken Window Theory
If a software team keeps making choice 1 of our two choices above – preferring expedience to extensibility – they will eventually end up with a BigBallOfMud. When a team of programmers allows a system to deteriorate past the point where they all know it is no longer extensible and maintainable, the rate of deterioration begins to increase dramatically. This is code viscosity. The unconscious reasoning goes something like this: “If so and so programmer was OK doing this sloppy patch over in that part of the code to meet a deadline, then I guess it’s OK for me to do the same thing over here.”
There is a great theory discovered in the world of urban decay that describes the psychology of how small messes become large ones. For an excellent overview of the Broken Window Theory as it relates to software, see this article.
Part Five: The Gardening Metaphor
As far as it goes, the factory metaphor for software development can be useful. But as I said earlier, several thought leaders have argued passionately against it.
Software is best understood as a “touch always” medium, with no perfect analog elsewhere in the world. If you treat software like hardware, using a “touch once” mentality, then your software responds by becoming hard — in fact, too hard to work with. Both the stuff itself and your processes then become mechanistic in a way that is inherently harmful to software’s true nature, as they are equally harmful to the inherent softness of human programmers, human BAs, human PMs. Mechanistic, manufacturing, hardware metaphors, like the systems themselves, often have a reductionist, dehumanizing effect. They boil us down to less than we are, and they certainly boil software down to less than it can be.
The essence of healthy software is its softness — its malleability, its workability, its extensibility. It is clay that NEVER hardens, and must never. We are always working it, always expected to be able to transform it alchemically from whatever shape it currently inhabits to any other, arbitrary shape.
Thus the essence of consummate programming craftsmanship is the ability to work software and keep it workable. The ability to change it, and keep it receptive to change. Test-first and refactoring, together, symbiotically, are the magic formula for maintaining this near-arbitrary softness. No other combination works.
In this way, software is much more like gardening. This is the metaphor that my favorite agilists like Kent Beck prefer to use. It’s not a perfect analog, but it’s closer than manufacturing.
Software and software teams need continual, almost loving attention, like gardens. Weeds seem to spring from nowhere. As we work with software with consummate skill, we find the weeds and remove them. This helps us to see the best designs emerge — they seem to blossom organically in a way that is almost beyond us, like any good art form. As we find our groove, our flow, in the code, we can feel what is best for today’s requirements, best for the business, best for the code itself. This completely transcends mechanistic metaphors and thinking models. The best programmers need this creative expression that is perfectly aligned with the highest velocity and the highest ROI.
Bottom Line: Refactoring Manages the Cost of Decay
Let’s revisit the dreaded cost of decay curve once more, this time for a system that is continuously refactored. What does this mean? Continuous Refactoring includes the following specific practices:
- The entire team knows and uses Fowler’s Refactoring pattern language (see more below).
- Every few minutes individual programmers working test-first look for opportunities to clean up small messes.
- Whenever a programmer finds a big mess lurking in the code, they tag that code or otherwise add it to the list of messes that must be dealt with – the ledger of the system’s code debt.
- Every day the team, at daily standup, identifies any areas of the code that are beginning to become intractable.
- At each iteration, the team or its planners determine how and exactly where to deploy resources to clean up messes that threaten productivity.
- Every time a large new feature or set of features arises in requirements, the team looks at how the existing system would optimally be refactored to accept this new “axis of change,” in order to prevent the messes that are created by just hacking new features into place. This is where we “add a new production line” to “the factory” that is not yet there, without disturbing the rest of the factory.
If the team follows these steps rigorously, then the system can end up with a very different cost of decay curve, the green curve below.
Notice a couple of things about that green line. The cost of our system, and the rate of its cost increase, are both actually a bit higher at first, because we are incurring more upfront costs by writing tests as we go, and by refactoring as we go.
But notice that as we rigorously keep the system test-protected, simple, clear, and clean, our costs start to flatten out, to the extent that we can easily keep up with them: our blue resource cost line can track our cost increases straightforwardly: at no point do we have that sharp, runaway cost increase. This is just reaping the reward of a smart upfront investment, like any other – it’s like reaping the benefits of lifestyle choices like getting good exercise, eating well, and getting regular checkups. In nearly all of life, whenever we know we must eventually pay, it’s cheaper to pay now, rather than wait to pay later.
Again, the long-term consequences of a flattened cost of decay curve are huge: we are never faced with the horrible prospect of that expensive, perilous system rewrite. We don’t have to pay later.
What Continuous Refactoring is Not
- Refactoring is not Gold-Plating or Over-Engineering. Programmers sometimes are tempted to fiddle endlessly with a system, making it “cooler,” or “more elegant” than it truly needs to be. Or they sometimes introduce more abstraction layers or design patterns or whatever than today’s business requirements actually require. I am speaking as a programmer who has been guilty of these crimes in the past. All Gold-Plating and Over-Engineering is unfortunate, unnecessary, and costly. And Gold-Plating is not refactoring. It is quite the opposite of refactoring.
- Refactoring is not imprecise, code “cleanup.” Refactoring is not just twiddling with the code, “cleaning it up a little,” without a precise sense of what each problem being fixed is, why we would bother, and what the cleaned-up state will look like. This kind of code twiddling is difficult to predict or manage consistently for a single programmer, much less an entire team.
- Refactoring is not feature rework – it is not going back and writing something over because a requirement was missed or misunderstood, or the architecture was fundamentally flawed, or integrating with another system proves to be impossible, or defects that could have been discovered working Test First ended up being discovered in production. Refactoring is not “touching the product twice,” it is not a bottleneck. Rework is unfortunate, and it is something that the best agile teams can eventually grow out of, but rework is not refactoring. Refactoring helps prevent rework.
- Refactoring is not a “nice-to-have” luxury. Without Continuous Refactoring, every codebase will try desperately to become an inextensible, intractable, crazy-making BigBallOfMud, threatening you with the cost and inconvenience of rewriting the entire system from scratch. Perhaps not in release 2, or release 3, but eventually. The cost of decay curve is very real.
What Continuous Refactoring Is
Refactoring is Described Best in a Single, Important Book
Refactoring, as described in Martin Fowler’s seminal and canonical book on the subject, is a pattern language for keeping code clean, clear, and extensible. (This book is arguably one of the most influential in the world of software design, having been cited 99 times at last count by various other books and articles. It’s one of the top-selling software technical books of all time.)
Fowler’s Refactoring pattern language has two distinct, important parts:
- A set of 22 named, precisely described “code-smells.” These are specific varieties of “code debt,” several of which will eventually bring down a codebase if left unchecked. Some of the “code smells” (like Large Class, Long Method, and Inappropriate Intimacy) are much worse than others, and it takes a refactoring veteran to know which ones are worth fixing sooner, which ones are worth fixing later, and which ones don’t matter all that much.
- A set of 72 named, precisely described “refactorings,” or recipes for removing the code smells. Though some of these refactorings can only be performed manually, step-by-step, most of the important refactorings can now be performed automatically, simply by selecting them from an automated refactoring menu in integrated development environments (IDEs) like Eclipse, IntelliJ IDEA, and NetBeans, or in Visual Studio plugins like ReSharper. The cost of doing small refactorings is getting cheaper every day.
If your team does not use Fowler’s refactoring pattern language to describe and understand what is bad and good about the code’s design, then they are likely not really refactoring. They may say they are cleaning up the code, and they may say they are refactoring, but unless they can describe to you how using Extract Method helps cure the Deodorant Comment smell or the Long Method smell, then they have more learning to do.
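As a concrete, hypothetical illustration of that last point (invented names, not an example from Fowler’s book): Extract Method cures the Deodorant Comment smell by turning each explanatory comment into a method whose name makes the comment unnecessary.

```java
// Before: a method whose sections need "deodorant" comments to stay
// readable -- the comments mask the smell rather than curing it.
class StatementBefore {
    String print(String customer, double amount) {
        StringBuilder sb = new StringBuilder();
        // print the banner
        sb.append("*** Statement ***\n");
        // print the line item
        sb.append(customer).append(" owes ").append(amount).append("\n");
        return sb.toString();
    }
}

// After Extract Method: each comment has become a method name, so the
// comments themselves can simply be deleted, and print() reads as a summary.
class StatementAfter {
    String print(String customer, double amount) {
        return banner() + lineItem(customer, amount);
    }

    private String banner() {
        return "*** Statement ***\n";
    }

    private String lineItem(String customer, double amount) {
        return customer + " owes " + amount + "\n";
    }
}
```

In most modern IDEs this transformation is a single automated command, performed safely while the tests stay green.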
Refactoring Breeds Programming Skill and Courage
And More Skill and Courage Equals Faster ROI
When teams get in the habit of refactoring code, they become more skillful at changing code in general. This is important, because many teams have a pervasive fear of changing the code, and presume that changes are difficult and painful, as they have always tended to be. Such fearful teams tend to reflexively let small but dangerous problems slide until they become big problems.
But as teams get really good at Continuous Refactoring, keeping the code as extensible as it can be, they become more fearless about experimenting with optimal designs, optimal performance, code re-use, and Object Oriented work in general. As their skill grows, their systems tend to get more and more extensible and reusable. They learn faster, they learn more. They can see problems in the code faster, and they can imagine making the necessary corrections faster. They keep the software softer, and they do it cheaper and faster all the time.
The best programming teams I know are expert test-drivers, expert at refactoring, and expert at Object Oriented design. They can spot a serious problem getting ready to happen in the code faster than you can spot bald tires on your daughter’s boyfriend’s Mustang. They can test-drive and refactor three different candidate designs into and back out of place faster than an average team can laboriously code a single bad design with no tests.
Really good teams can quickly tell the difference between under-engineering and over-engineering, and they can quickly determine which code smells they can afford to fix in the current iteration, which smells must be marked down for later, and which they can afford to simply let slide indefinitely.
These higher levels of programming skill and courage pay off in myriad profound and subtle ways as codebases grow increasingly complex. But high overall refactoring skill ultimately boils down to higher overall design and programming bandwidth for the entire team – in other words, higher overall velocity, and faster and better ROI.
Where to Start?
If your team does not understand refactoring, help them learn it, and consider hiring experts to train and mentor them in this vital practice.
There are many resources in print and on-line, but the points below capture the gist of an overall refactoring strategy and policy:
- Every programmer should know how to get high enough test-coverage (or “code coverage”) rates to make refactoring safe to do. You cannot refactor if your code is insufficiently protected by tests. Use books, on-line resources, and on-site training and mentoring to get your team up to speed on unit-testing and related best practices like working Test-First, and the entire taxonomy of automated testing: isolation testing, collaboration testing, automated end-to-end testing, automated acceptance testing, and Continuous Integration.
- Every programmer should know the code smells and refactorings from Fowler’s book, mentioned above. Just as the original Object Oriented pattern language of Design Patterns (like Factory, Strategy, or Command), published more than a decade ago by the “Gang of Four,” has become a standard part of the Object Oriented software curriculum, so too has Fowler’s refactoring pattern language. And just as with Design Patterns, if you can’t speak the pattern language, then you can’t see the patterns, you can’t see the dangers, and you can’t make the fixes. It’s as simple as that. You are then in that dangerous position of not knowing what you don’t know.
- Every programmer should know how to refactor continuously, in the smallest chunks possible, as they go. The original three-part recipe from XP – write a test, write the code for the test, then refactor any little mess left behind – is still the best recipe going.
- Prefer IDEs and plugins that make automated refactoring cheap to do. Some of these tools are open-source, and get better all the time, making refactoring cheaper and cheaper each year.
- Plan for refactoring as part of each iteration’s set of tasks, or as something that occurs during a designated buffer or slack period. Presume that some amount of each programming hour, day, and week will routinely be dedicated to refactorings that keep the system as clear and clean as the schedule permits. Encourage programmers to make estimates for a requirement or task that take refactoring time into account.
- Use automated plugins to Continuous Integration systems like CruiseControl to actually measure how well or how badly factored your system is. Many code smells can easily be detected by such tools, and reported via web page and email to the entire team. I have recently been using a tool called crap4j for Java codebases; it produces a single weighted metric that describes how well you have test-protected the modules in your codebase that have the highest cyclomatic complexity. Have the team or team leadership decide on acceptable thresholds for the metrics that matter most to you, like lines of code per method, number of methods per class, cyclomatic complexity, or the number of chained delegating accessor calls in a single statement (so-called “train wrecks”). Take seriously the possibility that if it wasn’t measured, it didn’t happen. Again, don’t be in that position where you don’t know what you don’t know.
- Have lunch-and-learns to enable team-members to cheaply and routinely drill refactoring-related skills and patterns into each other’s heads, and to keep each other honest.
- Help spread the news that refactoring is not bad! Refactoring is not waste, but the systematic removal of waste as we go. It is the epitome of software professionalism, responsibility, and craftsmanship. Refactoring saves some businesses millions of dollars already, and has the potential to save billions. How much can it save you?
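The three-part XP recipe from the list above (write a test, make it pass, then refactor) can be sketched end-to-end in a few lines of Java. This is a deliberately trivial, hypothetical example; a real team would use a test framework such as JUnit rather than bare assertions.

```java
// Step 2 of the recipe: just enough code to turn the failing tests green.
// (Step 1 was writing the assertions in main() first and watching them fail
// because say() did not yet exist.)
class FizzBuzz {
    static String say(int n) {
        if (n % 15 == 0) return "FizzBuzz";
        if (n % 3 == 0) return "Fizz";
        if (n % 5 == 0) return "Buzz";
        return Integer.toString(n);
    }

    public static void main(String[] args) {
        // Step 1: the tests, written before the code above existed.
        assert say(3).equals("Fizz");
        assert say(5).equals("Buzz");
        assert say(15).equals("FizzBuzz");
        assert say(7).equals("7");
        // Step 3: with the bar green, refactor any little mess left behind
        // (rename, extract, de-duplicate), re-running the tests after each
        // tiny change to confirm behavior is unchanged.
        System.out.println("all green");
    }
}
```

Run with `java -ea FizzBuzz` so that assertions are enabled; the whole loop takes minutes, which is what keeps the messes small enough to clean up on the spot.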