When developing an application, inelegantly structured sections can accumulate in the source code, impairing the usability and compatibility of the program. The solution is either to write entirely new source code or to restructure the existing code in small steps. Many programmers and companies increasingly opt for code refactoring in order to optimise functioning software over the long term and make it clearer and more legible for other programmers.

During refactoring, the question arises as to which problem in the code should be solved with which method. Refactoring is now considered one of the basics when learning to code and is becoming more and more important. Which methods are used to this end, and what are their advantages and disadvantages?

What is re­fact­or­ing?

Pro­gram­ming software is a lengthy process that can involve multiple de­velopers. Written source code is often revised, changed, and expanded during this work. As a result of time pressure or outdated practices, inelegant sections can ac­cu­mu­late in the source code. These are known as code smells. These weak spots that accrue over time endanger the usability and com­pat­ib­il­ity of the program. To prevent this gradual erosion and de­teri­or­a­tion of the software, re­fact­or­ing is necessary.

In principle, refactoring is similar to editing a book. Editing does not create a completely new book, but a more understandable text. Just as editing involves various approaches such as cutting, reformulating, deleting, and restructuring, code refactoring encompasses a number of methods such as encapsulation, reformatting, and extraction in order to optimise code without changing its function.

This process is usually far more cost-effective than preparing an entirely new code structure. Refactoring plays a major role especially in iterative and incremental software development, as well as in agile software development, since programmers frequently need to alter software in these cyclical models. In this context, refactoring is a fixed step in the workflow.

When source code de­teri­or­ates: spaghetti code

First, it’s important to understand how code can age and mutate into spaghetti code. Whether due to time pressure, lack of experience, or unclear instructions, unnecessarily complicated commands creep into the code and gradually erode its functionality. The faster an area of application evolves and the more complex it is, the more quickly the code deteriorates.

Spaghetti code refers to confusing, unreadable source code that programmers can only interpret with great difficulty. Simple examples of confusing code include superfluous jump statements (GOTO) that instruct the program to skip back and forth in the source code, or unnecessary for/while loops and if statements.

Projects involving many software developers are particularly susceptible to unclear source code. When code passes through many hands and the original already contains weak points, a growing mess of “workaround solutions” can hardly be avoided, necessitating a costly code review. In severe cases, spaghetti code can jeopardise the entire development of the software. If the problem gets that far, it may even be too late for code refactoring.

Code smells and code rot are not quite so disastrous. Over time, code can start to smell, metaphorically speaking, because of all its inelegant sections. Difficult-to-understand parts become worse as other programmers intervene or add new lines of code. If refactoring is not performed at the first signs of code smell, the source code will gradually lose functionality as a result of code rot.

The aim of re­fact­or­ing

The intention behind refactoring is simply to achieve better code. Well-structured code allows new code elements to be integrated without introducing new errors. Programmers who can effortlessly read the code will be able to familiarise themselves with a developing application faster and remove or avoid bugs more easily. Another goal of refactoring is to improve error analysis and the maintainability of software. The work of programmers reviewing code is therefore simplified considerably.

What sources of errors does re­fact­or­ing solve?

The techniques applied in refactoring are as varied as the problems they’re intended to remove. Essentially, each refactoring is defined by the problem it addresses and encompasses the steps required to shorten or replace the flawed solution. Sources of errors that can be resolved with refactoring methods include:

  • Confusing or excessive code: Sequences of statements and blocks are so long that external programmers are unable to understand the internal logic of the software.
  • Code du­plic­a­tions (re­dund­an­cies): Unclear code often contains re­dund­an­cies that have to be changed sep­ar­ately at each oc­cur­rence during main­ten­ance, thereby wasting time and resources.
  • Excessive parameter lists: Instead of passing an object to a method directly, its attributes are passed individually in a long parameter list.
  • Classes with too many functions: Classes with too many functions defined as methods, also known as god objects, make adjusting the software almost impossible.
  • Classes with too few functions: Classes with so few functions defined as methods that they are un­ne­ces­sary.
  • Overly general code with special cases: Functions that cater for highly specific special cases that hardly ever occur, if at all, and therefore make necessary extensions more difficult to add.
  • Middle man: A separate class acts as a “middle man” that merely forwards calls to other classes, instead of methods calling the target class directly.
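
As a minimal sketch of the “excessive parameter lists” problem from the list above, the following Java example passes a whole object to a method instead of its individual attributes. The class `Address` and the two `formatLabel` methods are hypothetical names invented for this illustration.

```java
// Hypothetical example: all names are invented for illustration.
class ParameterListDemo {

    // Simple value class holding the attributes that belong together.
    static final class Address {
        final String street;
        final String city;
        final String postalCode;

        Address(String street, String city, String postalCode) {
            this.street = street;
            this.city = city;
            this.postalCode = postalCode;
        }
    }

    // Before: each attribute of the address is passed separately.
    static String formatLabelBefore(String street, String city, String postalCode) {
        return street + ", " + postalCode + " " + city;
    }

    // After: the object itself is passed, so the signature stays stable
    // even if Address gains further attributes later.
    static String formatLabelAfter(Address address) {
        return address.street + ", " + address.postalCode + " " + address.city;
    }

    public static void main(String[] args) {
        Address office = new Address("1 Main Street", "Springfield", "12345");
        System.out.println(formatLabelBefore(office.street, office.city, office.postalCode));
        System.out.println(formatLabelAfter(office));
    }
}
```

Both methods produce the same label; only the way the data reaches the method changes.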

What approach does re­fact­or­ing involve?

Re­fact­or­ing should always be performed before changing a program function. It ideally involves very small steps, with code changes tested using software de­vel­op­ment processes like test-driven de­vel­op­ment (TDD) and con­tinu­ous in­teg­ra­tion (CI). In a nutshell, TDD and CI refer to the con­tinu­ous testing of small, new code sections that pro­gram­mers build, integrate, and test in terms of their func­tion­al­ity – often with automated test runs.

As a rule, only change the program internally and in small steps, without affecting its external function. After each change, run the automated tests if possible.
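
One way to picture such a small step is to keep the old and the new version side by side for a moment and let an automated check confirm that they behave identically. The sketch below assumes a hypothetical price-summing method; the names and sample data are not taken from any real project.

```java
// Hypothetical example: method names and sample data are assumptions.
class SmallStepCheck {

    // Version before the refactoring step: index-based loop.
    static int totalBefore(int[] prices) {
        int total = 0;
        for (int i = 0; i < prices.length; i++) {
            total = total + prices[i];
        }
        return total;
    }

    // Version after one small internal change: enhanced for loop.
    static int totalAfter(int[] prices) {
        int total = 0;
        for (int price : prices) {
            total += price;
        }
        return total;
    }

    // Automated check: both versions must agree on every sample input.
    static boolean behaviourUnchanged(int[][] samples) {
        for (int[] sample : samples) {
            if (totalBefore(sample) != totalAfter(sample)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] samples = { {}, {5}, {1, 2, 3}, {-4, 4, 10} };
        System.out.println("Behaviour unchanged: " + behaviourUnchanged(samples));
    }
}
```

In a real project, a test framework and a CI pipeline would typically take over the role of `behaviourUnchanged`.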

What techniques exist?

A range of refactoring techniques exists. A complete overview can be found in the comprehensive book on refactoring by Martin Fowler with Kent Beck: Refactoring: Improving the Design of Existing Code. Here’s a brief summary:

Red-green de­vel­op­ment

Red-green development is a test-driven method of agile software development. It is used when a new function is to be integrated into existing code. Red stands for the first test run prior to implementing the new function in the code: the test fails because the function does not exist yet. Green stands for the simplest possible code section required for the function to pass the test. The extension is thus built up with constant test runs that filter out defective code and increase functionality. Red-green development provides a foundation for continuous refactoring in continuous software development.
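
The red and green steps can be sketched in plain Java without a test framework. The method `isValidPostalCode` and its five-digit rule are assumptions made up for this example; a real project would normally use a framework such as JUnit.

```java
// Hypothetical example: the validation rule (exactly five digits) is an assumption.
class RedGreenDemo {

    // Green step: the simplest implementation that makes the check below pass.
    static boolean isValidPostalCode(String code) {
        return code != null && code.matches("\\d{5}");
    }

    // Red step: this check is written first. It fails as long as
    // isValidPostalCode is missing or only returns a placeholder value.
    static void checkPostalCodeValidation() {
        if (!isValidPostalCode("12345")) {
            throw new AssertionError("five digits should be valid");
        }
        if (isValidPostalCode("12a45")) {
            throw new AssertionError("letters should be rejected");
        }
        if (isValidPostalCode(null)) {
            throw new AssertionError("null should be rejected");
        }
    }

    public static void main(String[] args) {
        checkPostalCodeValidation();
        System.out.println("All checks passed (green)");
    }
}
```

Once the check passes, the implementation can be restructured while the same check guards against regressions.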

Branching by ab­strac­tion

This refactoring method describes a gradual change to a system in which old implementations are converted into new ones step by step. Branching by abstraction is typically used for large applications that involve class hierarchies, inheritance, and extraction. First, an abstraction is introduced that remains linked to the old implementation. Other methods and classes are then linked to the abstraction instead of the old code, so that the old implementation can be replaced behind the abstraction.

This often occurs via pull-up or push-down methods. They link the new, better function to the abstraction and transfer the existing links to it. In doing so, they either move members of a sub-class up into a higher class (pull-up) or move members of a higher class down into the sub-classes that actually use them (push-down).

You can then delete the old functions without en­dan­ger­ing the overall func­tion­al­ity. With these small changes, the system works unchanged while you gradually replace inelegant code with neat code, section by section.
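
A rough sketch of the idea, assuming a hypothetical formatter for office postal codes: callers depend only on an interface, so the old implementation can stay in place until the new one is ready and is then swapped out in a single location. All class and method names are invented for this illustration.

```java
// Hypothetical example: interface and class names are invented for illustration.
class BranchByAbstractionDemo {

    // Step 1: the abstraction that all callers depend on.
    interface PostalCodeFormatter {
        String format(String postalCode, String officeNumber);
    }

    // The old implementation, now reachable only through the abstraction.
    static class LegacyFormatter implements PostalCodeFormatter {
        public String format(String postalCode, String officeNumber) {
            return postalCode + "/" + officeNumber;
        }
    }

    // Step 2: the new implementation, built alongside the old one.
    static class JoiningFormatter implements PostalCodeFormatter {
        public String format(String postalCode, String officeNumber) {
            return String.join("/", postalCode, officeNumber);
        }
    }

    // Callers only know the abstraction, so switching implementations
    // does not require any change here.
    static String officeLabel(PostalCodeFormatter formatter) {
        return "Office " + formatter.format("12345", "7");
    }

    public static void main(String[] args) {
        System.out.println(officeLabel(new LegacyFormatter()));
        // Step 3: swap in the new implementation; LegacyFormatter can be deleted afterwards.
        System.out.println(officeLabel(new JoiningFormatter()));
    }
}
```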

Composing methods

Refactoring is intended to make methods as legible as possible. Ideally, external programmers should be able to grasp the internal logic of a method when reading the code. There are a number of different techniques for composing methods efficiently. The aim of each change is to harmonise methods, remove redundancies, and split excessively long methods into separate sections, thereby opening them up to future changes.

Such tech­niques include:

  • Method ex­trac­tion
  • Method inlining
  • Removing temporary variables
  • Replacing temporary variables with a query method
  • In­tro­du­cing de­script­ive variables
  • Sep­ar­at­ing temporary variables
  • Removing as­sign­ments to parameter variables
  • Replacing a method with a method object
  • Replacing an algorithm
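
Method extraction, the first technique in the list above, can be sketched as follows. The invoice calculation, the 20 percent tax rate, and all method names are assumptions chosen for this example; amounts are whole numbers to keep the arithmetic exact.

```java
// Hypothetical example: the tax rate and all names are assumptions.
class ExtractMethodDemo {

    static final int TAX_PERCENT = 20;

    // Before: one method sums, calculates tax, and formats the result.
    static String invoiceBefore(int[] prices) {
        int total = 0;
        for (int price : prices) {
            total += price;
        }
        int tax = total * TAX_PERCENT / 100;
        return "Total: " + (total + tax);
    }

    // After: each step lives in a small method with a descriptive name.
    static int netTotal(int[] prices) {
        int total = 0;
        for (int price : prices) {
            total += price;
        }
        return total;
    }

    static int tax(int net) {
        return net * TAX_PERCENT / 100;
    }

    static String invoiceAfter(int[] prices) {
        int net = netTotal(prices);
        return "Total: " + (net + tax(net));
    }

    public static void main(String[] args) {
        int[] prices = {100, 200};
        System.out.println(invoiceBefore(prices));
        System.out.println(invoiceAfter(prices));
    }
}
```

The extracted methods `netTotal` and `tax` can now be reused and tested on their own.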

Moving at­trib­utes between classes

To improve code, you sometimes need to move attributes or methods between classes. Here, the following techniques are used:

  • Move method
  • Move attribute
  • Extract class
  • Inline class
  • Hide delegate
  • Remove middle man
  • Introduce foreign method
  • Introduce local extension
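
As an illustration of “hide delegate” from the list above, the sketch below uses two hypothetical classes, `Employee` and `Department`. Instead of reaching through the employee to the department, clients ask the employee directly.

```java
// Hypothetical example: Employee and Department are invented for illustration.
class HideDelegateDemo {

    static class Department {
        private final String manager;

        Department(String manager) {
            this.manager = manager;
        }

        String getManager() {
            return manager;
        }
    }

    static class Employee {
        private final Department department;

        Employee(Department department) {
            this.department = department;
        }

        // Kept here only to show the old call chain in main().
        Department getDepartment() {
            return department;
        }

        // After: the employee answers the question itself, so clients
        // no longer need to know that a Department class exists.
        String getManager() {
            return department.getManager();
        }
    }

    public static void main(String[] args) {
        Employee employee = new Employee(new Department("Ada"));
        // Before: the client navigates through the delegate.
        System.out.println(employee.getDepartment().getManager());
        // After: the delegate is hidden behind the employee.
        System.out.println(employee.getManager());
    }
}
```

If a class ends up doing nothing but forwarding calls like this, the opposite technique, removing the middle man, may be the better choice.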

Data or­gan­isa­tion

This group of techniques aims to divide data into classes and keep them as neat and clear as possible. You should remove unnecessary links between classes, which can impair the software functionality even in the event of minor changes, and divide the data into coherent classes.

Examples of tech­niques include:

  • Self-encapsulating attributes
  • Replacing a data value with an object
  • Replacing a value with a reference
  • Replacing a reference with a value
  • Duplicating observed data
  • En­cap­su­lat­ing at­trib­utes
  • Replacing a record with a data class
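
Encapsulating attributes, one of the techniques listed above, might look like the following sketch. The `Office` classes are hypothetical; the point is only that all access to the attribute goes through one place after the change.

```java
// Hypothetical example: the Office classes are invented for illustration.
class EncapsulateFieldDemo {

    // Before: any code can read and overwrite the attribute directly.
    static class OfficeBefore {
        public String postalCode;
    }

    // After: the attribute is private and accessed through methods,
    // so later changes to its representation stay inside this class.
    static class OfficeAfter {
        private String postalCode;

        OfficeAfter(String postalCode) {
            this.postalCode = postalCode;
        }

        String getPostalCode() {
            return postalCode;
        }

        void setPostalCode(String postalCode) {
            this.postalCode = postalCode;
        }
    }

    public static void main(String[] args) {
        OfficeBefore before = new OfficeBefore();
        before.postalCode = "12345";
        System.out.println(before.postalCode);

        OfficeAfter after = new OfficeAfter("12345");
        after.setPostalCode("54321");
        System.out.println(after.getPostalCode());
    }
}
```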

Sim­pli­fy­ing con­di­tion­al ex­pres­sions

While re­fact­or­ing, you should simplify con­di­tion­al ex­pres­sions as far as possible. The following tech­niques can be applied to this end:

  • Decomposing conditions
  • Merging con­di­tion­al ex­pres­sions
  • Merging repeated in­struc­tions in con­di­tion­al ex­pres­sions
  • Removing control flags
  • Replacing nested conditions with guard clauses
  • Replacing case dis­tinc­tions with poly­morph­ism
  • Introducing null objects
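
To illustrate replacing nested conditions with guard clauses, the sketch below uses a made-up pay calculation. The rules (no pay when retired, half pay when on leave) and the method names are assumptions for this example only.

```java
// Hypothetical example: the pay rules are assumptions for illustration.
class GuardClauseDemo {

    // Before: the normal case is buried inside nested if/else blocks.
    static int payNested(boolean retired, boolean onLeave, int salary) {
        int result;
        if (retired) {
            result = 0;
        } else {
            if (onLeave) {
                result = salary / 2;
            } else {
                result = salary;
            }
        }
        return result;
    }

    // After: special cases return early, the normal case stands on its own.
    static int payGuarded(boolean retired, boolean onLeave, int salary) {
        if (retired) {
            return 0;
        }
        if (onLeave) {
            return salary / 2;
        }
        return salary;
    }

    public static void main(String[] args) {
        System.out.println(payNested(false, true, 1000));
        System.out.println(payGuarded(false, true, 1000));
    }
}
```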

Simplifying method calls

Method calls can be made simpler and easier to understand using the following techniques, for example:

  • Renaming methods
  • Adding para­met­ers
  • Removing para­met­ers
  • Replacing para­met­ers with explicit methods
  • Replacing error codes with ex­cep­tions
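
Replacing error codes with exceptions, the last technique in the list above, can be sketched like this. The `withdraw` methods and the use of -1 as an error code are assumptions invented for the example.

```java
// Hypothetical example: the withdraw methods and the -1 error code are assumptions.
class ErrorCodeDemo {

    // Before: a magic return value signals failure and is easy to overlook.
    static int withdrawBefore(int balance, int amount) {
        if (amount > balance) {
            return -1;
        }
        return balance - amount;
    }

    // After: failure is reported through an exception, so the return
    // value always means "new balance".
    static int withdrawAfter(int balance, int amount) {
        if (amount > balance) {
            throw new IllegalArgumentException("amount exceeds balance");
        }
        return balance - amount;
    }

    public static void main(String[] args) {
        System.out.println(withdrawBefore(10, 30));
        try {
            withdrawAfter(10, 30);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        System.out.println(withdrawAfter(100, 30));
    }
}
```

Note that this change alters the method's contract for callers, so all call sites need to be adjusted in the same step.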

Re­fact­or­ing example: renaming methods

In the following example, the method name in the original code does not make its purpose clear. The method is intended to return the postal code for an office address, but its name doesn’t indicate this. To formulate the code more clearly, it’s a good idea to rename the method in the process of code refactoring.

Before:

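// The name does not reveal that this is the office's postal code.
String getPostalCode() {
	return (theOfficePostalCode + "/" + theOfficeNumber);
}
System.out.print(getPostalCode());

After:

// The new name states which postal code is returned.
String getOfficePostalCode() {
	return (theOfficePostalCode + "/" + theOfficeNumber);
}
System.out.print(getOfficePostalCode());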

Re­fact­or­ing: ad­vant­ages and dis­ad­vant­ages

Advantages:

  • Better comprehensibility facilitates maintenance and the extendibility of the software
  • Restructuring the source code is possible without altering the functionality
  • Improved legibility improves the comprehensibility of the code for other programmers
  • Removed redundancies and duplications improve the effectiveness of the code
  • Self-contained methods prevent local changes from having an effect on other parts of the code
  • Clean code with shorter, self-contained methods and classes is characterised by better testability

Disadvantages:

  • Imprecise refactoring could introduce new bugs and errors into the code
  • There is no clear definition of “neat code”
  • An improved code is often difficult for the customer to recognise, since the functionality stays the same, i.e. the benefit is not self-evident
  • In the case of larger teams working on refactoring, the coordination effort required could be surprisingly high

In general, keep refactoring and feature work separate. Introduce new functions only while leaving the structure of the existing source code unchanged, and alter the existing source code, i.e. carry out refactoring, only when you are not adding any new functions.
