Addressing Duplicates in your code
Code duplication goes against the software engineering best practice of code reusability. Some of the major disadvantages of code duplication are the following:
- It increases the number of Lines of Code (LOC), which enlarges the codebase and can affect build time and the size of the deployed software.
- Extra unit tests have to be written for each duplicate method to maintain good coverage.
- A single change has to be made in multiple files, which increases the maintenance cost and the risk of missing one of the copies.
- It reflects poorly on the quality standards of the software team.
Different types of Duplication
- Different Methods but with Identical LOC
- Same Method with Identical LOC but in different class files
- Identical set of LOC in multiple methods
- Similar LOC
This article addresses the first three categories of code duplication.
Duplication Elimination Procedure
Here are some of the common methods for removing these duplications.
Different Methods but with Identical LOC & Same Method with Identical LOC but in different class files
Example: methods m1() and m2(), which contain identical LOC.
Method-1 : Delete and Redirect approach
- Maintain one method (e.g.: m1()) which will be used throughout the software.
- Delete the contents of the other methods (e.g.: m2()) and replace them with a call to the retained method, m1().
[code]
public void m1() {
    // LOC
}

public void m2() {
    // Content removed and replaced with a call to m1()
    m1();
}

public void m3() {
    // Content removed and replaced with a call to m1()
    m1();
}
[/code]
Method-2 : Delete and Replace approach
- Maintain one method (e.g.: m1()) which will be used throughout the software.
- Delete all other identical methods (e.g.: delete m2(), m3() etc., which are all identical to m1()).
- Identify the code locations from which the deleted methods are referenced and replace those references with the retained method (e.g.: all calls to m2() and m3() must be replaced with m1()).
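The approach of deleting the duplicates and updating their callers can be sketched roughly as below. The class names Calculator and Caller, the parameters and the addition inside m1() are hypothetical placeholders chosen only to make the example runnable; they are not part of the original scenario.

```java
// Hypothetical sketch: m2() and m3() were identical copies of m1() and have been deleted.
class Calculator {
    // The single surviving method; the body stands in for the real LOC.
    int m1(int a, int b) {
        return a + b;
    }
}

class Caller {
    private final Calculator calculator = new Calculator();

    // This call site previously used calculator.m2(2, 3); it now uses m1().
    int firstUse() {
        return calculator.m1(2, 3);
    }

    // This call site previously used calculator.m3(4, 5); it now uses m1().
    int secondUse() {
        return calculator.m1(4, 5);
    }
}
```

Compared with the redirect approach, this leaves no thin wrapper methods behind, but every call site has to be found and updated, so it is easier to apply when an IDE or compiler can list all references.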
Identical set of LOC in multiple methods
The elimination procedure for this scenario is slightly more complicated than the previous ones.
1. Identify a less complex method that contains the identical code and make sure that it has unit tests with good coverage. e.g.: m1()
2. Create a new method and move all the identical LOC into it. e.g.: mn()
3. Check whether these LOC reference any parameter or attribute that was part of the parent method; if so, add it to the new method's signature. e.g.: if the LOC in mn() reference an amount parameter, define the method signature as mn(int amount)
4. Replace the LOC in the parent method with a call to the new method, passing the relevant arguments. e.g.: mn(100)
5. Run all the unit tests for the parent method (e.g.: m1()) and make sure they all pass.
6. Now apply step-4 and step-5 to the other methods sharing the same identical LOC. e.g.: if m2() and m3() also contain the LOC that were moved from m1() to mn(int amount), delete those LOC from m2() and m3() and replace them with a call to mn().
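The outcome of this procedure might look like the following minimal sketch. The class name Billing, the tax calculation inside mn() and the method-specific arithmetic in m1(), m2() and m3() are all assumptions made up for illustration; only the names m1, m2, m3 and mn(int amount) come from the steps above.

```java
// Hypothetical sketch of extracting an identical block into mn(int amount).
class Billing {
    // Extracted method: holds the LOC that were duplicated in m1(), m2() and m3().
    // The tax rule here is an invented placeholder for the shared logic.
    private int mn(int amount) {
        int tax = amount / 10;
        return amount + tax;
    }

    int m1(int amount) {
        // The duplicated block used to live here; it is now a call to mn().
        return mn(amount) + 5; // method-specific part: add a flat fee
    }

    int m2(int amount) {
        return mn(amount) * 2; // method-specific part: double the total
    }

    int m3(int amount) {
        return mn(amount) - 1; // method-specific part: apply a small discount
    }
}
```

Because amount was a parameter of the parent methods, it becomes a parameter of mn(); a field of the same class would not need to be passed, since mn() can read it directly.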
Where to create the new methods
The elimination approaches above create or retain a single shared method. Where to maintain this method depends on its nature, but here are some generic guidelines.
- If it is a common method, such as date formatting, it can be maintained in a library or utility class that all classes can use.
- If it is a method in a derived class, move it to the base class.
- If the methods are in two different classes, check the feasibility of introducing a base class. If a base class is not meaningful, consider moving the method to a utility class.
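The first two guidelines can be illustrated with a short sketch. All names here (DateUtil, Account, SavingsAccount, CurrentAccount, applyFee) and the fee arithmetic are hypothetical; the only library calls used are LocalDate and DateTimeFormatter.ISO_LOCAL_DATE from the standard java.time package.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Utility class for logic that any class may need, such as date formatting.
final class DateUtil {
    private DateUtil() { } // not meant to be instantiated

    static String format(LocalDate date) {
        return date.format(DateTimeFormatter.ISO_LOCAL_DATE);
    }
}

// Base class holding a method that was previously duplicated in both derived classes.
abstract class Account {
    protected int applyFee(int amount) {
        return amount - 2; // invented flat fee, for illustration only
    }
}

class SavingsAccount extends Account {
    int withdraw(int amount) {
        return applyFee(amount);
    }
}

class CurrentAccount extends Account {
    int withdraw(int amount) {
        return applyFee(amount) - 1; // extra charge specific to this account type
    }
}
```

A base class suits the accounts because they share a genuine "is an account" relationship; date formatting has no such relationship with its callers, so a static utility method is the more natural home for it.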
There is one more kind of code duplication – Similar LOC. Such code is not identical, but its behavior is similar. This will be addressed in a separate article.