CleanCodeMastery

Duplicate Code: Writing the Same Address on 50 Wedding Cards

Learn the Duplicate Code smell with a wedding card story. Understand DRY, the Rule of Three, and how Extract Method removes dangerous copy-paste code.

20 min read · Updated June 11, 2026 · beginner
Tags: code-smells, dispensables, duplicate-code, dry, rule-of-three, refactoring, typescript, csharp

💌 Fifty wedding cards, one aching hand

Big news in the Sharma family of Jaipur: Anjali didi is getting married! The whole house smells of laddoos. Relatives are calling from three states. And the wedding cards have just arrived from the printer, in a big cardboard box tied with golden thread. They look beautiful: cream paper, red lettering, a little peacock embossed in the corner.

But there is one problem. Papa opens a card, reads it twice, and his face falls. The printer forgot to print the venue address. Fifty cards, and not one of them says where the wedding is.

There is no time to reprint. So Anjali's younger brother Kabir, fourteen years old and famous for his neat handwriting, is given the job. "Write the address on every card," says Papa. "All fifty of them. Neatly."

So Kabir sits down at the dining table with a blue pen. "Shubham Garden, Plot 14, Tonk Road, Jaipur - 302018." He writes it once. Beautiful. Twice. Still good. Ten times. His hand starts aching. By card twenty, his little sister is watching cartoons and he wants to join her. By card thirty, his writing is getting lazy and slanted. On card thirty-four, he writes "Plot 41" instead of "Plot 14". On card forty-two, he forgets the PIN code entirely. He does not notice either mistake. Nobody checks fifty cards one by one. Who has the time?

Figure 1: Kabir's evening, card by card (quality falls as copies grow)

Two weeks later, Sharma uncle's family reaches Plot 41 (an empty plot with one sleeping dog) in their full wedding clothes. They are very annoyed. They phone Papa from the empty plot. Papa apologises eleven times.

And then the worst news arrives: the venue changes. Shubham Garden got double-booked, and the wedding moves to "Mangal Vatika, Ajmer Road". Now somebody must find all fifty cards (half of them already posted!) and correct every single one by hand. It is impossible. The family ends up calling every guest one by one, and even then, two families show up at the old venue.

Now compare this with what the printer should have done in the first place: keep the address in one place, the printing plate, and stamp all fifty cards from it. One correction on the plate, and all fifty cards are correct. No aching hand, no Plot 41, no missing PIN code, no calling sixty relatives.

This is exactly the Duplicate Code smell. Writing the same thing by hand in many places feels simple, but every copy is a chance to make a mistake, and every future change must hunt down every copy.

🤔 What is this smell?

Duplicate Code is the same idea expressed in more than one place in a program. It may be an exact copy-paste, or two blocks that look slightly different but carry the same rule. Whenever that rule changes, every copy must be found and changed, correctly, or the copies start to disagree.

Duplicate Code is famous. It is the very first smell described in the first edition of Martin Fowler's book Refactoring, in the smells chapter he wrote with Kent Beck. They put it first for a reason: it is among the most common smells and one of the most damaging.

Its cure is connected to a famous principle from the book The Pragmatic Programmer by Andy Hunt and Dave Thomas: DRY, short for Don't Repeat Yourself. DRY says that every piece of knowledge in a system should have a single, authoritative home. The venue address is one piece of knowledge. It should live on one printing plate, not in fifty handwritten copies.

💡

Read DRY carefully: it talks about knowledge, not text. Two blocks of code that look identical but represent different business rules are not true duplication; they only look alike by accident today. True duplication is one rule living in many places. Merge rules, not lookalikes.
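To make "knowledge, not text" concrete, here is a small sketch. The age limits and function names are invented for illustration and are not from any real system. Both checks have the same shape today, but each answers to a different rule-maker, so they are lookalikes rather than duplicates.

```typescript
// Hypothetical limits: both happen to be 18 today, for unrelated reasons.
const MIN_VOTING_AGE = 18;
const MIN_DRIVING_AGE = 18;

// Same shape, different knowledge: election rules own this one...
function canVote(age: number): boolean {
  return age >= MIN_VOTING_AGE;
}

// ...and transport rules own this one.
function canDrive(age: number): boolean {
  return age >= MIN_DRIVING_AGE;
}

// A tempting merge that couples two strangers: if the driving age
// changes, every voting check that calls this changes with it.
function isOldEnough(age: number): boolean {
  return age >= 18;
}
```

Keeping `canVote` and `canDrive` apart costs two short functions. Merging them into `isOldEnough` saves a line today and ties two unrelated rules together tomorrow.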

College corner: Research on code clones gives this smell a formal vocabulary. A Type-1 clone is an exact copy (whitespace aside). Type-2 renames identifiers but keeps structure. Type-3 adds or deletes a few statements. Type-4 is semantically equivalent code with different syntax: the same job done a different way. Clone-detection tools (PMD CPD, SonarQube, jscpd, Simian) are reliable on Types 1–2, weaker on Type 3, and nearly blind to Type 4. That asymmetry matters: the clones that tools cannot find are exactly the ones only a human reader who understands the meaning can catch. Studies on industrial codebases routinely measure 5–20% cloned code, and find that inconsistent edits to clone groups are a significant source of defects.
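The clone types are easier to remember with one tiny function and its relatives side by side. This is only an illustrative sketch; the function names are made up, and real clones are usually much longer.

```typescript
// The original: total price of a cart.
function cartTotal(prices: number[]): number {
  let total = 0;
  for (const price of prices) {
    total += price;
  }
  return total;
}

// Type-2 clone: same structure, only the identifiers were renamed.
function invoiceSum(amounts: number[]): number {
  let sum = 0;
  for (const amount of amounts) {
    sum += amount;
  }
  return sum;
}

// Type-3 clone: same skeleton with one statement added.
function refundTotal(refunds: number[]): number {
  let total = 0;
  for (const refund of refunds) {
    if (refund <= 0) continue; // the added statement
    total += refund;
  }
  return total;
}

// Type-4 clone: the same job as cartTotal, written a different way.
function orderTotal(prices: number[]): number {
  return prices.reduce((acc, price) => acc + price, 0);
}
```

A Type-1 clone would simply be `cartTotal` pasted again unchanged. Notice that `orderTotal` shares almost no text with `cartTotal`, which is why text-based tools tend to miss it.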

Here is the whole territory in one map:

Figure 2: The full map of the Duplicate Code smell

🔍 How to spot it

Run through this checklist on any codebase:

  • Two methods that look almost the same, differing only in one number, one type, or one called method.
  • The same sequence of statements appearing in several subclasses of the same parent.
  • A bug that was "fixed" but appears again somewhere else, because a copy of the buggy code was never fixed.
  • A team habit of "I copied the existing handler and changed two lines" for every new feature.
  • The same constant, regex pattern, or formula typed by hand in multiple files.
  • Parallel if/switch ladders in different files that all list the same set of cases.
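The last checklist item is the hardest to picture, so here is a rough sketch of it. The channel names, greetings, and labels are assumptions made up for this example. Two ladders list the same cases, and a new case must be added to both by hand.

```typescript
// Hypothetical channel data, for illustration only.
type Channel = "card" | "whatsapp" | "email";

// Smelly: two ladders that list the same cases and must be kept in step.
function greetingFor(channel: Channel): string {
  switch (channel) {
    case "card": return "Dear";
    case "whatsapp": return "Namaste";
    case "email": return "Dear";
  }
}

function labelFor(channel: Channel): string {
  switch (channel) {
    case "card": return "Printed card";
    case "whatsapp": return "WhatsApp";
    case "email": return "Email";
  }
}

// One possible cure: a single table where each case appears once.
const CHANNELS: Record<Channel, { greeting: string; label: string }> = {
  card: { greeting: "Dear", label: "Printed card" },
  whatsapp: { greeting: "Namaste", label: "WhatsApp" },
  email: { greeting: "Dear", label: "Email" },
};
```

With the table, everything known about a channel sits on one row, so adding a channel is one new row instead of one new case in every ladder.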

Here is a quick table of duplication types, from easiest to hardest to see:

Type | What it looks like | How visible? | Example
Exact copy | Same lines, character by character | Easy (tools catch it) | Copy-pasted validation block
Copy with renamed variables | Same logic, different names | Medium | total/sum, cust/customer
Same steps, different details | Same skeleton, one step differs | Hard | Domestic vs international billing
Same job, different algorithm | Two ways to compute one answer | Very hard | Loop in one file, formula in another
Knowledge duplication | One rule in code AND config AND docs | Hardest | GST rate in three layers

The lower rows are the dangerous ones: no tool can fully catch them. Only a reader who understands the meaning can say, "Wait, these two blocks are the same rule."

When teams audit where their duplication came from, the sources usually split like this:

Figure 3: Where duplicate code typically comes from in a real team

⚠️ Why it is a problem

Problem 1: Every change is multiplied. A rule in one place is changed once. A rule copied five times must be found five times and edited five times. Miss one, and your program now follows two different rules at the same time, like fifty cards showing two different venues.

Problem 2: Copies disagree silently. Nobody announces "copy three is now different!" The mismatch hides until a customer hits it. Remember Sharma uncle at Plot 41: he discovered the bug in production, in his wedding clothes.

Problem 3: Bugs resurrect. You fix a bug in one copy and close the ticket. Months later the same bug walks in again through another copy. The team thinks the fix "did not work". Trust drops. Watch the resurrection happen:

Figure 4: A bug fixed in one copy comes back through another copy
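In code, the resurrection can look like the sketch below. The function names and the blank-name bug are hypothetical, chosen only to show the pattern: one copy received the fix, the pasted twin did not.

```typescript
// Copy 1: the bug was reported here and fixed, so blank names are rejected.
function isValidGuestNameOnSignup(name: string): boolean {
  return name.trim().length > 0;
}

// Copy 2: pasted into another module before the fix and never updated.
// A name made only of spaces still passes.
function isValidGuestNameOnImport(name: string): boolean {
  return name.length > 0;
}

const blankName = "   ";
console.log(isValidGuestNameOnSignup(blankName)); // false
console.log(isValidGuestNameOnImport(blankName)); // true: the "fixed" bug is back
```

The ticket was closed against the first function. Nothing in the codebase records that the second function carries the same rule, so nobody thought to fix it.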

Problem 4: The design becomes invisible. When one concept is named once and reused, the design announces itself: "this is the subtotal rule." When it is smeared across copies, every reader must rediscover that these blocks are the same thing. That is wasted brainpower, every day, for every reader.

Problem 5: The cost grows with every copy. The pain is not linear: with more copies you spend time finding them, editing them, testing them, and double-checking you did not miss one:

Figure 5: Minutes needed to apply one rule change as copies multiply

And here is the slow life story of a single pasted block. Notice that drift never knocks on the door; it just happens:

Figure 6: The life of a pasted block (copies drift apart silently)
Figure 7: One copy-pasted bug outliving its own bug fix

💻 A real-life code example

Let us put the wedding card story into code. The family hires an event app to print and send invitations. A junior developer wrote it, by copy-paste, of course.

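// Guest shape assumed for this example (all fields are plain strings)
interface Guest {
  title: string;
  firstName: string;
  lastName: string;
}

// Smelly version: the "address rule" is hand-written in three places
class InvitationService {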
  printCard(guest: Guest): string {
    const name = guest.title + " " + guest.firstName + " " + guest.lastName;
    return (
      "Dear " + name + ",\n" +
      "You are invited!\n" +
      "Venue: Shubham Garden, Plot 14, Tonk Road, Jaipur - 302018"
    );
  }
 
  sendWhatsApp(guest: Guest): string {
    const name = guest.title + " " + guest.firstName + " " + guest.lastName;
    return (
      "Namaste " + name + "! Wedding invitation: " +
      "Venue: Shubham Garden, Plot 14, Tonk Rd, Jaipur 302018"
    );
  }
 
  sendEmail(guest: Guest): string {
    const name = guest.title + " " + guest.firstName + " " + guest.lastName;
    return (
      "Dear " + name + ", you are cordially invited. " +
      "Venue: Shubham Gardens, Plot 14, Tonk Road, Jaipur - 302018"
    );
  }
}

Look closely. The smell is everywhere:

  1. The guest name formula (title + first + last) is written three times. If the family decides to add "ji" after every name, that is three edits.
  2. The venue address is written three times, and the copies already disagree! The WhatsApp version says "Tonk Rd", the email says "Shubham Gardens". The copies have drifted, exactly like Kabir's tired handwriting on card thirty-four.
  3. When the venue changes to Mangal Vatika, someone must find and fix all three, plus any fourth copy hiding in some other file.

🧹 Cleaning it up, step by step

Step 1: Find the knowledge. Ask: what pieces of knowledge are repeated here? Two of them: "how to write a guest's full name" and "what the venue address is".

Step 2: Give each piece one home. Use Extract Method for the name formula, and a single constant for the address. This is the printing plate.

Step 3: Make every caller use the one home. Replace each handwritten copy with a call.

// Clean version: one printing plate, many stamps
const VENUE_ADDRESS = "Shubham Garden, Plot 14, Tonk Road, Jaipur - 302018";
 
class InvitationService {
  printCard(guest: Guest): string {
    return `Dear ${this.fullName(guest)},\nYou are invited!\nVenue: ${VENUE_ADDRESS}`;
  }
 
  sendWhatsApp(guest: Guest): string {
    return `Namaste ${this.fullName(guest)}! Wedding invitation: Venue: ${VENUE_ADDRESS}`;
  }
 
  sendEmail(guest: Guest): string {
    return `Dear ${this.fullName(guest)}, you are cordially invited. Venue: ${VENUE_ADDRESS}`;
  }
 
  private fullName(guest: Guest): string {
    return `${guest.title} ${guest.firstName} ${guest.lastName}`;
  }
}

Now the venue change is one edit. The name rule is one edit. The copies physically cannot drift apart, because there are no copies, only one plate and many stamps. The cleaned structure looks like this:

Figure 8: After refactoring, three senders share one name rule and one address

Step 4: Choose the right tool for harder duplication. Our example was inside one class, so Extract Method was enough. But duplication lives in other places too, and each location has its own cure:

  • Identical methods sitting in sibling subclasses? Move the method up to the parent with Pull Up Method.
  • Subclass methods with the same steps but different details? Keep the skeleton in the parent and let children fill in the differing steps with Form Template Method.
  • Copies scattered across unrelated classes that secretly share a concept? Give that concept its own home with Extract Class.
  • Two blocks doing the same job in different ways? Pick the clearer way and replace both using Substitute Algorithm.
Figure 9: Choosing the right cure based on where the duplication lives
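Of the four cures above, Form Template Method is the least obvious from its name, so here is a minimal sketch of it. The class names and the shipping rates (5% and 18%) are illustrative assumptions. The parent keeps the shared skeleton; each child supplies only the step that differs.

```typescript
interface Item {
  price: number;
  quantity: number;
}

abstract class BillCalculator {
  // The shared skeleton lives once, in the parent.
  total(items: Item[]): number {
    const subtotal = items.reduce((sum, i) => sum + i.price * i.quantity, 0);
    return subtotal + subtotal * this.shippingRate();
  }

  // The one step that differs is left to the children.
  protected abstract shippingRate(): number;
}

class DomesticBill extends BillCalculator {
  protected shippingRate(): number {
    return 0.05; // assumed domestic rate
  }
}

class InternationalBill extends BillCalculator {
  protected shippingRate(): number {
    return 0.18; // assumed international rate
  }
}
```

When the only difference is a single number, a plain parameter is usually simpler than subclasses. Form Template Method earns its place when the differing step is real behaviour, not just a value.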

🟦 The same smell in C#

Two billing methods that are twins, except for one number:

// Before: the subtotal loop is duplicated; only the rate differs
public decimal DomesticTotal(List<Item> items)
{
    decimal subtotal = 0;
    foreach (var i in items) subtotal += i.Price * i.Quantity;
    return subtotal + subtotal * 0.05m;   // domestic shipping
}
 
public decimal InternationalTotal(List<Item> items)
{
    decimal subtotal = 0;
    foreach (var i in items) subtotal += i.Price * i.Quantity;
    return subtotal + subtotal * 0.18m;   // international shipping
}

Extract the shared shape; let the difference become a parameter:

// After: one definition of subtotal, one definition of shipping
// (items.Sum needs `using System.Linq;` at the top of the file)
public decimal DomesticTotal(List<Item> items) => TotalWithShipping(items, 0.05m);
public decimal InternationalTotal(List<Item> items) => TotalWithShipping(items, 0.18m);
 
private static decimal TotalWithShipping(List<Item> items, decimal shippingRate)
{
    var subtotal = items.Sum(i => i.Price * i.Quantity);
    return subtotal + subtotal * shippingRate;
}

Now "how a subtotal is computed" exists exactly once. If tomorrow the business says "round each line total to two decimal places", that is one edit, and domestic and international can never disagree about it.

A Python taste of the same medicine, because copy-paste speaks every language:

# Before: the same "clean phone number" rule, typed twice
# (db and sms are assumed to be existing storage and messaging clients)
def save_guest(name, phone):
    phone = phone.replace(" ", "").replace("-", "")[-10:]
    db.guests.insert(name, phone)
 
def send_invite_sms(phone, text):
    phone = phone.replace(" ", "").replace("-", "")[-10:]
    sms.send(phone, text)
 
# After: one rule, one home
def normalize_phone(phone: str) -> str:
    return phone.replace(" ", "").replace("-", "")[-10:]
 
def save_guest(name, phone):
    db.guests.insert(name, normalize_phone(phone))
 
def send_invite_sms(phone, text):
    sms.send(normalize_phone(phone), text)

🏢 Where this smell hides in real projects

  • Validation rules copied between frontend and backend. The email regex lives in the React form and in the API controller, slightly different in each. Users get "valid" on screen and "invalid" from the server.
  • Copy-paste-driven feature development. "Make the new report? Just copy the old report handler and adjust." After ten reports, a bug in the shared logic needs ten fixes.
  • The same formula in code and in SQL. Discount computed in the application and re-computed inside a database view. They drift; finance notices at year-end.
  • Test code duplication. Twenty tests each building the same five-line test order by hand. One constructor change breaks all twenty.
  • Cross-team duplication. Two teams each write their own "retry helper" in the same month because neither knew the other existed. Code search and shared libraries are the medicine.
  • AI-generated code. Code assistants happily generate a fresh copy of logic instead of finding the existing helper. Review generated code for duplication just like human code.
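For the test-code case in the list above, one common cure is a small builder function. The sketch below assumes a made-up `Order` shape and default values; it is one way to do it, not the only one. Each test states only the fields it cares about.

```typescript
// Hypothetical order shape, for illustration only.
interface Order {
  id: string;
  customer: string;
  items: { name: string; price: number; quantity: number }[];
  paid: boolean;
}

// One builder holds the default test order; tests override single fields.
function makeTestOrder(overrides: Partial<Order> = {}): Order {
  return {
    id: "order-1",
    customer: "Test Customer",
    items: [{ name: "Laddoo box", price: 250, quantity: 2 }],
    paid: false,
    ...overrides,
  };
}

// Each test says only what is special about its order.
const paidOrder = makeTestOrder({ paid: true });
const emptyOrder = makeTestOrder({ items: [] });
```

If `Order` gains a new required field, only `makeTestOrder` needs to learn about it, not twenty hand-built orders.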

College corner: There is a deeper systems argument here, rooted in The Pragmatic Programmer: DRY violations break what is often called the "single source of truth" property, and the failure mode is representational drift, where two representations of one fact evolve independently. This is the same root problem as cache invalidation, denormalized databases, and documentation rot. Whenever you intentionally duplicate knowledge (for performance, for decoupling deployments, for offline copies), you must also build a synchronization mechanism: code generation from one schema, contract tests between frontend and backend, or a build step that derives one copy from the other. Duplication without synchronization is a time bomb; duplication with synchronization is an engineering decision.
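"Deriving one copy from the other" can be very small in practice. In this sketch the RSVP statuses are an invented example. The list is written once, and both the compile-time type and the runtime check are derived from it, so they cannot drift apart.

```typescript
// The single source: the list of allowed RSVP answers (hypothetical).
const RSVP_STATUSES = ["yes", "no", "maybe"] as const;

// Derived copy 1: the compile-time type comes from the list.
type RsvpStatus = (typeof RSVP_STATUSES)[number];

// Derived copy 2: the runtime check comes from the same list.
function isRsvpStatus(value: string): value is RsvpStatus {
  return (RSVP_STATUSES as readonly string[]).includes(value);
}
```

Adding a fourth status means editing the array only; the type and the check follow along without being touched.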

⚖️ When it is okay to ignore

Here is the honest part. Not every repetition should be merged, and merging too early causes a different disease: the wrong abstraction.

Situation | Merge the copies? | Why
Same business rule, copied | Yes | One rule must have one home
Looks similar, but changes for different reasons | No | Incidental duplication; merging couples strangers
Second occurrence, shape still unclear | Wait | Rule of Three: refactor at the third copy
Two-line fragment used in one or two places | Usually no | A tiny helper adds indirection, removes little
Tests that repeat for readability | Often no | A test should be readable alone, on one screen
Same constant in many files | Yes | Constants are cheap to centralize, drift is costly

Two famous guidelines help you judge:

  1. The Rule of Three (popularized in Fowler's Refactoring, credited to Don Roberts): Write it once. Copy it once, and just wince. When you need it a third time, refactor. By then you have three real examples, so you can see the true shared shape instead of guessing it.
  2. Sandi Metz's warning: "Duplication is far cheaper than the wrong abstraction." If you merge two blocks that were only accidentally similar, you must later thread flags and parameters through the shared code to pull them apart again. That tangled "shared" code is worse than the honest copies were.
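Here is roughly what "threading flags through shared code" looks like. The greeting functions and their flags are invented for this sketch. The merged version started as a tidy helper and collected one boolean for each difference between its callers.

```typescript
// A premature merge of two lookalike greetings. Each new difference
// between the callers arrived as another flag.
function greetingMerged(
  name: string,
  isFormal: boolean,
  addJi: boolean,
  addExclamation: boolean
): string {
  let text = isFormal ? `Dear ${name}` : `Namaste ${name}`;
  if (addJi) text += " ji";
  if (addExclamation) text += "!";
  return text;
}

// The honest copies: each can change without touching the other.
function cardGreeting(name: string): string {
  return `Dear ${name}`;
}

function whatsAppGreeting(name: string): string {
  return `Namaste ${name} ji!`;
}
```

A call like `greetingMerged("Anjali", false, true, true)` tells the reader nothing without opening the function. The two small functions say what they do in their names.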

You can place any suspicious pair of code blocks on this chart and read off the decision:

Figure 10: Should these two similar blocks be merged?
⚠️

Before merging two similar blocks, ask one question: "If the business changes one of these, must the other change too?" If yes, it is the same knowledge, so merge them. If no, they are strangers who happen to dress alike today. Let them stay separate, and do not feel guilty about it.

🛠️ Which refactorings cure it

Where the duplication lives | Curing refactoring
Inside one class | Extract Method
Identical methods in sibling subclasses | Pull Up Method
Same steps, different details, in subclasses | Form Template Method
Scattered across unrelated classes | Extract Class
Same job done two different ways | Substitute Algorithm
Long methods grown from pasted blocks | Extract Method + Consolidate Duplicate Conditional Fragments
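Extract Class is worth one small sketch, because the "shared concept" is often hiding in plain sight. The example below reuses the phone-cleaning rule from the Python snippet earlier (drop spaces and hyphens, keep the last 10 characters); the class names are assumptions made up for illustration.

```typescript
// The extracted concept: a phone number knows how to normalize itself.
class PhoneNumber {
  readonly digits: string;

  constructor(raw: string) {
    // Drop spaces and hyphens, then keep the last 10 characters.
    this.digits = raw.replace(/[ -]/g, "").slice(-10);
  }

  equals(other: PhoneNumber): boolean {
    return this.digits === other.digits;
  }
}

// Two unrelated classes now share the concept instead of re-typing the rule.
class GuestList {
  private phones: PhoneNumber[] = [];

  add(rawPhone: string): void {
    this.phones.push(new PhoneNumber(rawPhone));
  }

  has(rawPhone: string): boolean {
    const target = new PhoneNumber(rawPhone);
    return this.phones.some((p) => p.equals(target));
  }
}

class SmsSender {
  format(rawPhone: string, text: string): string {
    return `${new PhoneNumber(rawPhone).digits}: ${text}`;
  }
}
```

Neither `GuestList` nor `SmsSender` knows how a phone number is cleaned. If the rule changes, only `PhoneNumber` changes.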

📦 Quick revision box

+--------------------------------------------------------------+
|  DUPLICATE CODE - QUICK REVISION                             |
+--------------------------------------------------------------+
|  Story   : Hand-writing one address on 50 wedding cards      |
|            instead of printing from one plate.               |
|  Smell   : The same KNOWLEDGE living in many places.         |
|  Danger  : Every change -> many edits; missed copy ->        |
|            silent disagreement -> bug found by a customer.   |
|  DRY     : Every piece of knowledge has ONE home.            |
|            (The Pragmatic Programmer)                        |
|  Rule of : 1st time write, 2nd time wince,                   |
|  Three     3rd time refactor.                                |
|  Caution : Lookalikes that change for different reasons      |
|            are NOT duplication. Wrong abstraction > copies.  |
|  Cures   : Extract Method, Pull Up Method, Form Template     |
|            Method, Extract Class, Substitute Algorithm.      |
+--------------------------------------------------------------+

✏️ Practice exercise

A school fee program has grown by copy-paste. Clean it up.

function tuitionFeeReceipt(student: Student): string {
  let fee = 2000;
  if (student.hasSibling) fee = fee - fee * 0.1;   // sibling discount
  if (student.isStaffChild) fee = fee - fee * 0.5; // staff discount
  return "Receipt for " + student.name + ": Rs " + fee + " (Tuition)";
}
 
function busFeeReceipt(student: Student): string {
  let fee = 800;
  if (student.hasSibling) fee = fee - fee * 0.1;
  if (student.isStaffChild) fee = fee - fee * 0.5;
  return "Receipt for " + student.name + ": Rs " + fee + " (Bus)";
}
 
function labFeeReceipt(student: Student): string {
  let fee = 500;
  if (student.hasSibling) fee = fee - fee * 0.1;
  if (student.isStaffChild) fee = fee - fee * 0.45; // <-- bug? or rule?
  return "Receipt for " + student.name + ": Rs " + fee + " (Lab)";
}

Your tasks:

  1. List the pieces of knowledge that are duplicated. (Hint: there are at least two: the discount rules and the receipt format.)
  2. Extract an applyDiscounts(fee, student) function and a formatReceipt(name, fee, feeType) function. Rewrite all three receipt functions as one-liners using them.
  3. Investigate the 0.45 in labFeeReceipt. Is it a typo that drifted, or a real special rule? Write one sentence for each possibility, and explain what you would do in each case. (This is exactly the "copies disagree silently" problem from Figure 6.)
  4. The school adds a fourth fee: library fee, Rs 300, same discounts. Add it. Count how many lines you needed. Compare with how many lines the copy-paste style would have needed โ€” then check your numbers against the cost curve in Figure 5.
  5. Bonus: a classmate suggests also merging tuitionFeeReceipt from another school's program because "it looks the same". Use the Rule of Three, the wrong-abstraction warning, and the quadrant chart in Figure 10 to explain whether that is true duplication or incidental lookalikes.

If your final version changes the sibling discount in exactly one place, you have earned your DRY badge.

Frequently asked questions

What is duplicate code in simple words?
Duplicate code is the same idea written in more than one place. It may be an exact copy-paste, or two blocks that look slightly different but do the same job. The danger is that every future change must now be made correctly in every copy, and missing even one copy creates a bug.

What is the DRY principle?
DRY stands for Don't Repeat Yourself. It comes from the book The Pragmatic Programmer by Andy Hunt and Dave Thomas. It says every piece of knowledge in a system should have exactly one authoritative home. If a rule lives in one place, you change it once and it can never disagree with itself.

What is the Rule of Three?
It is a practical guideline popularized in Martin Fowler's Refactoring book: write the code the first time, tolerate one copy the second time, and refactor when a third occurrence appears. By the third occurrence you can clearly see the real shared shape, so the abstraction you extract is more likely to be correct.

Is all repeated-looking code really duplication?
No. Two pieces of code that look alike today but change for different business reasons are only accidentally similar; this is called incidental duplication. Merging them couples unrelated rules together, and later you must add flags and parameters to pull them apart. Duplication is cheaper than the wrong abstraction.

Which refactorings remove duplicate code?
Extract Method for copies inside one class, Pull Up Method for identical methods in sibling subclasses, Form Template Method when steps are the same but details differ, Extract Class when scattered copies hide a shared concept, and Substitute Algorithm when two different-looking blocks do the same job.
