CleanCodeMastery

Primitive Obsession: When Everything Is Just a String or a Number

Primitive Obsession explained simply — why plain strings and numbers hide bugs, and how value objects like Money and Address make code safe and clear.

23 min read · Updated June 11, 2026 · beginner
Tags: code smells, primitive obsession, bloaters, refactoring, value objects, domain-driven design

🎯 The Wedding Card That Never Arrived

Mahesh Sharma — "Sharma uncle" to the whole colony — was the happiest man in Jaipur that month. His daughter Priya was getting married, and he was posting invitation cards to two hundred guests. For each envelope, he scribbled the address as one long line:

"Rohan Mehta 14 Lajpat Ngr near old tank Delhi 11024"

See the troubles hiding in that one line? Is "14" the house number or part of the name? Is "old tank" a landmark or a colony? And look closely — the PIN code has only five digits. A PIN code must have six! But nothing on the envelope forces six digits. It is all just one string of ink.

Twenty cards came back undelivered. The postman could not decode them. Twenty guests nearly missed Priya's wedding.

Watching all this was Sharma uncle's nephew, Dev — a young developer who works at TezPost, a courier startup in Bengaluru. Dev helped his uncle re-post the cards using an online form: separate boxes for Name, House No., Street, City, and PIN code — and the PIN box refused to accept anything except exactly six digits. The form caught Sharma uncle's mistake while he was typing, not three weeks later when the card bounced back.

On the train home, Dev could not stop thinking. Last month at TezPost, a customer was charged Rs. 50,000 instead of Rs. 500 — someone had passed paise where the code expected rupees. The week before that, a parcel went to a phone number printed as a PIN code. His uncle's envelopes and his company's bugs were the same disease: important values written as plain, rule-less scribbles.

There is even a money version of the envelope story. A shopkeeper notes "price 500" in his ledger. Five hundred what? Rupees? Paise? One careless assumption and a bill of Rs. 500 becomes a bill of Rs. 5. Plain numbers carry no units, so they happily let us be wrong.

In code, this smell is called Primitive Obsession — the habit of using plain strings and numbers for things that deserve real types with real rules. This lesson follows Dev as he goes back to office, finds the disease in TezPost's code, and cures it with one of the most beautiful ideas in software design: the value object.

💡 What is this smell?

Quick reminder before we define it: a code smell is not a bug. The code compiles, runs, and often gives correct answers. A smell is an early warning that the design will invite bugs and slow down change. Primitive Obsession is catalogued in Martin Fowler's Refactoring and is usually grouped with the "Bloater" smells, though this one bloats the code in a sneaky way: by scattering little checks and conversions everywhere instead of growing one big lump.

Primitive Obsession means using raw language primitives — string, number, int, boolean, arrays — to represent rich domain concepts that deserve their own types:

  • a string that is "really" an email address
  • a number that is "really" money in rupees
  • a string that is "really" a PIN code
  • two numbers that are "really" a latitude and longitude

Why is it called an obsession? Because it is a habit we cannot stop. Primitives are always at hand. When you first need to store an email, string works instantly — no design needed. So the primitive ships. And the next developer copies the pattern. And the concept of "email address" never gets a name in your code, even though your whole business depends on it. Dev finds exactly this at TezPost: the word "parcel" appears in every meeting, but the codebase only knows string and number.

💡

A useful slogan from the domain-driven design world: make invalid states unrepresentable. If a PIN code is its own type that checks six digits at creation, then a five-digit PIN simply cannot exist anywhere in your program. The bug is not caught — it is impossible.

The cure has a beautiful name: the value object — a small type defined by its value, which validates itself on creation and carries its own behaviour. Two Money objects of Rs. 50 are equal, just like two fifty-rupee notes are interchangeable. The popular domain-driven-hexagon project guide treats value objects as a core building block of well-designed backends for exactly this reason.

College corner: In Eric Evans' Domain-Driven Design, value objects are one of the core building blocks of a domain model, alongside entities and services. The key distinctions: value objects have no identity (only value-based equality), are immutable, and enforce their invariants at construction. Functional programmers know the same idea as the "parse, don't validate" principle: convert untyped input into a proof-carrying type once, at the boundary, instead of re-checking it forever.
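
To make "parse, don't validate" concrete, here is a minimal sketch. EmailAddress, greetingFor, and the simplistic format rule are all assumptions made up for this illustration, not part of TezPost's code or of any library.

```typescript
// Hypothetical EmailAddress value object; the format rule is deliberately
// simplistic and only an assumption for this sketch.
class EmailAddress {
  // The private constructor means the only way in is through parse().
  private constructor(private readonly raw: string) {}

  // Parse once at the boundary: untyped string in, checked type (or null) out.
  static parse(input: string): EmailAddress | null {
    const cleaned = input.trim().toLowerCase();
    return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(cleaned)
      ? new EmailAddress(cleaned)
      : null;
  }

  toString(): string {
    return this.raw;
  }
}

// Inner code asks for EmailAddress, so it never re-checks the format.
function greetingFor(email: EmailAddress): string {
  return "Invitation sent to " + email.toString();
}

const parsed = EmailAddress.parse("  Dev@TezPost.example ");
const message = parsed ? greetingFor(parsed) : "Rejected at the boundary";
```

The design choice worth noticing: greetingFor cannot even be called with a raw string, so the "did somebody validate this?" question is answered by the signature instead of by a comment.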

👃 How to spot it

Back at office, Dev runs this checklist over TezPost's codebase. Run it over yours:

  • Variables like string email, string phone, number amount, string status — primitives wearing domain-concept name tags.
  • The same validation repeated in many places: every function that receives a phone number re-checks its length.
  • Magic strings or numbers used as type codes: if (user.plan === "PREMIUM") sprinkled across twenty files.
  • Pairs of primitives that always travel together: an amount with a currencyCode, a latitude with a longitude.
  • Arrays with positional meaning: point[0] is x and point[1] is y — but nothing stops you reading them backwards.
  • Bugs caused by mixed units or swapped values: paise passed where rupees were expected, or two same-typed parameters interchanged silently.

Symptom | What it tells you
string email, string pin, number price | Domain concepts exist in your head but not in your type system
Same regex check in five files | Validation has no home, so it is copy-pasted and will drift apart
if (type === "GOLD") everywhere | A type code is begging to become a real type or enum
amount + currency always side by side | A Money value object is trying to be born
data[3] means "discount" by convention | A positional array is hiding a structured record
A paise/rupee or cm/inch mix-up bug | Numbers without units let physics go wrong silently

Dev's audit of TezPost finds the PIN-length check copied in five files — and they already disagree: four check length === 6, one also rejects PINs starting with zero. Which one is right? Nobody remembers.

⚠️ Why it is a problem

  1. Validation gets duplicated — then drifts. The six-digit PIN rule must be checked at every entry point, because a plain string guarantees nothing. Sooner or later, one copy of the rule is updated and four are not. TezPost is living this right now.
  2. The compiler cannot protect you. sendCard(pinCode, phone) and sendCard(phone, pinCode) look identical to the compiler when both are strings. With distinct types, swapping them is a compile error — a bug caught before the program even runs.
  3. The domain language disappears. Your business talks about Money, Addresses, and PIN codes; your code talks about string and number. New developers must rebuild the meaning in their heads, file by file.
  4. Behaviour has no home. Where does "add two money amounts safely" live? With primitives, the answer is "in seventeen helper functions across the project." With a Money type, the answer is "in Money."
  5. Invalid values travel freely. A negative quantity, a malformed email, a five-digit PIN — primitives accept them all and pass them deep into the system, where they explode far from their source.

And this is not just classroom theory. In 1999, NASA lost the Mars Climate Orbiter, a mission that cost over $300 million, because one piece of ground software produced thruster impulse data in pound-force seconds while the navigation software expected newton-seconds. Both sides were exchanging plain numbers. The numbers had no units attached, so nothing could notice the mismatch, and the spacecraft entered the Martian atmosphere too low and was destroyed. An impulse type that carried its unit would have refused the mix-up. That is Primitive Obsession at planetary scale.
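
What might a unit-carrying type look like? The sketch below is purely illustrative: the Impulse class and its method names are invented here and say nothing about how NASA's software was actually written. The one hard fact it relies on is the definition of one pound-force as exactly 4.4482216152605 newtons.

```typescript
// Hypothetical Impulse value object: always stored in newton-seconds.
class Impulse {
  // 1 pound-force is defined as exactly 4.4482216152605 newtons.
  private static readonly NEWTONS_PER_LBF = 4.4482216152605;

  private constructor(private readonly newtonSeconds: number) {}

  static fromNewtonSeconds(value: number): Impulse {
    return new Impulse(value);
  }

  // The caller must name the unit; the conversion happens exactly once, here.
  static fromPoundForceSeconds(value: number): Impulse {
    return new Impulse(value * Impulse.NEWTONS_PER_LBF);
  }

  add(other: Impulse): Impulse {
    return new Impulse(this.newtonSeconds + other.newtonSeconds);
  }

  inNewtonSeconds(): number {
    return this.newtonSeconds;
  }
}

// Two teams, two units, one type: the mismatch is resolved at creation.
const fromGroundSoftware = Impulse.fromPoundForceSeconds(10);
const fromNavigation = Impulse.fromNewtonSeconds(5.5);
const totalImpulse = fromGroundSoftware.add(fromNavigation).inNewtonSeconds();
```

Because there is no factory that accepts a bare number without a unit in its name, "10 of what?" is a question the caller is forced to answer.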

Figure 1: How one innocent primitive grows into a system-wide problem

When Dev categorises three months of TezPost's production bugs, the result convinces his manager instantly. More than half trace back to rule-less primitives:

Figure 2: Three months of TezPost bugs, sorted by root cause — primitives dominate

There is also a quieter cost: every module that handles a raw value must re-validate it, so the number of scattered checks grows with the codebase itself:

Figure 3: Copies of the same validation rule as the codebase grows — without a type, checks multiply

With a PinCode type, that line stays flat at exactly one, forever.

College corner: The drift problem is a violation of the Single Source of Truth principle applied to invariants. Type-system researchers call the cure "newtypes" or "branded types": zero-cost wrappers whose only job is to make two same-shaped values incompatible. TypeScript fakes this with branded intersection types, Haskell and Rust support it natively, and C# gets it with record struct. The runtime representation is often identical — the safety is purely compile-time, which means it is free.
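
A branded type in TypeScript can be sketched in a few lines. The Brand helper, the __brand property name, and the UserId/OrderId examples are conventions assumed for this illustration; the language has no built-in brand feature.

```typescript
// A brand is a phantom property: it exists only in the type system.
type Brand<T, Name extends string> = T & { readonly __brand: Name };

type UserId = Brand<string, "UserId">;
type OrderId = Brand<string, "OrderId">;

// Hypothetical constructors; a real one would validate the format here.
function userId(raw: string): UserId {
  return raw as UserId;
}
function orderId(raw: string): OrderId {
  return raw as OrderId;
}

function findUser(id: UserId): string {
  return "user:" + id;
}

const someUser = userId("U-42");
const someOrder = orderId("O-42");
const found = findUser(someUser);
// findUser(someOrder); // rejected by the type checker: OrderId is not a UserId
// findUser("U-42");    // rejected too: a plain string carries no brand

// At runtime both values are still ordinary strings, so the wrapper costs nothing.
const bothPlainStrings = typeof someUser === "string" && typeof someOrder === "string";
```

The trade-off: a brand gives compile-time separation only. It carries no behaviour and no runtime check unless the constructor function adds one, which is why the classes later in this lesson do more work.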

🧪 Dev chases the paise bug — a live demo of the pain

Before fixing anything, Dev re-traces last month's Rs. 50,000 disaster to show his team where the smell bites:

Figure 4: The anatomy of the paise bug — a unitless number crosses three layers before exploding

Four layers handled the number. None of them could know it was paise, because a number carries no unit. The bug exploded at the customer's bank account — the place furthest from its source. Dev's debugging week looked like this:

Figure 5: Dev's week chasing one unitless number through the system

Notice the saddest row: the fix itself was easy, but Dev had to paste guard checks into four places — planting the seeds of the next drift bug. Patching a primitive does not cure the disease; it spreads it politely.

📊 Which values deserve a type?

"So should I wrap every string in the project?" asks Dev's junior teammate. No! Wrapping everything is its own kind of noise. Dev draws two axes on the whiteboard: does the value have rules, and how far does it travel through the system?

Figure 6: Wrap decision — values with rules that travel far earn a type first

Money and PIN codes sit deep in "Wrap it now": heavy rules, travelling through every module. A loop counter has neither rules nor reach — leave it alone.

🧪 A real-life code example

Let us look at the TezPost code Dev found — Sharma uncle's problem, in TypeScript. The courier app stores everything as primitives:

// Everything is a primitive. What could go wrong?
function createParcel(
  recipientName: string,
  address: string,          // one long line, like the envelope
  pinCode: string,          // hopefully six digits?
  codAmount: number,        // rupees? paise? who knows
  phone: string,
): Parcel {
  // every function must re-validate everything
  if (pinCode.length !== 6) throw new Error("Bad PIN");
  if (codAmount < 0) throw new Error("Bad amount");
  return { recipientName, address, pinCode, codAmount, phone };
}
 
function printLabel(p: Parcel): string {
  return p.recipientName + ", " + p.address + " - " + p.pinCode;
}
 
function chargeCod(p: Parcel, deliveryFee: number): number {
  // is deliveryFee in rupees or paise? the last bug was exactly this
  return p.codAmount + deliveryFee;
}
 
// Call site - spot the two bugs the compiler happily accepts:
const parcel = createParcel(
  "Rohan Mehta",
  "14 Lajpat Ngr near old tank Delhi",
  "98113",                 // five digits - runtime error at best
  50000,                   // meant Rs. 500.00 entered as paise
  "110024",                // phone and PIN swapped? both are strings!
);

Walk through the pain, exactly as Dev walked his team through it:

  • The address is one string, so the app can never reliably extract the city for sorting, or check that a PIN exists at all. Just like the postman with Sharma uncle's envelope, our code must guess.
  • The PIN check lives inside createParcel — but updateAddress, importParcels, and the API endpoint must each repeat it. Five copies of one rule, already drifting.
  • codAmount is a bare number. One developer thinks rupees, another thinks paise. The customer is charged a hundred times too much, or too little.
  • The compiler cannot see that "110024" was placed in the phone slot. Both are strings; both fit.
🚨

Two parameters of the same primitive type, side by side, are a transposition trap. The compiler will never catch the swap. Distinct types turn this runtime mystery into a compile-time error.

🛠️ Cleaning it up, step by step

Dev's main tool is Replace Data Value with Object: promote a bare primitive into a small class that owns its rules.

Step 1: Give the PIN code a type. The rule moves to exactly one place — the constructor — and an invalid PIN becomes impossible to create.

class PinCode {
  private constructor(readonly value: string) {}
 
  static of(raw: string): PinCode {
    const cleaned = raw.trim();
    if (!/^[1-9][0-9]{5}$/.test(cleaned)) {
      throw new Error("PIN code must be 6 digits and must not start with 0: " + raw);
    }
    return new PinCode(cleaned);
  }
}

From this moment, any function that receives a PinCode knows it is valid. No re-checking, anywhere, ever. The five drifting copies at TezPost collapse into this one constructor — and the team finally settles the "leading zero" debate in a single code review.

Step 2: Give money a type that knows its unit. We store paise internally (whole numbers avoid decimal rounding troubles) but speak rupees at the edges.

class Money {
  private constructor(readonly paise: number) {}
 
  static fromRupees(rupees: number): Money {
    if (!Number.isFinite(rupees) || rupees < 0) throw new Error("Money must be a non-negative number");
    return new Money(Math.round(rupees * 100));
  }
 
  add(other: Money): Money {
    return new Money(this.paise + other.paise);
  }
 
  toString(): string {
    return "Rs. " + (this.paise / 100).toFixed(2);
  }
}

The rupee/paise confusion now has only one place left to hide: the factory call where a raw number first becomes Money, and there the name fromRupees states the unit out loud. Past that point, both sides only ever exchange Money, so raw paise can never be handed to code expecting rupees. The Rs. 50,000 bug stops depending on everyone being careful, because the careless version no longer compiles.
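
A short usage sketch shows why whole paise are the safer internal form. The class below is a compact copy of Money from Step 2; the fromPaise factory is an assumed addition for callers (a payment gateway, say) that genuinely hold paise.

```typescript
// Compact copy of the Money class above, plus a hypothetical fromPaise factory.
class Money {
  private constructor(readonly paise: number) {}

  static fromRupees(rupees: number): Money {
    if (rupees < 0) throw new Error("Money cannot be negative");
    return new Money(Math.round(rupees * 100));
  }

  // Assumed addition for this sketch: callers holding paise must say so.
  static fromPaise(paise: number): Money {
    if (!Number.isInteger(paise) || paise < 0) {
      throw new Error("Paise must be a non-negative whole number");
    }
    return new Money(paise);
  }

  add(other: Money): Money {
    return new Money(this.paise + other.paise);
  }

  toString(): string {
    return "Rs. " + (this.paise / 100).toFixed(2);
  }
}

// Floating-point rupees drift; whole paise do not.
const floatSum = 0.1 + 0.2; // not exactly 0.3 in binary floating point
const moneySum = Money.fromRupees(0.1).add(Money.fromRupees(0.2)); // exactly 30 paise

// The same Rs. 500 arrives from two callers, each naming its unit.
const fromForm = Money.fromRupees(500);
const fromGateway = Money.fromPaise(50000);
const codTotal = fromForm.add(Money.fromRupees(49.5)).toString(); // "Rs. 549.50"
```

Both callers end up with the same value, and neither can forget which unit it was speaking, because the unit is part of the factory name.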

Step 3: Replace the one-line address with a structured type, and assemble. The address graduates from a scribbled envelope line to a proper form. (Grouping street + city + PIN into one type is also a preview of Introduce Parameter Object — and of our Data Clumps lesson.)

class Address {
  constructor(
    readonly houseNo: string,
    readonly street: string,
    readonly city: string,
    readonly pin: PinCode,
  ) {}
 
  label(): string {
    return `${this.houseNo}, ${this.street}, ${this.city} - ${this.pin.value}`;
  }
}
 
function createParcel(
  recipientName: string,
  address: Address,
  codAmount: Money,
  phone: PhoneNumber,
): Parcel {
  // nothing to validate here - every input validated itself at birth
  return { recipientName, address, codAmount, phone };
}
 
function chargeCod(p: Parcel, deliveryFee: Money): Money {
  return p.codAmount.add(deliveryFee); // units can never mix
}

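Compare the before and after. createParcel has no validation left, not because we got careless, but because invalid inputs can no longer reach it. The phone/PIN swap becomes a compile error, provided PhoneNumber and PinCode are not structurally identical: TypeScript compares classes by shape, so give each one a private field or a distinctly named property. The label-printing logic lives on Address, where it belongs. Each rule exists in exactly one place.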
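
The refactored createParcel mentions a PhoneNumber type that the steps above never define. Here is one possible sketch of it. The ten-digit rule is an assumption for illustration, and PinCode is repeated with its value moved into a private field, because TypeScript treats two classes with separately declared private fields as incompatible types.

```typescript
// PinCode as in Step 1, but with the digits held in a private field.
class PinCode {
  private constructor(private readonly digits: string) {}

  static of(raw: string): PinCode {
    const cleaned = raw.trim();
    if (!/^[1-9][0-9]{5}$/.test(cleaned)) {
      throw new Error("Invalid PIN code: " + raw);
    }
    return new PinCode(cleaned);
  }

  get value(): string {
    return this.digits;
  }
}

// Hypothetical PhoneNumber; the ten-digit rule is an assumption for this sketch.
class PhoneNumber {
  private constructor(private readonly digits: string) {}

  static of(raw: string): PhoneNumber {
    const cleaned = raw.replace(/[\s-]/g, ""); // tolerate spaces and hyphens
    if (!/^[0-9]{10}$/.test(cleaned)) {
      throw new Error("Invalid phone number: " + raw);
    }
    return new PhoneNumber(cleaned);
  }

  get value(): string {
    return this.digits;
  }
}

function sendCard(pin: PinCode, phone: PhoneNumber): string {
  return "PIN " + pin.value + ", call " + phone.value;
}

const card = sendCard(PinCode.of("110024"), PhoneNumber.of("98113 00000"));
// sendCard(PhoneNumber.of("9811300000"), PinCode.of("110024"));
// ^ rejected by the type checker: each class declares its own private field
```

Had both classes exposed only a public readonly value: string, the swapped call would have type-checked, since their shapes would match. The private field is what buys the nominal behaviour.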

Here is the new design, drawn as the diagram Dev presents at the team demo:

Figure 7: The refactored parcel — every concept is a self-validating value object

Two more tools complete the kit. When a magic string like "PREMIUM" selects behaviour, use Replace Type Code with Class (or its subclass/strategy cousins). When a positional array like point[0], point[1] pretends to be a record, use Replace Array with Object.
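
Both helper refactorings can be sketched briefly. In TypeScript, a union of string literals is often the lightest form of "a real type" for a type code, so it stands in for a full class here. The Plan values, the fee table, and GeoPoint are hypothetical names chosen for this example.

```typescript
// Replace Type Code: a closed union instead of free-form strings.
type Plan = "FREE" | "PREMIUM";

// Hypothetical parser: raw input becomes a Plan once, with casing settled here.
function parsePlan(raw: string): Plan {
  const upper = raw.trim().toUpperCase();
  if (upper === "FREE" || upper === "PREMIUM") return upper;
  throw new Error("Unknown plan: " + raw);
}

// Assumed fee table for illustration; Record<Plan, ...> forces every plan to be listed.
const deliveryFeeRupees: Record<Plan, number> = { FREE: 40, PREMIUM: 0 };

// Replace Array with Object: named fields instead of point[0] and point[1].
interface GeoPoint {
  readonly latitude: number;
  readonly longitude: number;
}

function geoPointFromPair(pair: readonly [number, number]): GeoPoint {
  // The positional convention (latitude first) is decoded in exactly one place.
  return { latitude: pair[0], longitude: pair[1] };
}

const chosenPlan = parsePlan(" premium ");
const fee = deliveryFeeRupees[chosenPlan];
const jaipur = geoPointFromPair([26.9124, 75.7873]);
```

After this, a comparison against a misspelled plan name fails to compile instead of failing silently, and nobody has to remember which array slot holds the latitude.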

Figure 8: Validation scattered everywhere versus validation owned by value objects

College corner: Value objects should be immutable: once constructed, never changed. Immutability is what makes value-based equality safe — a mutable "value" used as a dictionary key can silently corrupt the dictionary when it changes. It also makes value objects thread-safe for free and lets methods like add return new instances instead of mutating, which is exactly how Money.add above works.
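
One practical wrinkle in TypeScript: the === operator compares object references, so value-based equality has to be written by hand. The sketch below assumes a hypothetical Rupees class with an equals method and a key method; both names are illustrative choices, not a standard interface.

```typescript
// Hypothetical Rupees value object with explicit value-based equality.
class Rupees {
  private constructor(private readonly paise: number) {
    Object.freeze(this); // runtime guard; `readonly` alone is checked only at compile time
  }

  static of(rupees: number): Rupees {
    if (!Number.isFinite(rupees) || rupees < 0) {
      throw new Error("Amount must be a non-negative number");
    }
    return new Rupees(Math.round(rupees * 100));
  }

  // Two values are equal when their amounts are equal.
  equals(other: Rupees): boolean {
    return this.paise === other.paise;
  }

  // A stable primitive key for Map and Set lookups.
  key(): string {
    return "INR:" + this.paise;
  }
}

const noteA = Rupees.of(50);
const noteB = Rupees.of(50);

const sameReference = noteA === noteB; // false: two distinct objects
const sameValue = noteA.equals(noteB); // true: same amount

// Maps compare object keys by reference, so key by the primitive form instead.
const seen = new Map<string, string>();
seen.set(noteA.key(), "fifty-rupee note");
const lookedUp = seen.get(noteB.key()); // "fifty-rupee note"
```

Because the object is frozen and its field is readonly, the key can never change after insertion, which is exactly the dictionary-corruption risk described above being ruled out.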

🔄 The life cycle of this smell

Primitive Obsession follows a recognisable arc in every codebase. The dangerous part is the middle, where the primitive becomes "the convention":

Figure 9: The life cycle of a primitive — from quick convenience to codebase convention, and the way back

Note the trap loop: Bitten back to Obsessed. Patching the symptom with one more scattered check feels responsible, but it deepens the obsession. The exit is the boundary rewrite: parse raw input into value objects once, at the edge, and let only typed values travel inward.
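
The boundary rewrite might look like the sketch below. Every name in it (RawParcelInput, ParsedParcel, parseParcel, the field names) is hypothetical, and lightweight brands stand in for full value-object classes only to keep the example short.

```typescript
// Hypothetical raw shape, as it might arrive from a JSON request body.
interface RawParcelInput {
  recipientName: string;
  pinCode: string;
  codRupees: number;
}

// Brands mark values that have already passed their checks.
type CheckedPin = string & { readonly __kind: "CheckedPin" };
type Paise = number & { readonly __kind: "Paise" };

// Typed shape that inner code receives.
interface ParsedParcel {
  readonly recipientName: string;
  readonly pin: CheckedPin;
  readonly cod: Paise;
}

type ParseResult =
  | { ok: true; parcel: ParsedParcel }
  | { ok: false; errors: string[] };

// The only place in the system where raw values are inspected.
function parseParcel(raw: RawParcelInput): ParseResult {
  const errors: string[] = [];
  const recipientName = raw.recipientName.trim();
  const pin = raw.pinCode.trim();

  if (recipientName.length === 0) errors.push("recipientName is empty");
  if (!/^[1-9][0-9]{5}$/.test(pin)) errors.push("pinCode must be six digits");
  if (!Number.isFinite(raw.codRupees) || raw.codRupees < 0) {
    errors.push("codRupees must be a non-negative number");
  }
  if (errors.length > 0) return { ok: false, errors };

  return {
    ok: true,
    parcel: {
      recipientName,
      pin: pin as CheckedPin,
      cod: Math.round(raw.codRupees * 100) as Paise,
    },
  };
}

const goodParcel = parseParcel({ recipientName: "Rohan Mehta", pinCode: "110024", codRupees: 500 });
const badParcel = parseParcel({ recipientName: "", pinCode: "11024", codRupees: -1 });
```

Collecting every error before returning, rather than throwing on the first one, is a design choice that suits forms and APIs: the sender can fix all the mistakes in one round trip.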

🧰 The same smell in C#

C# makes value objects delightfully short with record types. Here is the smelly version of a school fee system:

public void RecordFeePayment(string studentId, decimal amount, string currency)
{
    if (amount <= 0) throw new ArgumentException("Bad amount");
    if (currency != "INR") throw new ArgumentException("Only INR supported");
    // ... save payment
}

And the clean version, where Rupees guards itself:

public readonly record struct Rupees
{
    public decimal Amount { get; }
 
    public Rupees(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentException("Amount must be positive");
        Amount = amount;
    }
 
    public static Rupees operator +(Rupees a, Rupees b) => new(a.Amount + b.Amount);
    public override string ToString() => $"Rs. {Amount:N2}";
}
 
public void RecordFeePayment(StudentId studentId, Rupees amount)
{
    // nothing to check - a Rupees value is valid by construction
    // ... save payment
}

A record struct gives value-based equality for free: two Rupees(500) values are equal, exactly as two five-hundred-rupee notes are. One caveat: because it is a struct, default(Rupees) skips the constructor and yields an Amount of 0, so prefer a class-based record if that loophole matters. The .NET community has written extensively about this pattern as the standard cure for primitive obsession. Note the StudentId type too: even an ID with no format rules earns a wrapper, purely so it can never be confused with an OrderId.

🔍 Where this smell hides in real projects

  • IDs as raw strings or ints. userId, orderId, and productId all typed as string — until the day someone passes an order ID into a user lookup. Strongly-typed ID wrappers are a famous cure in DDD-style codebases.
  • Money as double or float. Doubly dangerous: no currency attached and binary floating point cannot represent 0.1 exactly, so paise leak in long calculations. Finance code uses decimal-based Money types for both reasons.
  • Status and type codes as strings. "ACTIVE", "PENDING", "premium" versus "PREMIUM" — one casing mistake and a comparison silently fails.
  • Dates and durations as numbers. Is timeout = 30 seconds or milliseconds? Entire bug categories live in that question; modern APIs pass Duration/TimeSpan types instead.
  • API boundaries and DTOs. Data arrives from JSON as strings and numbers — fine at the very edge, but smelly when those raw values are allowed to wander deep into business logic without being promoted to domain types. Well-structured projects (see the domain-driven-hexagon guide) convert primitives to value objects at the boundary.
  • Scientific and engineering software. The Mars Climate Orbiter story is the canonical warning: unit-less numbers crossing a team boundary destroyed a spacecraft.
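
The "seconds or milliseconds?" trap from the list above has a small cure of its own. This is a minimal sketch with a made-up Duration class and a hypothetical describeTimeout function; it is not the API of any particular date or time library.

```typescript
// Hypothetical Duration value object: milliseconds inside, named units at the edges.
class Duration {
  private constructor(private readonly ms: number) {}

  static ofSeconds(seconds: number): Duration {
    return Duration.ofMillis(seconds * 1000);
  }

  static ofMillis(millis: number): Duration {
    if (!Number.isFinite(millis) || millis < 0) {
      throw new Error("Duration must be a non-negative number");
    }
    return new Duration(millis);
  }

  toMillis(): number {
    return this.ms;
  }
}

// Hypothetical consumer: the signature no longer invites the question "30 what?".
function describeTimeout(timeout: Duration): string {
  return "timeout after " + timeout.toMillis() + " ms";
}

const description = describeTimeout(Duration.ofSeconds(30)); // "timeout after 30000 ms"
```

The unit appears exactly once, in the factory name at the call site, and everything downstream receives a value that cannot be misread.
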
ℹ️

Your type checker is the cheapest test suite you will ever own. Every swap, unit mix-up, and invalid value that a value object prevents is a unit test you never have to write, run, or maintain.

🤔 When it is okay to ignore

Situation | Ignore the smell? | Why
Loop counters, array indexes, temporary flags | ✅ Yes | No rules, no domain meaning; wrapping adds pure ceremony
A value with validation used across many modules | ❌ No | Centralising the rule pays for itself many times over
Tiny script or throwaway prototype | ✅ Yes | The design will not live long enough to collect the payoff
Money, IDs, emails, phone numbers in a long-lived system | ❌ No | These are the classic, highest-payoff candidates
Performance-critical inner loop, after measuring | ✅ Sometimes | Rarely, wrapper allocation matters, but structs/records usually cost nothing
Value used once, in one function, with no rules | ✅ Yes | A wrapper for a value with no invariants protects nothing

The honest rule: wrap a primitive when the value has rules to enforce, behaviour to own, or a meaning worth naming. The payoff grows with how far the value travels through your system — which is exactly what the quadrant in Figure 6 plots.

💊 Which refactorings cure it

Refactoring | When to use it
Replace Data Value with Object | The main cure: promote a bare primitive into a value object owning its validation and behaviour
Introduce Parameter Object | Primitives that always travel together (amount + currency) become one type
Replace Type Code with Class | Magic strings or ints that select a category become a real type
Replace Array with Object | Positional arrays (p[0] is x, p[1] is y) become objects with named fields
Extract Class | Several related primitives in one class become a structured component like Address

🧠 The whole smell on one page

Dev's final whiteboard sketch, photographed by half the team:

Figure 10: Primitive Obsession at a glance — signs, causes, costs, and cures

📦 Quick revision box

+------------------------------------------------------------------+
|              PRIMITIVE OBSESSION - CHEAT SHEET                   |
+------------------------------------------------------------------+
| What     : Raw strings/numbers playing the role of rich          |
|            concepts (the address scribbled on one line)          |
| Family   : Bloaters                                              |
| Spot it  : string email, repeated validation, magic type         |
|            codes, unit mix-ups, positional arrays                |
| Costs    : Duplicated rules, no compiler safety, lost            |
|            domain language, invalid states travel freely         |
| Main fix : Replace Data Value with Object (value objects)        |
| Helpers  : Introduce Parameter Object, Replace Type Code,        |
|            Replace Array with Object                             |
| Ignore   : Loop counters, throwaway scripts, rule-free values    |
| Mantra   : "Make invalid states unrepresentable."                |
+------------------------------------------------------------------+

✍️ Practice exercise

Dev's homework for his juniors — and for you. This little ticket-booking function is drowning in primitives. Rescue it!

function bookTicket(
  passengerName: string,
  age: number,
  from: string,        // station code like "NDLS"
  to: string,          // station code like "BCT"
  farePaise: number,   // careful - paise, not rupees!
  mobile: string,
): string {
  if (age < 0 || age > 120) throw new Error("Bad age");
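  // assumption: station codes are 3 or 4 letters, like "BCT" and "NDLS"
  const badCode = (c: string) => c.length < 3 || c.length > 4;
  if (badCode(from) || badCode(to)) throw new Error("Bad station code");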
  if (mobile.length !== 10) throw new Error("Bad mobile");
  if (farePaise < 0) throw new Error("Bad fare");
 
  const discounted = age >= 60 ? farePaise * 0.6 : farePaise;
  return passengerName + ": " + from + " -> " + to + ", Rs." + discounted / 100;
}
 
// Spot the danger at this call site:
bookTicket("Meera", 65, "BCT", "NDLS", 145000, "9876543210");
// Did the caller mean Rs. 1,450 (145000 paise) or Rs. 1,45,000 (145000 rupees)? And are from/to in the right order?

Your tasks:

  1. Create value objects: Age, StationCode, Money (with a fromRupees factory), and MobileNumber. Each validates itself in its constructor.
  2. Rewrite bookTicket to accept these types. How many lines of validation remain inside it?
  3. Move the senior-citizen discount into the Money type or a Fare type as a method like withDiscount(percent).
  4. Bonus thought: from and to are both StationCode — types alone cannot stop a swap between them. What could? (Hint: a Route object with named fields — which is really a small Data Clump being fixed. That is our topic two lessons from now.)

If your final bookTicket contains zero validation lines and reads like plain English, you have escaped the obsession — and unlike Sharma uncle's twenty bounced envelopes, your parcels will always reach.

Frequently asked questions

What exactly is a 'primitive'?
A primitive is a basic built-in type of the language: string, number, boolean, int, decimal, arrays of these, and so on. They are the raw clay of programming. The smell appears when we use raw clay for things that deserve a proper shape, like an email address, money, or a phone number.
Should I wrap every single value in a class?
No! A loop counter should stay an int. Wrap a value only when it has rules (validation), behaviour (operations), or a real meaning in your domain. A Money amount that flows through pricing, tax, and billing earns its type many times over. A temporary index does not.
What is a 'value object'?
A value object is a small type defined by its value, not by an identity. Two Money objects of Rs. 50 are equal and interchangeable, just like two fifty-rupee notes. Value objects validate themselves on creation, so an invalid one can never exist.
Does wrapping primitives slow the program down?
The cost is tiny and usually invisible. Many languages offer cheap wrappers — C# structs and records, TypeScript types and classes. The bugs prevented (like mixing rupees with paise) cost far more than a few nanoseconds.
What real disaster is linked to this smell?
NASA's Mars Climate Orbiter was lost in 1999 partly because one piece of software produced thruster impulse data in pound-force seconds while another expected newton-seconds: plain numbers with no units attached. A unit-carrying type would have caught the mismatch. The mission cost over 300 million dollars.
