CleanCodeMastery

Data Class: The Register With No Rules — Anyone Can Scribble Anything

Learn the Data Class smell with a society register story. See why data without behavior breaks encapsulation, and when DTOs and records are perfectly fine.

21 min read · Updated June 11, 2026 · beginner
Tags: code-smells, dispensables, data-class, anemic-domain-model, encapsulation, refactoring, typescript, csharp

📓 The register where anyone scribbles anything

Sagar Apartments in Mumbai keeps a visitor register at the gate. It is just a plain notebook with columns drawn in ballpoint pen: name, flat number, time in, time out. No rules. No watchman checking entries — the watchman, Ganpat bhai, is usually helping someone park. The pen hangs on a string, and whoever comes writes whatever they like.

The society secretary, Mr. Kulkarni, is proud of this register. "Full record of everyone who enters," he tells the committee. "Complete security."

After one month, look inside the register:

  • One visitor wrote his name as "guest". Just "guest".
  • Someone entered flat number 1403 — the building has only 8 floors.
  • A delivery boy wrote his time-in as 25:70.
  • One full row is blank except a doodle of a cat.
  • A teenager named Rohan went back and quietly changed his friend's exit time from 11 pm to 9 pm so nobody would know they came back late from the cricket match.

Then, one Tuesday, a bicycle goes missing from the parking. Mr. Kulkarni marches to the gate, opens the register to check who visited that evening — and finds it useless. "guest" visited flat 1403 at 25:70. The data is garbage, because the register had no rules. It would happily accept anything, from anyone, at any time, and let anyone change anything afterwards. The notebook stored data; it never protected data.

Figure 1: One month in the life of the rule-less register

Compare this with the bank just down the road. Try writing "guest" as your name on a bank form. Try depositing to account 1403 when accounts have 11 digits. The clerk stops you instantly, at the counter, before the wrong data ever enters the books. The bank's "register" has a guardian who enforces rules at the moment of writing — so the stored data can always be trusted. That is the whole difference between the two notebooks: not the paper, the guardian.

In code, a class that is only fields plus getters and setters — and no rules, no behavior, no guardian — is the visitor notebook. This is the Data Class smell.

🤔 What is this smell?

A Data Class is a class that holds fields and accessors but no behavior. It cannot validate itself, calculate anything about itself, or protect itself. All the thinking about its data is done by other classes, scattered across the codebase — forcing everyone else to know its rules and to repeat them.

Martin Fowler describes a system-wide version of this smell on his bliki: the Anemic Domain Model. The objects look like a real domain model (Order, Customer, Invoice), but they are bloodless: just bags of getters and setters. All the logic lives in big procedural "service" classes that pull data out of the bags, think for them, and stuff results back in. Fowler calls this an anti-pattern because it incurs all the costs of a domain model while yielding none of its benefits.

The principle being broken has a memorable name: Tell, Don't Ask. Healthy object design says: tell the object what you want (order.total()), and let it use its own data. Anemic design instead asks for the raw data (order.lines, order.discountRate) and computes outside — in five different places, five slightly different ways.
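
To make the contrast concrete, here is a minimal TypeScript sketch of both styles. The OrderLine shape, the Order class, and the function name totalByAsking are illustrative assumptions, not code from any real system.

```typescript
// Illustrative sketch: every name here is hypothetical.
interface OrderLine {
  unitPrice: number;
  quantity: number;
}

// Ask style: the caller pulls raw data out and does the thinking itself.
function totalByAsking(lines: OrderLine[], discountRate: number): number {
  let subtotal = 0;
  for (const line of lines) subtotal += line.unitPrice * line.quantity;
  return subtotal - subtotal * discountRate;
}

// Tell style: the order owns its data and answers the question itself.
class Order {
  private readonly lines: OrderLine[] = [];
  private discountRate = 0;

  addLine(line: OrderLine): void {
    this.lines.push({ ...line }); // store a copy so callers cannot edit it later
  }

  applyDiscount(rate: number): void {
    // Written as a negated range check so that NaN is rejected too.
    if (!(rate >= 0 && rate <= 1)) {
      throw new RangeError("Discount must be between 0 and 1");
    }
    this.discountRate = rate;
  }

  total(): number {
    const subtotal = this.lines.reduce(
      (sum, l) => sum + l.unitPrice * l.quantity,
      0,
    );
    return subtotal - subtotal * this.discountRate;
  }
}

const order = new Order();
order.addLine({ unitPrice: 100, quantity: 2 });
order.addLine({ unitPrice: 50, quantity: 1 });
order.applyDiscount(0.5);
console.log(order.total()); // 125
```

Both versions compute the same number today. The difference is that totalByAsking can be copied and altered by any caller, while Order.total() has one definition that every caller shares.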

💡 A class is the natural guardian of its own data. When data and the rules about that data live in the same class, the rules are enforced in one place, and no caller can bypass them through the class's public methods. When they live apart, every caller becomes responsible for remembering the rules, and someone always forgets.

One thing right away, because it matters: not every behavior-free class is a smell. DTOs, records, and view models are meant to be plain data, and we will give them their full honest defense later in this post. The smell is specifically a domain object that should own behavior but does not.

College corner: The deep idea here is the invariant — a condition that must hold for an object during its whole lifetime ("discount is between 0 and 1", "exit time is after entry time"). Encapsulation, in its serious meaning, is not "make fields private and add getters" — it is making invariants impossible to break from outside. A data class has private fields but zero invariants, so it is encapsulated only in syntax, not in substance. In Domain-Driven Design this becomes the aggregate pattern: an aggregate root (like Order) is the single entry point that guards the invariants of everything inside it. The visitor register with a clerk is an aggregate; the notebook on a string is not.
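
One common way to make an invariant unbreakable from outside is a private constructor plus a checked factory. The sketch below applies that to the "discount is between 0 and 1" invariant; the DiscountRate class and its method names are hypothetical and only illustrate the idea.

```typescript
// Hypothetical value object: every instance satisfies 0 <= value <= 1.
class DiscountRate {
  // Private constructor: the only way in is through the checked factory below.
  private constructor(readonly value: number) {}

  static of(value: number): DiscountRate {
    if (!Number.isFinite(value) || value < 0 || value > 1) {
      throw new RangeError(`Discount must be between 0 and 1, got ${value}`);
    }
    return new DiscountRate(value);
  }

  // Behavior that relies on the invariant lives next to the data it needs.
  applyTo(amount: number): number {
    return amount - amount * this.value;
  }
}

const quarterOff = DiscountRate.of(0.25);
console.log(quarterOff.applyTo(200)); // 150
```

Because value is readonly and the constructor is private, code that type-checks cannot produce a DiscountRate holding 1.7 or NaN. That is encapsulation in substance rather than in syntax.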

Here is the territory in one map:

Figure 2: The full map of the Data Class smell

🔍 How to spot it

Checklist for your code review:

  • A class with only public fields, or private fields wrapped in mechanical getters and setters that any IDE could generate.
  • Logic that operates on the class's data lives everywhere except inside the class.
  • The same validation or calculation on this class's fields is repeated in several callers.
  • Collections handed out raw through getters, so outsiders can add or remove items behind the class's back.
  • The class travels through the whole codebase, but only ever to have its fields read or poked.

| Question | Smelly answer | Healthy answer |
|---|---|---|
| Who computes the order total? | Every caller, separately | order.total(), once |
| Who checks the discount is between 0 and 1? | Hopefully someone, somewhere | The applyDiscount method, always |
| Can an outsider empty the lines list? | Yes: order.lines is the real list | No: read-only view + addLine() |
| Can the object ever hold nonsense values? | Yes, any field, any value | No: guarded at every entry point |
| Where do I read to learn the rules? | Every caller in the codebase | The class itself, one file |

A useful sorting tool: place any data-holding class on this chart. The danger zone is "has real rules to guard" plus "guards nothing".

Figure 3: Which data-holding classes are actually the smell?

⚠️ Why it is a problem

Problem 1: Encapsulation collapses. When anyone can set any field to any value, the class cannot protect its own correctness. A -500 rupee price, a flat 1403, a 25:70 time — the class accepts them all silently. Correctness now depends on every caller remembering every rule, forever.

Problem 2: Rules get duplicated. The "discount must be 0 to 1" rule gets written in the order screen, the admin screen, and the import job. Three copies. When the rule changes to "maximum 0.5", you must find all three — this is the Duplicate Code smell being born directly from the Data Class smell.

Problem 3: The data has no single explanation. To understand what discountRate means and how it may legally change, you must read every place that touches it. With a behavior-owning class, you read one file.

Problem 4: Leaked internals invite back-stabbing. A getter that returns the real internal list lets any caller do order.lines.clear() from anywhere. The order is corrupted, and the stack trace points nowhere near the culprit. Rohan editing his friend's exit time is exactly this: write access to internals, no audit, no guard.
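
A small sketch of that leak, and of one way to close it. LeakyOrder and GuardedOrder are hypothetical classes written only for this illustration.

```typescript
// Hypothetical classes for illustration only.
class LeakyOrder {
  private items: string[] = ["pen", "notebook"];

  // Hands out the real internal array: callers can mutate it freely.
  getItems(): string[] {
    return this.items;
  }
}

class GuardedOrder {
  private items: string[] = ["pen", "notebook"];

  // Hands out a copy typed as read-only: changes to it never reach the original.
  getItems(): readonly string[] {
    return [...this.items];
  }

  itemCount(): number {
    return this.items.length;
  }
}

const leaky = new LeakyOrder();
leaky.getItems().length = 0;           // compiles, and empties the order
console.log(leaky.getItems().length);  // 0

const guarded = new GuardedOrder();
const snapshot = guarded.getItems();
// snapshot.length = 0;                // compile error: 'length' is read-only
(snapshot as string[]).length = 0;     // even a forced cast only empties the copy
console.log(guarded.itemCount());      // 2
```

The private keyword on items did nothing for LeakyOrder, because the getter returned the very object it was supposed to hide.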

Problem 5: It feeds Feature Envy. The behavior that should live on the data class must live somewhere — so it squats in services and helpers, enviously poking at the data class's fields all day. Data Class and Feature Envy are two sides of the same coin.

Watch the moment garbage enters, in slow motion. Notice that the object never objects:

Figure 4: The anemic object accepts everything; the bug is found much later, far away

The damage also grows with the number of places that touch the data. Each new caller is one more place where a rule can be forgotten:

Figure 5: More callers touching raw fields means more rule copies and more drift

With a rich class, that line stays flat at 1 — forever, no matter how many callers arrive. That flat line is the entire argument for this refactoring.

Figure 6: Anemic order — every caller carries the rules; the class carries nothing

💻 A real-life code example

The society finally digitizes its visitor register. The first version copies the notebook faithfully — including its lawlessness.

// Smelly version: the digital notebook with no rules
class VisitorEntry {
  name = "";
  flatNumber = 0;
  inTimeMinutes = 0;   // minutes since midnight
  outTimeMinutes = 0;
}
 
class VisitorRegister {
  entries: VisitorEntry[] = [];
}
 
// gate screen, somewhere:
const e = new VisitorEntry();
e.name = prompt("Name?") ?? "";
e.flatNumber = Number(prompt("Flat?"));
e.inTimeMinutes = nowInMinutes();
register.entries.push(e);
 
// security report, in another file:
function visitDuration(e: VisitorEntry): number {
  return e.outTimeMinutes - e.inTimeMinutes;  // negative if out < in!
}
 
// admin panel, in a third file:
function isStillInside(e: VisitorEntry): boolean {
  return e.outTimeMinutes === 0;   // "0 means not exited"... says who?
}
 
// and a prank, from anywhere at all:
register.entries.length = 0;        // entire register wiped, silently

Every notebook disaster is now possible in code:

  1. e.name = "" — the "guest"/blank-row problem. Nobody checks.
  2. e.flatNumber = 1403 — flats go 101 to 804, but the class accepts anything.
  3. visitDuration can go negative; isStillInside invents a secret rule ("0 means not exited") that lives only in one caller's head.
  4. register.entries is the real array, so anyone can wipe it — Rohan editing exit times, now with one line of code.
  5. Each caller carries its own private understanding of the rules. They already disagree.

🧹 Cleaning it up, step by step

We hire a guardian. Step by step, the notebook becomes a bank register.

Step 1: Encapsulate Field. Make fields private and force all writing through methods that check the rules. Construction itself should refuse garbage.

Step 2: Move Method. visitDuration and isStillInside use only VisitorEntry's data — classic Feature Envy. Move them home, onto the class.

Step 3: Encapsulate Collection. The register exposes a read-only view and a checkIn method for adding entries; checkOut lives on the entry itself. Outsiders can no longer reach the raw array through the public API.

// Clean version: the register now has a guardian
class VisitorEntry {
  private outTime: number | null = null;
 
  constructor(
    private readonly name: string,
    private readonly flatNumber: number,
    private readonly inTime: number,
  ) {
    if (name.trim().length < 2) throw new Error("Real name required");
    if (!isValidFlat(flatNumber)) throw new Error(`No such flat: ${flatNumber}`);
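    // isInteger also rejects NaN, which would slip past the range comparisons
    if (!Number.isInteger(inTime) || inTime < 0 || inTime >= 1440) throw new Error("Invalid time");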
  }
 
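  checkOut(outTime: number): void {
    if (this.outTime !== null) throw new Error("Already checked out");
    // same clock rule as the constructor: no 25:70 exit times either
    if (!Number.isInteger(outTime) || outTime >= 1440) throw new Error("Invalid time");
    if (outTime < this.inTime) throw new Error("Exit before entry? No.");
    this.outTime = outTime;
  }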
 
  isStillInside(): boolean {
    return this.outTime === null;          // the rule, stated once, clearly
  }
 
  visitDurationMinutes(): number | null {
    return this.outTime === null ? null : this.outTime - this.inTime;
  }
}
 
class VisitorRegister {
  private readonly entries: VisitorEntry[] = [];
 
  get allEntries(): ReadonlyArray<VisitorEntry> {
    return this.entries;                   // read-only by type; still the same array at runtime
  }
 
  checkIn(name: string, flatNumber: number, inTime: number): VisitorEntry {
    const entry = new VisitorEntry(name, flatNumber, inTime);
    this.entries.push(entry);
    return entry;
  }
}

Look what changed:

  • A VisitorEntry with a blank name or flat 1403 cannot exist. The constructor is the clerk at the bank counter.
  • "Still inside" has exactly one definition — outTime === null — written once, inside the class, instead of a secret 0 convention in some caller.
  • A negative duration is impossible; exit-before-entry is rejected at the door.
  • register.entries.length = 0 no longer compiles. The prankster is out of business.
  • Callers now follow Tell, Don't Ask: they call entry.visitDurationMinutes() and let the entry do the arithmetic, instead of pulling fields out and computing the duration themselves.
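
The clean version calls isValidFlat without defining it. One possible sketch follows. It assumes 8 floors (from the story) and 4 flats per floor, which is an inference from the "101 to 804" range and not something the society has confirmed.

```typescript
// Assumptions: 8 floors (from the story), 4 flats per floor (inferred from "101 to 804").
const FLOORS = 8;
const FLATS_PER_FLOOR = 4;

function isValidFlat(flatNumber: number): boolean {
  if (!Number.isInteger(flatNumber)) return false; // rejects NaN, 101.5, Infinity
  const floor = Math.floor(flatNumber / 100);      // 804 -> 8
  const unit = flatNumber % 100;                   // 804 -> 4
  return floor >= 1 && floor <= FLOORS && unit >= 1 && unit <= FLATS_PER_FLOOR;
}

console.log(isValidFlat(804));  // true
console.log(isValidFlat(1403)); // false
```

If the building's numbering ever changes, this function is the only place that needs editing, because no caller carries its own copy of the rule.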

The structure after the operation:

Figure 7: After the refactor — data and its rules finally live in the same class

There is also a nice way to see the entry itself as a tiny machine. The rich class makes illegal jumps impossible:

Figure 8: A visitor entry as a state machine — the rich class allows only legal moves

In the anemic version, every one of those "rejected" arrows was an open door.
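
The same idea can be written down directly. The sketch below assumes only two states, "inside" and "exited" (Figure 8 may show more), and uses a hypothetical Visit class whose private state is a discriminated union.

```typescript
// Hypothetical sketch: a visit has exactly two legal states.
type VisitState =
  | { kind: "inside" }
  | { kind: "exited"; outTime: number };

class Visit {
  private state: VisitState = { kind: "inside" };

  constructor(private readonly inTime: number) {}

  // The only legal transition is inside -> exited; anything else is rejected.
  checkOut(outTime: number): void {
    if (this.state.kind !== "inside") throw new Error("Already checked out");
    if (outTime < this.inTime) throw new Error("Exit before entry");
    this.state = { kind: "exited", outTime };
  }

  currentState(): "inside" | "exited" {
    return this.state.kind;
  }

  durationMinutes(): number | null {
    // outTime is only readable after the state has been narrowed to "exited".
    return this.state.kind === "exited" ? this.state.outTime - this.inTime : null;
  }
}

const visit = new Visit(600);          // arrived at 10:00
console.log(visit.currentState());     // "inside"
visit.checkOut(645);                   // left at 10:45
console.log(visit.durationMinutes());  // 45
```

There is no method that moves a visit from "exited" back to "inside", so that transition does not need a check anywhere: it simply does not exist.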

🟦 The same smell in C#

The classic anemic order, exactly as it appears in a thousand real codebases:

// Before: anemic data holder; callers do its thinking
public class Order
{
    public List<OrderLine> Lines { get; set; }
    public decimal DiscountRate { get; set; }
}
 
// far away, in some service:
decimal total = 0;
foreach (var line in order.Lines)
    total += line.UnitPrice * line.Quantity;
total -= total * order.DiscountRate;   // duplicated wherever a total is needed

Move the behavior home and lock the doors:

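// After: the class owns the rules about its own data
// (usings listed for completeness; projects with implicit usings can omit them)
using System;
using System.Collections.Generic;
using System.Linq;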
public class Order
{
    private readonly List<OrderLine> _lines = new();
 
    public IReadOnlyList<OrderLine> Lines => _lines;     // no outside mutation
    public decimal DiscountRate { get; private set; }
 
    public void AddLine(OrderLine line) => _lines.Add(line);
 
    public void ApplyDiscount(decimal rate)
    {
        if (rate is < 0 or > 1)
            throw new ArgumentOutOfRangeException(nameof(rate));
        DiscountRate = rate;
    }
 
    public decimal Total()
    {
        var subtotal = _lines.Sum(l => l.UnitPrice * l.Quantity);
        return subtotal - subtotal * DiscountRate;
    }
}
 
// every caller, everywhere:
decimal total = order.Total();

The total rule exists once. An illegal discount cannot be set. Nobody can clear the lines behind the order's back. This journey — from anemic to rich — is the heart of domain-driven design.

And in Python, where @dataclass makes plain data easy — which is wonderful at boundaries and risky in the domain:

# Fine as a boundary DTO: plain by design
from dataclasses import dataclass
 
@dataclass(frozen=True)
class VisitorSummaryDto:
    name: str
    flat_number: int
 
# Rich in the domain: the guardian pattern
class VisitorEntry:
    def __init__(self, name: str, flat_number: int, in_time: int):
        if len(name.strip()) < 2:
            raise ValueError("Real name required")
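        if not is_valid_flat(flat_number):  # helper assumed to be defined elsewhere
            raise ValueError(f"No such flat: {flat_number}")
        if not 0 <= in_time < 1440:  # same clock rule as the TypeScript version
            raise ValueError("Invalid time")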
        self._name = name
        self._flat = flat_number
        self._in_time = in_time
        self._out_time: int | None = None
 
    def check_out(self, out_time: int) -> None:
        if self._out_time is not None:
            raise ValueError("Already checked out")
        if out_time < self._in_time:
            raise ValueError("Exit before entry? No.")
        self._out_time = out_time

College corner: Notice the architectural pattern hiding in that Python snippet: plain at the boundary, smart at the core. In hexagonal/clean architecture terms, DTOs live in the adapter layer (they mirror JSON, database rows, message formats), while invariant-guarding entities live in the domain layer. CQRS pushes this further: write-side models are rich (they must guard invariants during changes), while read-side projections are deliberately anemic (they only ever display). So "is a data class a smell?" has a precise architectural answer: it depends which layer you are standing in. The same shape that is correct in an adapter is a disease in the domain.
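
Here is the same layering sketched in TypeScript, the language of the main example. All names are hypothetical: Visitor stands in for a domain entity, VisitorSummaryDto for a boundary shape, and toSummaryDto for a mapping function that would live in the adapter layer.

```typescript
// Boundary: a plain, behavior-free shape that mirrors an outgoing JSON payload.
interface VisitorSummaryDto {
  readonly visitorName: string;
  readonly flatNumber: number;
  readonly stillInside: boolean;
}

// Core: the domain object guards its own rules and answers questions about itself.
class Visitor {
  private outTime: number | null = null;

  constructor(
    private readonly visitorName: string,
    private readonly flatNumber: number,
    private readonly inTime: number,
  ) {
    if (visitorName.trim().length < 2) throw new Error("Real name required");
  }

  checkOut(outTime: number): void {
    if (this.outTime !== null) throw new Error("Already checked out");
    if (outTime < this.inTime) throw new Error("Exit before entry");
    this.outTime = outTime;
  }

  displayName(): string { return this.visitorName.trim(); }
  flat(): number { return this.flatNumber; }
  isStillInside(): boolean { return this.outTime === null; }
}

// Adapter: the one place that translates the rich object into the plain shape.
function toSummaryDto(visitor: Visitor): VisitorSummaryDto {
  return {
    visitorName: visitor.displayName(),
    flatNumber: visitor.flat(),
    stillInside: visitor.isStillInside(),
  };
}

const asha = new Visitor("Asha Patil", 302, 600);
const summaryOnArrival = toSummaryDto(asha);
asha.checkOut(645);
const summaryOnExit = toSummaryDto(asha);
console.log(JSON.stringify(summaryOnArrival), JSON.stringify(summaryOnExit));
```

The DTO is a snapshot: summaryOnArrival keeps saying stillInside is true after the checkout, which is exactly what a photograph of data should do.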

🏢 Where this smell hides in real projects

  • Layered "enterprise" architectures taken too far. The culture of "entities are just data; all logic goes in the service layer" mass-produces anemic models. The service layer swells into thousand-line procedural scripts while entities stay bloodless.
  • ORM entities used as the domain model. Database mapping tools historically wanted public getters and setters on everything, training a generation of developers to hollow out their entities and never look back.
  • IDE-generated accessor reflex. Create fields, press the generate-getters-setters shortcut, done. The class is born anemic, and behavior gets written wherever the developer happens to be standing.
  • Exposed collections. public List<Student> Students { get; set; } — every consumer can add, remove, clear, or replace the whole list. Invariants like "a section has at most 40 students" become unenforceable.
  • Validation living only in the UI. The form checks the rules, the domain object accepts anything. Then an import job, a message consumer, or a second UI writes directly — and garbage enters through the side door, exactly like the import job in Figure 4.
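
The "at most 40 students" rule from the exposed-collections item becomes enforceable once the list is hidden behind a method. A minimal sketch, with a hypothetical Section class that stores student names as plain strings to stay short:

```typescript
// Hypothetical sketch of the "a section has at most 40 students" invariant.
class Section {
  static readonly MAX_STUDENTS = 40;
  private readonly students: string[] = [];

  // The only way to add a student, so the limit is checked on every path:
  // UI, import job, or message consumer.
  enroll(studentName: string): void {
    if (this.students.length >= Section.MAX_STUDENTS) {
      throw new Error(`Section is full (${Section.MAX_STUDENTS} students)`);
    }
    this.students.push(studentName);
  }

  // Copy out, so no caller ever holds the internal array.
  roster(): readonly string[] {
    return [...this.students];
  }
}

const section = new Section();
for (let i = 1; i <= Section.MAX_STUDENTS; i++) {
  section.enroll(`Student ${i}`);
}
console.log(section.roster().length); // 40
```

Whichever door the data comes through, it meets the same guard, because enroll is the only door.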

When teams audit where their anemic classes came from, the blame usually splits like this:

Figure 9: Why domain classes end up anemic in real teams

⚖️ When it is okay to ignore

This is the most important honesty section in this post. Plain data classes are sometimes exactly the right design. Calling every one of them a smell is a beginner's mistake.

| Kind of class | Smell? | Why it is fine (or not) |
|---|---|---|
| DTO crossing a boundary (API payload, queue message) | No | Its whole job is to be a transparent shape that maps to JSON or a wire format |
| C# record / Java record / Python @dataclass | No | Language-supported value bundles (Python needs frozen=True to be immutable); adding ceremony fights the language |
| Read model / view model / CQRS projection | No | Deliberately flat, query-shaped data for display or reporting |
| Configuration objects | No | Settings are data by nature |
| Functional-programming style records + pure functions | No | In FP, immutable data plus separate functions is the intended design |
| Domain entity with rules, hollowed into getters/setters | Yes | It should guard invariants and own calculations, but cannot |
| "Domain" object whose rules are copied across callers | Yes | The duplication and drift prove the behavior belongs inside |

How to tell a healthy DTO from a sick domain object? Ask: does this data have rules and invariants that must always hold?

  • An OrderResponseDto going out as JSON has no rules to defend — it is a photograph of data, frozen and outbound. Plain is perfect.
  • The Order inside your domain has rules — "discount between 0 and 1", "total is computed this way", "lines cannot be mutated from outside". If it cannot defend them, it is anemic.

⚠️ Do not "fix" your DTOs by stuffing business logic into them. A DTO with behavior is its own mess: now your wire format and your business rules change together. The correct shape of many systems is: rich domain objects in the middle, thin DTOs at the edges, and mapping between them. Plain at the boundary, smart at the core.

🛠️ Which refactorings cure it

| Symptom | Curing refactoring | Result |
|---|---|---|
| Behavior on this data lives in other classes | Move Method | Logic relocates to the class that owns the data |
| Naked public fields | Encapsulate Field | Writes pass through guarded methods |
| Raw collections handed out | Encapsulate Collection | Read-only view plus add/remove methods |
| Callers repeat the same getter-then-calculate dance | Extract Method + Move Method | The dance becomes one method on the class |
| Setters that should never exist | Remove Setting Method | Immutable after construction |

A practical hunting tactic from Fowler's catalog: look at the callers of each getter. If several callers take the value and perform the same calculation on it, that calculation is begging to move into the class as a method. Follow the getters; they lead you to the missing behavior.
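
A sketch of what following the getters tends to uncover. The invoice classes and function names below are hypothetical; the point is two callers performing the same calculation on the same fields, and the single method that replaces them.

```typescript
// Before: two callers repeat the same "read fields, then calculate" dance.
class AnemicInvoice {
  constructor(public amount: number, public taxRate: number) {}
}

function printedTotal(inv: AnemicInvoice): number {
  return inv.amount + inv.amount * inv.taxRate; // copy 1
}

function emailedTotal(inv: AnemicInvoice): number {
  return inv.amount * (1 + inv.taxRate);        // copy 2: same rule, written differently
}

// After Extract Method + Move Method: the calculation has exactly one home.
class Invoice {
  constructor(
    private readonly amount: number,
    private readonly taxRate: number,
  ) {}

  totalWithTax(): number {
    return this.amount + this.amount * this.taxRate;
  }
}

const anemic = new AnemicInvoice(200, 0.25);
console.log(printedTotal(anemic), emailedTotal(anemic)); // 250 250
console.log(new Invoice(200, 0.25).totalWithTax());      // 250
```

The two copies agree today, but nothing keeps them in agreement. When the tax rule changes, only the Invoice version guarantees that every caller sees the new rule at once.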

Figure 10: The cure path — from notebook to bank register

📦 Quick revision box

+--------------------------------------------------------------+
|  DATA CLASS — QUICK REVISION                                 |
+--------------------------------------------------------------+
|  Story   : A visitor register with no rules — anyone         |
|            scribbles anything, so the data can't be trusted. |
|  Smell   : A DOMAIN class with fields + getters/setters      |
|            but no behavior; others do its thinking.          |
|  Why bad : Cannot guard its invariants; rules duplicated     |
|            across callers; internals leak; data goes bad.    |
|  Principle: Tell, Don't Ask — ask order.total(), don't       |
|            pull fields and compute outside.                  |
|  NOT smell: DTOs, records, view models, config objects —     |
|            plain-by-design data at boundaries is GOOD.       |
|  Cures   : Move Method, Encapsulate Field,                   |
|            Encapsulate Collection, Remove Setting Method.    |
|  Motto   : Plain at the boundary, smart at the core.         |
+--------------------------------------------------------------+

✏️ Practice exercise

A library management program has an anemic heart. Operate on it.

class LibraryBook {
  title = "";
  timesIssued = 0;
  isIssued = false;
  dueDateDay = 0;        // day of month; 0 means "no due date"
}
 
class Library {
  books: LibraryBook[] = [];
}
 
// in the issue-desk screen:
function issueBook(b: LibraryBook, today: number): void {
  b.isIssued = true;
  b.timesIssued = b.timesIssued + 1;
  b.dueDateDay = today + 14;          // can become 36 if today is 22!
}
 
// in the fine-counter screen:
function fineFor(b: LibraryBook, today: number): number {
  return (today - b.dueDateDay) * 2;  // negative fine if returned early!
}
 
// in the reports screen:
function isOverdue(b: LibraryBook, today: number): boolean {
  return b.isIssued && today > b.dueDateDay && b.dueDateDay !== 0;
}

Your tasks:

  1. List every rule about a book's data that currently lives outside LibraryBook. (Hint: there are at least four, including the secret "0 means no due date" convention.)
  2. Find two real bugs already present in the callers. (Look at the due-date arithmetic and the fine calculation.)
  3. Refactor: move issueBook (renamed to issue), fineFor, and isOverdue into LibraryBook using Move Method. Make the fields private. Fix both bugs while moving, because the class is now responsible for its own correctness.
  4. Replace the secret 0 convention with something honest (a dueDateDay: number | null). Note how the class can now hide this detail completely from callers.
  5. Protect Library.books with Encapsulate Collection: a read-only view plus an addBook method.
  6. Draw the state machine of a book (like Figure 8): available → issued → returned. Mark which illegal jumps your new class now rejects.
  7. Finally, design a BookSummaryDto with title and timesIssued for the library's public website API — completely behavior-free. Write one sentence explaining why this plain class is not the Data Class smell.

When your LibraryBook can no longer hold impossible data — and your DTO is proudly, correctly plain — you have mastered both halves of this lesson, and Mr. Kulkarni's bicycle thief would have been caught.

Frequently asked questions

What is a Data Class smell in simple words?
It is a class that only holds fields with getters and setters but has no behavior of its own. All the thinking about its data — validation, calculation, rules — is done by other classes far away. The data and its rules live apart, even though they always change together.
Are DTOs and records also Data Class smells?
No. DTOs carry data across a boundary like an API or a message queue, and their whole job is to be a plain, transparent shape. Records and dataclasses are language-supported ways to model value bundles, ideally immutable ones (a Python dataclass needs frozen=True for that). The smell is only a DOMAIN object that should own its rules but has been hollowed out into a bag of getters and setters.
What is an anemic domain model?
It is Martin Fowler's name for a design where domain objects look real but contain no behavior — just data — while all logic sits in procedural service classes. It looks object-oriented from far away, but it loses the main benefit of objects: keeping data and the operations on that data together.
What does Tell, Don't Ask mean?
Instead of asking an object for its data and doing the calculation yourself, tell the object what you want and let it do its own thinking. Ask order.total() instead of pulling out lines and discount and computing the total in five different callers.
Which refactorings cure a Data Class?
Move Method brings the behavior that operates on the data into the class that owns the data. Encapsulate Field replaces naked public fields with controlled access. Encapsulate Collection stops outsiders from mutating internal lists by exposing read-only views with add and remove methods.
