Most Python developers use dataclasses as a fancy way to avoid writing __init__. That’s like using a Swiss Army knife to open letters. Dataclasses have deep features for validation, computed fields, immutability, memory optimization, and serialization that can replace entire utility libraries when used correctly.
⚡ TL;DR: dataclasses aren’t just __init__ generators. Use field() for defaults and metadata, __post_init__ for validation, frozen=True for immutability, slots=True for memory efficiency, and ClassVar for class-level data. Together they can replace attrs, pydantic (for simple cases), and hand-written __init__ methods.
Feature 1: field() — Far More Than a Default Value
from dataclasses import dataclass, field
from typing import ClassVar

@dataclass
class APIRequest:
    url: str
    method: str = 'GET'
    # field() with a factory: ALWAYS use for mutable defaults
    headers: dict = field(default_factory=dict)  # ✅ new dict per instance
    # headers: dict = {}  # ❌ ValueError at class definition: dataclasses rejects mutable defaults
    # field() with repr=False: hide sensitive data from print()
    api_key: str = field(default='', repr=False)
    # field() with compare=False: exclude from == comparison
    request_id: str = field(default='', compare=False)
    # field() with metadata: attach schema info, validation rules, etc.
    timeout: int = field(default=30, metadata={'min': 1, 'max': 300, 'unit': 'seconds'})
    # field() with init=False: set internally, not passed to __init__
    _session_id: str = field(default='', init=False, repr=False)

# Access metadata programmatically:
from dataclasses import fields

for f in fields(APIRequest):
    if f.metadata.get('unit'):
        print(f"{f.name}: {f.metadata}")  # timeout: {'min': 1, 'max': 300, 'unit': 'seconds'}
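The metadata mapping is never interpreted by dataclasses itself, so it only becomes useful once your own code reads it. Below is a minimal sketch of a range checker driven by 'min' and 'max' keys. The RetryPolicy class and the check_ranges helper are hypothetical names used for illustration, not part of the standard library.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RetryPolicy:  # hypothetical class for illustration
    # 'min' and 'max' are our own metadata keys; dataclasses ignores them.
    timeout: int = field(default=30, metadata={'min': 1, 'max': 300})
    retries: int = field(default=3, metadata={'min': 0, 'max': 10})
    label: str = ''  # no metadata, so no range check applies


def check_ranges(obj) -> list[str]:
    """Return human-readable range violations (an empty list if none)."""
    problems = []
    for f in fields(obj):
        value = getattr(obj, f.name)
        low = f.metadata.get('min')
        high = f.metadata.get('max')
        if low is not None and value < low:
            problems.append(f"{f.name}={value} is below min {low}")
        if high is not None and value > high:
            problems.append(f"{f.name}={value} is above max {high}")
    return problems


print(check_ranges(RetryPolicy()))             # []
print(check_ranges(RetryPolicy(timeout=900)))  # ['timeout=900 is above max 300']
```

Calling a helper like this from __post_init__ would turn the metadata into enforced rules; keeping it separate, as here, lets you collect every violation instead of stopping at the first.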
Feature 2: __post_init__ — Validation and Computed Fields
from dataclasses import dataclass, field
from datetime import datetime, timezone
import re

@dataclass
class User:
    name: str
    email: str
    age: int
    # Computed fields: set in __post_init__, not passed to __init__
    username: str = field(init=False)
    created_at: datetime = field(init=False)

    def __post_init__(self):
        # Validation
        if self.age < 0 or self.age > 150:
            raise ValueError(f"Invalid age: {self.age}")
        # The dot is escaped; a bare '.' would match any character
        if not re.match(r'^[^@]+@[^@]+\.[^@]+$', self.email):
            raise ValueError(f"Invalid email: {self.email}")
        # Computed fields
        self.username = self.email.split('@')[0].lower()
        # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
        self.created_at = datetime.now(timezone.utc)
        # Transform inputs
        self.name = self.name.strip().title()
        self.email = self.email.lower()

u = User(name=" alice smith ", email="Alice@Example.COM", age=30)
print(u.name)        # Alice Smith (transformed)
print(u.email)       # alice@example.com (normalized)
print(u.username)    # alice (computed)
print(u.created_at)  # 2026-04-05 ... (auto-set, timezone-aware UTC)
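It is worth seeing the rejection path too, along with one limit of this approach. The sketch below uses a hypothetical Reading class and try_build helper (both invented for illustration) to show a ValueError surfacing from the constructor, and that later attribute assignments are not re-validated.

```python
from dataclasses import dataclass


@dataclass
class Reading:  # hypothetical class for illustration
    sensor: str
    celsius: float

    def __post_init__(self):
        if self.celsius < -273.15:
            raise ValueError(f"Below absolute zero: {self.celsius}")
        self.sensor = self.sensor.strip().lower()


def try_build(sensor: str, celsius: float):
    """Return a Reading, or None if validation rejects the input."""
    try:
        return Reading(sensor, celsius)
    except ValueError as exc:
        print(f"rejected: {exc}")
        return None


ok = try_build("  Probe-A ", 21.5)
bad = try_build("probe-b", -300.0)   # prints: rejected: Below absolute zero: -300.0
print(ok)   # Reading(sensor='probe-a', celsius=21.5)
print(bad)  # None

# Caveat: __post_init__ runs only during construction.
ok.celsius = -500.0  # no error; later assignments are not re-validated
```

If values must stay valid for the object's whole lifetime, combining __post_init__ with frozen=True is one way to close that gap.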
Feature 3: frozen=True — Hashable Immutable Objects
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

    # You CAN add methods to frozen dataclasses
    def distance_to(self, other: 'Point') -> float:
        return ((self.x - other.x)**2 + (self.y - other.y)**2) ** 0.5

p1 = Point(1.0, 2.0)
p2 = Point(3.0, 4.0)

# Immutable: assignment raises FrozenInstanceError
try:
    p1.x = 5.0  # ❌
except FrozenInstanceError as exc:
    print(exc)  # cannot assign to field 'x'

# Hashable: can be used as dict keys or in sets!
cache = {p1: 'computed_value', p2: 'other_value'}
point_set = {p1, p2}  # Works because eq=True (the default) plus frozen=True generates __hash__

# Equal objects have equal hashes
p3 = Point(1.0, 2.0)
print(p1 == p3)              # True
print(hash(p1) == hash(p3))  # True, so safe for dict keys
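Two questions usually follow once a class is frozen: how to set a computed field in __post_init__, and how to "change" a value. The sketch below shows one common approach to each, using object.__setattr__ and dataclasses.replace. The norm field is an assumption added for illustration and is not part of the Point class above.

```python
from dataclasses import dataclass, field, replace
import math


@dataclass(frozen=True)
class Point:
    x: float
    y: float
    # Computed in __post_init__; excluded from __init__.
    norm: float = field(init=False)

    def __post_init__(self):
        # A plain `self.norm = ...` would raise FrozenInstanceError here,
        # so bypass the frozen __setattr__ explicitly.
        object.__setattr__(self, 'norm', math.hypot(self.x, self.y))


p = Point(3.0, 4.0)
print(p.norm)  # 5.0

# replace() builds a new instance and reruns __init__ and __post_init__.
q = replace(p, x=0.0)
print(q)  # Point(x=0.0, y=4.0, norm=4.0)
print(p)  # Point(x=3.0, y=4.0, norm=5.0), the original is unchanged
```

Because replace() goes through __init__, any validation in __post_init__ is applied to the copy as well.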
Feature 4: slots=True — 40-60% Memory Reduction
import sys
from dataclasses import dataclass

@dataclass
class PointNormal:
    x: float
    y: float
    z: float

@dataclass(slots=True)  # Python 3.10+
class PointSlotted:
    x: float
    y: float
    z: float

p1 = PointNormal(1.0, 2.0, 3.0)
p2 = PointSlotted(1.0, 2.0, 3.0)
print(sys.getsizeof(p1))           # e.g. 48 bytes: the instance alone, excluding its __dict__
print(sys.getsizeof(p1.__dict__))  # e.g. 232 bytes of hidden dict overhead
print(sys.getsizeof(p2))           # e.g. 64 bytes, with no __dict__ on top

# 1 million instances (illustrative figures):
#   Normal:  ~280MB
#   Slotted: ~64MB
# Exact sizes, and therefore the saving, vary by Python version and platform,
# so measure on your own interpreter.

# Combine with frozen for hashable, memory-efficient, immutable objects:
@dataclass(frozen=True, slots=True)
class Vector3D:
    x: float
    y: float
    z: float
Feature 5: ClassVar — Class-Level Data in Dataclasses
from dataclasses import dataclass
from typing import ClassVar

@dataclass
class DatabaseModel:
    # ClassVar fields are NOT included in __init__, __repr__, or __eq__
    # They're class-level, shared across all instances
    table_name: ClassVar[str] = 'base_table'
    connection_pool: ClassVar[object] = None
    instance_count: ClassVar[int] = 0
    # These ARE instance fields (included in __init__)
    id: int
    name: str

    def __post_init__(self):
        DatabaseModel.instance_count += 1

    @classmethod
    def set_pool(cls, pool):
        cls.connection_pool = pool

@dataclass
class User(DatabaseModel):
    table_name: ClassVar[str] = 'users'  # Override class variable
    email: str = ''

u1 = User(id=1, name='Alice', email='alice@example.com')
u2 = User(id=2, name='Bob', email='bob@example.com')
print(User.instance_count)  # 2 (shared across instances)
print(User.table_name)      # 'users' (overridden in subclass)
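One way to confirm that ClassVar attributes are not treated as fields is to inspect fields(). The sketch below does that with hypothetical Model and Order classes, and also shows that a counter incremented on the base class is a single total shared by every subclass.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Model:  # hypothetical stand-in for DatabaseModel
    table_name: ClassVar[str] = 'base_table'
    instance_count: ClassVar[int] = 0
    id: int

    def __post_init__(self):
        # Increments the attribute on Model itself, so instances of
        # every subclass are counted in the same total.
        Model.instance_count += 1


@dataclass
class Order(Model):
    table_name: ClassVar[str] = 'orders'
    total: float = 0.0


Order(id=1)
Order(id=2, total=9.5)
Model(id=3)

print([f.name for f in fields(Order)])      # ['id', 'total']
print(Model.instance_count)                 # 3
print(Order.instance_count)                 # 3, read through inheritance
print(Order.table_name, Model.table_name)   # orders base_table
```

If you want a separate count per subclass, incrementing via type(self) instead of the base class name is one option, at the cost of each subclass shadowing the base counter.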
Feature 6: dataclass inheritance — with Gotchas
from dataclasses import dataclass

@dataclass
class Base:
    id: int
    name: str = 'default'  # field with default

# ❌ This fails at class-definition time: a field without a default follows one with a default
# (left commented out so the rest of the example runs)
# @dataclass
# class Child(Base):
#     email: str  # No default: raises TypeError (non-default argument follows default argument)

# ✅ Fix 1: Give email a default
@dataclass
class Child(Base):
    email: str = ''

# ✅ Fix 2: Use KW_ONLY (Python 3.10+) to make trailing fields keyword-only
from dataclasses import KW_ONLY

@dataclass
class Child(Base):
    _: KW_ONLY
    email: str  # Now keyword-only, so the ordering constraint no longer applies

c = Child(id=1, email='alice@example.com')  # name='default', email required as kwarg
Advanced Dataclass Cheat Sheet
- ✅ field(default_factory=list): always use for mutable defaults
- ✅ field(repr=False): hide sensitive fields from repr/logs
- ✅ __post_init__: validation, transformation, computed fields
- ✅ frozen=True: immutable + hashable + dict key safe
- ✅ slots=True (3.10+): 40-60% memory reduction for many instances
- ✅ ClassVar[T]: class-level data excluded from instance __init__
- ✅ KW_ONLY (3.10+): enforce keyword-only fields in inheritance
- ❌ Never use mutable default values directly (field: list = []); dataclasses raises ValueError for list, dict, and set defaults
For maximum memory efficiency, combine slots=True with the techniques from the Python __slots__ deep dive; slots=True simply tells the dataclass decorator to generate __slots__ for you. For concurrent use of dataclasses with shared state, see the Python GIL guide: frozen dataclasses are easier to share between threads because their fields cannot be reassigned, although mutable objects stored in those fields can still change. Official reference: Python dataclasses documentation.
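One caveat is worth illustrating when relying on frozen dataclasses for shared state: frozen=True prevents rebinding attributes, but it does not make the objects they point to immutable. A small sketch, using a hypothetical Config class:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Config:  # hypothetical class for illustration
    name: str
    tags: list = field(default_factory=list)  # mutable contents
    hosts: tuple = ()                         # immutable contents


cfg = Config('prod')
cfg.tags.append('blue')  # allowed: frozen blocks rebinding, not mutation
print(cfg.tags)          # ['blue']

try:
    hash(cfg)  # the generated __hash__ hashes every compared field
except TypeError as exc:
    print(f"unhashable: {exc}")  # a list field cannot be hashed
```

Preferring tuples and frozensets for the fields of a frozen dataclass keeps instances both hashable and safe to share.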
Recommended resources
- Fluent Python (2nd Edition) — The chapter on data class builders covers dataclasses, NamedTuple, and attrs side by side. Ramalho explains exactly when to use each, with benchmarks that inform the slots=True decision.
- Python Tricks: A Buffet of Awesome Python Features — Covers __dunder__ methods, descriptors, and class decorators — the machinery that makes dataclasses work internally.
Disclosure: This post contains affiliate links. If you purchase through these links, CheatCoders earns a small commission at no extra cost to you. We only recommend tools and books we genuinely find valuable.