Most Python developers use dataclasses as a fancy way to avoid writing __init__. That’s like using a Swiss Army knife to open letters. Dataclasses have deep features for validation, computed fields, immutability, memory optimization, and serialization that can replace entire utility libraries when used correctly.
⚡ TL;DR: dataclasses aren’t just __init__ generators. Use field() for defaults and metadata, __post_init__ for validation, frozen=True for immutability, slots=True for memory efficiency, and ClassVar for class-level data. Together they replace attrs, pydantic (for simple cases), and manual __init__ entirely.
Feature 1: field() — Far More Than a Default Value
from dataclasses import dataclass, field, fields

@dataclass
class APIRequest:
    url: str
    method: str = 'GET'
    # field() with default_factory: ALWAYS use for mutable defaults
    headers: dict = field(default_factory=dict)  # ✅ new dict per instance
    # headers: dict = {}  # ❌ rejected: dataclasses raise ValueError for list/dict/set defaults
    # field() with repr=False: hide sensitive data from print()
    api_key: str = field(default='', repr=False)
    # field() with compare=False: exclude from == comparison
    request_id: str = field(default='', compare=False)
    # field() with metadata: attach schema info, validation rules, etc.
    timeout: int = field(default=30, metadata={'min': 1, 'max': 300, 'unit': 'seconds'})
    # field() with init=False: not a parameter of __init__
    _session_id: str = field(default='', init=False, repr=False)

# Access metadata programmatically:
for f in fields(APIRequest):
    if f.metadata.get('unit'):
        print(f"{f.name}: {dict(f.metadata)}")  # timeout: {'min': 1, 'max': 300, 'unit': 'seconds'}
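The dataclasses module stores metadata but never interprets it, so any enforcement is up to your own code. As a minimal sketch, assuming a hypothetical RequestOptions class and the convention that 'min' and 'max' keys describe numeric bounds, a __post_init__ hook can read the metadata and reject out-of-range values:

```python
from dataclasses import dataclass, field, fields


@dataclass
class RequestOptions:  # hypothetical class, for illustration only
    timeout: int = field(default=30, metadata={'min': 1, 'max': 300})
    retries: int = field(default=3, metadata={'min': 0, 'max': 10})
    label: str = ''  # no metadata, so the check below skips it

    def __post_init__(self):
        # Enforce any 'min'/'max' bounds declared in field metadata.
        # The key names are our own convention, not part of dataclasses.
        for f in fields(self):
            value = getattr(self, f.name)
            lo = f.metadata.get('min')
            hi = f.metadata.get('max')
            if lo is not None and value < lo:
                raise ValueError(f"{f.name}={value} is below the minimum {lo}")
            if hi is not None and value > hi:
                raise ValueError(f"{f.name}={value} is above the maximum {hi}")


ok = RequestOptions(timeout=60)

try:
    RequestOptions(timeout=0)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # timeout=0 is below the minimum 1
```

Keeping the bounds next to the field definition means the rule and the field cannot drift apart, at the cost of a small loop on every construction.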
Feature 2: __post_init__ — Validation and Computed Fields
from dataclasses import dataclass, field
from datetime import datetime, timezone
import re

@dataclass
class User:
    name: str
    email: str
    age: int
    # Computed fields: set in __post_init__, not passed to __init__
    username: str = field(init=False)
    created_at: datetime = field(init=False)

    def __post_init__(self):
        # Validation
        if self.age < 0 or self.age > 150:
            raise ValueError(f"Invalid age: {self.age}")
        # Note the escaped dot: an unescaped '.' would match any character
        if not re.match(r'^[^@]+@[^@]+\.[^@]+$', self.email):
            raise ValueError(f"Invalid email: {self.email}")
        # Computed fields
        self.username = self.email.split('@')[0].lower()
        # datetime.utcnow() is deprecated since Python 3.12; use an aware timestamp
        self.created_at = datetime.now(timezone.utc)
        # Transform inputs
        self.name = self.name.strip().title()
        self.email = self.email.lower()

u = User(name=" alice smith ", email="Alice@Example.COM", age=30)
print(u.name)        # Alice Smith (transformed)
print(u.email)       # alice@example.com (normalized)
print(u.username)    # alice (computed)
print(u.created_at)  # e.g. 2026-04-05 ... (auto-set)
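It helps to see what happens when validation fails, and where it stops applying. The sketch below uses a hypothetical Account class and a try_create helper (both invented for illustration) with a deliberately simplified email check:

```python
from dataclasses import dataclass


@dataclass
class Account:  # hypothetical class, for illustration only
    email: str
    age: int

    def __post_init__(self):
        # Runs at the end of the generated __init__, after fields are assigned.
        if '@' not in self.email:  # simplified check, not real email validation
            raise ValueError(f"Invalid email: {self.email}")
        if not 0 <= self.age <= 150:
            raise ValueError(f"Invalid age: {self.age}")


def try_create(email, age):
    """Return an Account, or the error message if validation rejects it."""
    try:
        return Account(email=email, age=age)
    except ValueError as exc:
        return str(exc)


print(try_create('alice@example.com', 30))  # Account(email='alice@example.com', age=30)
print(try_create('not-an-email', 30))       # Invalid email: not-an-email
print(try_create('bob@example.com', -1))    # Invalid age: -1

# Caveat: __post_init__ only runs at construction time, so a later
# assignment is NOT re-validated on a regular (non-frozen) dataclass.
acct = Account('carol@example.com', 40)
acct.age = -5  # no error raised here
```

If invariants must hold for the lifetime of the object, frozen=True (covered next) is one way to stop later assignments from bypassing the checks.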
Feature 3: frozen=True — Hashable Immutable Objects
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

    # You CAN add methods to frozen dataclasses
    def distance_to(self, other: 'Point') -> float:
        return ((self.x - other.x)**2 + (self.y - other.y)**2) ** 0.5

p1 = Point(1.0, 2.0)
p2 = Point(3.0, 4.0)

# Immutable: assignment raises FrozenInstanceError
try:
    p1.x = 5.0  # ❌
except FrozenInstanceError as e:
    print(e)  # cannot assign to field 'x'

# Hashable: can be used as dict keys or in sets!
cache = {p1: 'computed_value', p2: 'other_value'}
point_set = {p1, p2}  # Works because frozen=True (with the default eq=True) generates __hash__

# Equal objects have equal hashes
p3 = Point(1.0, 2.0)
print(p1 == p3)              # True
print(hash(p1) == hash(p3))  # True, so safe for dict keys
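One consequence of frozen=True is that plain assignment also fails inside __post_init__, which complicates computed fields. A common workaround, sketched here with a hypothetical Segment class, is to call object.__setattr__ once during initialization and to use dataclasses.replace to derive modified copies:

```python
from dataclasses import dataclass, field, replace, FrozenInstanceError


@dataclass(frozen=True)
class Segment:  # hypothetical class, for illustration only
    x1: float
    y1: float
    x2: float
    y2: float
    length: float = field(init=False)

    def __post_init__(self):
        # self.length = ... would raise FrozenInstanceError here, so
        # bypass the frozen __setattr__ for this one-time initialization.
        length = ((self.x2 - self.x1) ** 2 + (self.y2 - self.y1) ** 2) ** 0.5
        object.__setattr__(self, 'length', length)


s = Segment(0.0, 0.0, 3.0, 4.0)
print(s.length)  # 5.0

# "Modify" by building a new instance; __post_init__ runs again.
longer = replace(s, x2=6.0, y2=8.0)
print(longer.length)  # 10.0

try:
    s.x1 = 1.0
except FrozenInstanceError:
    print("still immutable from the outside")
```

Note that immutability is shallow: a frozen dataclass holding a list still lets callers mutate that list, and such an instance is also not hashable.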
Feature 4: slots=True — 40-60% Memory Reduction
import sys
from dataclasses import dataclass

@dataclass
class PointNormal:
    x: float
    y: float
    z: float

@dataclass(slots=True)  # Python 3.10+
class PointSlotted:
    x: float
    y: float
    z: float

p1 = PointNormal(1.0, 2.0, 3.0)
p2 = PointSlotted(1.0, 2.0, 3.0)

# The byte counts below are illustrative; they vary by CPython version and platform.
print(sys.getsizeof(p1))           # e.g. 48 bytes (instance only, excludes its __dict__)
print(sys.getsizeof(p1.__dict__))  # e.g. 232 bytes (hidden dict overhead)
print(sys.getsizeof(p2))           # e.g. 64 bytes, and there is no __dict__!

# 1 million instances, using the figures above:
# Normal:  ~280MB
# Slotted: ~64MB  ← about 77% less on these figures; measure your own workload

# Combine with frozen for hashable, memory-efficient, immutable objects:
@dataclass(frozen=True, slots=True)
class Vector3D:
    x: float
    y: float
    z: float
Feature 5: ClassVar — Class-Level Data in Dataclasses
from dataclasses import dataclass
from typing import ClassVar

@dataclass
class DatabaseModel:
    # ClassVar fields are NOT included in __init__, __repr__, or __eq__
    # They're class-level, shared across all instances
    table_name: ClassVar[str] = 'base_table'
    connection_pool: ClassVar[object] = None
    instance_count: ClassVar[int] = 0
    # These ARE instance fields (included in __init__)
    id: int
    name: str

    def __post_init__(self):
        DatabaseModel.instance_count += 1

    @classmethod
    def set_pool(cls, pool):
        cls.connection_pool = pool

@dataclass
class User(DatabaseModel):
    table_name: ClassVar[str] = 'users'  # Override class variable
    email: str = ''

u1 = User(id=1, name='Alice', email='alice@example.com')
u2 = User(id=2, name='Bob', email='bob@example.com')
print(User.instance_count)  # 2 (shared across instances)
print(User.table_name)      # 'users' (overridden in subclass)
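A quick way to confirm that ClassVar attributes are not treated as fields is to inspect dataclasses.fields() and the generated repr. The Model and Customer classes below are hypothetical stand-ins written only for this check:

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Model:  # hypothetical class, for illustration only
    table_name: ClassVar[str] = 'models'
    instance_count: ClassVar[int] = 0
    id: int
    name: str

    def __post_init__(self):
        Model.instance_count += 1  # one counter stored on the base class


@dataclass
class Customer(Model):
    table_name: ClassVar[str] = 'customers'
    email: str = ''


a = Model(1, 'base')
b = Customer(2, 'Alice', 'alice@example.com')

# Only real instance fields are reported; ClassVar names are absent.
print([f.name for f in fields(Customer)])  # ['id', 'name', 'email']
print(b)  # Customer(id=2, name='Alice', email='alice@example.com')

# The counter lives on Model, so every subclass sees the same total.
print(Model.instance_count, Customer.instance_count)  # 2 2
print(Model.table_name, Customer.table_name)  # models customers
```

Because the counter is written through the base class name, it counts instances of all subclasses together; a per-subclass counter would need a different design.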
Feature 6: dataclass inheritance — with Gotchas
from dataclasses import dataclass, KW_ONLY  # KW_ONLY needs Python 3.10+

@dataclass
class Base:
    id: int
    name: str = 'default'  # field with default

# ❌ This fails at class-definition time: non-default field after default field
try:
    @dataclass
    class Child(Base):
        email: str  # No default
except TypeError as e:
    print(e)  # e.g. non-default argument 'email' follows default argument

# ✅ Fix 1: Give email a default
@dataclass
class Child(Base):
    email: str = ''

# ✅ Fix 2: Use KW_ONLY (Python 3.10+) to make trailing fields keyword-only
@dataclass
class Child(Base):
    _: KW_ONLY
    email: str  # Now keyword-only, so no ordering constraint

c = Child(id=1, email='alice@example.com')  # name='default', email required as kwarg
Advanced Dataclass Cheat Sheet
- ✅ field(default_factory=list): always use for mutable defaults
- ✅ field(repr=False): hide sensitive fields from repr/logs
- ✅ __post_init__: validation, transformation, computed fields
- ✅ frozen=True: immutable + hashable + dict key safe
- ✅ slots=True (3.10+): 40-60% memory reduction for many instances
- ✅ ClassVar[T]: class-level data excluded from instance __init__
- ✅ KW_ONLY (3.10+): enforce keyword-only fields in inheritance
- ❌ Never use mutable default values directly (field: list = []); dataclasses reject list, dict, and set defaults with a ValueError
For maximum memory efficiency, combine slots=True with the techniques from the Python __slots__ deep dive; a dataclass declared with slots=True has its __slots__ generated for you automatically. For concurrent use of dataclasses with shared state, the Python GIL guide explains why frozen dataclasses are safer to share between threads (immutability is shallow, so this holds only when the field values are themselves immutable). Official reference: Python dataclasses documentation.
Recommended resources
- Fluent Python (2nd Edition) — The chapter on data class builders covers dataclasses, NamedTuple, and attrs side by side. Ramalho explains exactly when to use each, with benchmarks that inform the slots=True decision.
- Python Tricks: A Buffet of Awesome Python Features — Covers __dunder__ methods, descriptors, and class decorators — the machinery that makes dataclasses work internally.
Disclosure: This post contains affiliate links. If you purchase through these links, CheatCoders earns a small commission at no extra cost to you. We only recommend tools and books we genuinely find valuable.