Python Dataclasses: 10 Advanced Features That Make __init__ Obsolete

Most Python developers use dataclasses as a fancy way to avoid writing __init__. That’s like using a Swiss Army knife to open letters. Dataclasses have deep features for validation, computed fields, immutability, memory optimization, and serialization that can replace entire utility libraries when used correctly.

TL;DR: dataclasses aren’t just __init__ generators. Use field() for defaults and metadata, __post_init__ for validation, frozen=True for immutability, slots=True for memory efficiency, and ClassVar for class-level data. Together they can stand in for attrs or pydantic in simple cases and remove the need for a hand-written __init__.

Feature 1: field() — Far More Than a Default Value

from dataclasses import dataclass, field
from typing import ClassVar

@dataclass
class APIRequest:
    url: str
    method: str = 'GET'
    
    # field() with factory — ALWAYS use for mutable defaults
    headers: dict = field(default_factory=dict)  # ✅ new dict per instance
    # headers: dict = {}  # ❌ ValueError at class definition: mutable defaults (list, dict, set) are rejected
    
    # field() with repr=False — hide sensitive data from print()
    api_key: str = field(default='', repr=False)
    
    # field() with compare=False — exclude from == comparison
    request_id: str = field(default='', compare=False)
    
    # field() with metadata — attach schema info, validation rules, etc.
    timeout: int = field(default=30, metadata={'min': 1, 'max': 300, 'unit': 'seconds'})
    
    # field() with init=False — computed, not passed to __init__
    _session_id: str = field(default='', init=False, repr=False)

# Access metadata programmatically:
from dataclasses import fields
for f in fields(APIRequest):
    if f.metadata.get('unit'):
        # f.metadata is a read-only mappingproxy; convert to dict for a clean printout
        print(f"{f.name}: {dict(f.metadata)}")  # timeout: {'min': 1, 'max': 300, 'unit': 'seconds'}
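
The dataclasses module does not interpret metadata itself; acting on it is left to your code. As a minimal sketch, assuming the 'min' and 'max' keys used above are a convention of your own (RequestOptions is a hypothetical class for illustration), a __post_init__ can walk fields() and enforce those bounds:

```python
from dataclasses import dataclass, field, fields


@dataclass
class RequestOptions:  # hypothetical class for illustration
    timeout: int = field(default=30, metadata={'min': 1, 'max': 300})
    retries: int = field(default=3, metadata={'min': 0, 'max': 10})
    label: str = ''  # no metadata, so the bounds check skips it

    def __post_init__(self):
        # Enforce any 'min'/'max' bounds found in each field's metadata.
        # These keys are our own convention, not something dataclasses defines.
        for f in fields(self):
            value = getattr(self, f.name)
            lo = f.metadata.get('min')
            hi = f.metadata.get('max')
            if lo is not None and value < lo:
                raise ValueError(f"{f.name}={value} is below minimum {lo}")
            if hi is not None and value > hi:
                raise ValueError(f"{f.name}={value} is above maximum {hi}")


opts = RequestOptions(timeout=60)
print(opts.timeout)  # 60

try:
    RequestOptions(timeout=0)
except ValueError as exc:
    print(exc)  # timeout=0 is below minimum 1
```

Keeping the rules next to the field declaration means the check itself never needs to change when a new bounded field is added.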

Feature 2: __post_init__ — Validation and Computed Fields

from dataclasses import dataclass, field
from datetime import datetime, timezone
import re

@dataclass
class User:
    name: str
    email: str
    age: int
    # Computed fields: set in __post_init__, not passed to __init__
    username: str = field(init=False)
    created_at: datetime = field(init=False)

    def __post_init__(self):
        # Validation
        if self.age < 0 or self.age > 150:
            raise ValueError(f"Invalid age: {self.age}")
        # Deliberately loose check; the dot before the TLD must be escaped
        if not re.match(r'^[^@]+@[^@]+\.[^@]+$', self.email):
            raise ValueError(f"Invalid email: {self.email}")
        
        # Computed fields
        self.username = self.email.split('@')[0].lower()
        # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
        self.created_at = datetime.now(timezone.utc)
        
        # Transform inputs
        self.name = self.name.strip().title()
        self.email = self.email.lower()

u = User(name="  alice smith  ", email="Alice@Example.COM", age=30)
print(u.name)       # Alice Smith  (transformed)
print(u.email)      # alice@example.com (normalized)
print(u.username)   # alice (computed)
print(u.created_at) # 2026-04-05 ... (auto-set)

Feature 3: frozen=True — Hashable Immutable Objects

from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    x: float
    y: float

    # You CAN add methods to frozen dataclasses
    def distance_to(self, other: 'Point') -> float:
        return ((self.x - other.x)**2 + (self.y - other.y)**2) ** 0.5

p1 = Point(1.0, 2.0)
p2 = Point(3.0, 4.0)

# Immutable: assignment raises FrozenInstanceError (a subclass of AttributeError)
try:
    p1.x = 5.0
except AttributeError as exc:
    print(exc)  # cannot assign to field 'x'

# Hashable — can be used as dict keys or in sets!
cache = {p1: 'computed_value', p2: 'other_value'}
point_set = {p1, p2}  # Works because frozen=True (with the default eq=True) generates __hash__

# Equal objects have equal hashes
p3 = Point(1.0, 2.0)
print(p1 == p3)   # True
print(hash(p1) == hash(p3))  # True — safe for dict keys
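
Two questions follow naturally from the frozen example: how to set a computed field when assignment is blocked, and how to "change" a value. The sketch below is one common approach, not the only one. It uses object.__setattr__ inside __post_init__ and dataclasses.replace() to build a modified copy. The norm field is a hypothetical addition for illustration.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float
    # Computed in __post_init__, so it is excluded from __init__.
    norm: float = field(init=False)

    def __post_init__(self):
        # A plain `self.norm = ...` would raise FrozenInstanceError, so
        # bypass the frozen __setattr__ for this one-time initialisation.
        object.__setattr__(self, 'norm', (self.x ** 2 + self.y ** 2) ** 0.5)


p = Point(3.0, 4.0)
print(p.norm)  # 5.0

# replace() builds a new instance through __init__, so __post_init__
# runs again and norm is recomputed for the new coordinates.
q = replace(p, x=6.0, y=8.0)
print(q.norm)  # 10.0
print(p.x)     # 3.0 (the original instance is unchanged)
```

Note that replace() rejects changes to init=False fields, which is why norm is never passed to it directly.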

Feature 4: slots=True — 40-60% Memory Reduction

import sys
from dataclasses import dataclass

@dataclass
class PointNormal:
    x: float
    y: float
    z: float

@dataclass(slots=True)  # Python 3.10+
class PointSlotted:
    x: float
    y: float
    z: float

p1 = PointNormal(1.0, 2.0, 3.0)
p2 = PointSlotted(1.0, 2.0, 3.0)

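# Sizes below are illustrative; exact values depend on the Python version and build.
print(sys.getsizeof(p1))  # e.g. 48 bytes (does not include the instance __dict__)
print(sys.getsizeof(p1.__dict__))  # e.g. 232 bytes (per-instance dict overhead)
print(sys.getsizeof(p2))  # e.g. 64 bytes, with no __dict__ to add on top

# Extrapolating those sample figures to 1 million instances:
# Normal:  ~280MB (48 + 232 bytes each)
# Slotted: ~64MB, roughly 77% less
# Recent CPython versions store instance attributes more compactly, so measure on your own interpreter.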

# Combine with frozen for hashable, memory-efficient, immutable objects:
@dataclass(frozen=True, slots=True)
class Vector3D:
    x: float
    y: float
    z: float

Feature 5: ClassVar — Class-Level Data in Dataclasses

from dataclasses import dataclass
from typing import ClassVar

@dataclass
class DatabaseModel:
    # ClassVar fields are NOT included in __init__, __repr__, or __eq__
    # They're class-level, shared across all instances
    table_name: ClassVar[str] = 'base_table'
    connection_pool: ClassVar[object] = None
    instance_count: ClassVar[int] = 0

    # These ARE instance fields (included in __init__)
    id: int
    name: str

    def __post_init__(self):
        DatabaseModel.instance_count += 1

    @classmethod
    def set_pool(cls, pool):
        cls.connection_pool = pool

@dataclass
class User(DatabaseModel):
    table_name: ClassVar[str] = 'users'  # Override class variable
    email: str = ''

u1 = User(id=1, name='Alice', email='alice@example.com')
u2 = User(id=2, name='Bob', email='bob@example.com')
print(User.instance_count)  # 2 — shared across instances
print(User.table_name)      # 'users' — overridden in subclass
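
The example above increments DatabaseModel.instance_count through the class rather than through self, and that choice matters. A small sketch (Job is a hypothetical class for illustration) showing that ClassVar attributes are left out of fields() and why the counter is updated on the class:

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Job:  # hypothetical class for illustration
    created: ClassVar[int] = 0  # class-level counter, not a dataclass field
    name: str = ''

    def __post_init__(self):
        # Update the class attribute explicitly. `self.created += 1` would
        # instead create an instance attribute that shadows the counter.
        Job.created += 1


Job('a')
Job('b')
print(Job.created)                    # 2
print([f.name for f in fields(Job)])  # ['name']
print(Job('c'))                       # Job(name='c')
print(Job.created)                    # 3
```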

Feature 6: dataclass inheritance — with Gotchas

from dataclasses import dataclass

@dataclass
class Base:
    id: int
    name: str = 'default'  # field with default

# ❌ This fails at class definition: a field without a default follows one with a default
@dataclass
class Child(Base):
    email: str  # TypeError: non-default argument 'email' follows default argument

# ✅ Fix 1: Give email a default
@dataclass
class Child(Base):
    email: str = ''

# ✅ Fix 2: Use KW_ONLY (Python 3.10+) to make trailing fields keyword-only
from dataclasses import KW_ONLY
@dataclass
class Child(Base):
    _: KW_ONLY
    email: str  # Now keyword-only — no ordering constraint

c = Child(id=1, email='alice@example.com')  # name falls back to 'default'; email must be passed by keyword

Advanced Dataclass Cheat Sheet

  • field(default_factory=list) — always use for mutable defaults
  • field(repr=False) — hide sensitive fields from repr/logs
  • __post_init__ — validation, transformation, computed fields
  • frozen=True — immutable + hashable + dict key safe
  • slots=True (3.10+) — 40-60% memory reduction for many instances
  • ClassVar[T] — class-level data excluded from instance __init__
  • KW_ONLY (3.10+) — enforce keyword-only fields in inheritance
  • ❌ Never use mutable default values directly (field: list = []); dataclasses rejects list, dict, and set defaults with a ValueError

For maximum memory efficiency, combine slots=True with the techniques from the Python __slots__ deep dive; with slots=True, the decorator generates __slots__ from the field annotations for you. For concurrent use of dataclasses with shared state, see the Python GIL guide: frozen instances cannot have their fields reassigned, which removes one source of races, though mutable objects stored in those fields can still be modified. Official reference: Python dataclasses documentation.
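
One caveat when relying on frozen dataclasses for shared state: frozen=True only blocks attribute reassignment, so immutability is shallow. A brief sketch (Config and SafeConfig are hypothetical classes for illustration):

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Config:  # hypothetical class for illustration
    name: str
    tags: list = field(default_factory=list)


cfg = Config('prod')
cfg.tags.append('blue')  # allowed: the list object itself is still mutable
print(cfg.tags)          # ['blue']

try:
    hash(cfg)            # the generated __hash__ hashes every field
except TypeError as exc:
    print(exc)           # unhashable type: 'list'


@dataclass(frozen=True)
class SafeConfig:  # hypothetical: immutable all the way down
    name: str
    tags: tuple = ()


safe = SafeConfig('prod', ('blue',))
print(hash(safe) == hash(SafeConfig('prod', ('blue',))))  # True
```

Using tuples, frozensets, or other frozen dataclasses for field values keeps the whole object immutable and hashable.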


