Docker Best Practices: Build Smaller, Faster, More Secure Images in Production

Many production Docker images are 5–10x larger than they need to be, build about 3x slower than they could, and run as root. These are not minor issues: they mean slower deployments, higher storage costs, and a larger attack surface. The practices below apply to every production container and can typically cut an image from around 1.5GB to under 200MB.

TL;DR: Use multi-stage builds to separate build from runtime. Use Alpine or distroless base images. Put rarely-changing layers first (dependency installs before code copy). Never run as root. Never embed secrets in images. Scan with Trivy or Docker Scout. Use .dockerignore aggressively.

Multi-stage builds — the biggest size win

# BEFORE: single stage — includes build tools in production image
FROM node:20
WORKDIR /app
COPY package*.json ./
RUN npm install           # Includes devDependencies
COPY . .
RUN npm run build
CMD ["node", "dist/index.js"]
# Size: 1.2GB (includes gcc, python, build tools from node base)

# AFTER: multi-stage build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci                 # Reproducible install
COPY . .
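RUN npm run build
# Drop devDependencies so only production packages reach the runtime stage
RUN npm prune --omit=dev

FROM node:20-alpine AS runtime
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
# Only copy what runtime needs
# Run as the unprivileged "node" user that the official image provides
# (Dockerfile comments must sit on their own line, not after USER)
USER node
CMD ["node", "dist/index.js"]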
# Size: 180MB — 6.7x smaller
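The size comparison above can be reproduced for your own images. The following is a minimal sketch, assuming both variants have been built and tagged locally; the helper name `size_ratio` and the tags in the usage comment are placeholders, not names from the original build.

```shell
# size_ratio BEFORE AFTER
# Print how many times smaller AFTER is than BEFORE (one decimal place).
# Both arguments must use the same unit (bytes, MB, ...).
size_ratio() {
  awk -v before="$1" -v after="$2" 'BEGIN {
    if (after <= 0) exit 1          # avoid dividing by zero
    printf "%.1f\n", before / after
  }'
}

# Hypothetical usage with two locally built tags (placeholder names):
#   before=$(docker image inspect --format '{{.Size}}' my-app:single-stage)
#   after=$(docker image inspect --format '{{.Size}}' my-app:multi-stage)
#   echo "$(size_ratio "$before" "$after")x smaller"
```

With the figures quoted above, `size_ratio 1200 180` prints 6.7.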

Layer caching — order layers by change frequency

# BAD: copy everything first — deps reinstalled on every code change
COPY . .
RUN npm install  # Cache miss on every code change!
RUN npm run build

# GOOD: dependencies change rarely — cache them separately
COPY package.json package-lock.json ./
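RUN npm ci           # Cached until package.json or package-lock.json changes
# Everything from this COPY down is rebuilt when source files change
# (COPY does not allow a trailing comment, so the note sits on its own line)
COPY . .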
RUN npm run build

# Layer caching strategy:
# Layer 1: OS packages (changes monthly)
# Layer 2: Runtime deps (changes weekly)
# Layer 3: Application deps (changes daily)
# Layer 4: Application code (changes per commit)
# Each layer only invalidates layers BELOW it when it changes
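The ordering rule can also be checked mechanically. The sketch below is a rough heuristic, not a full Dockerfile parser: the function name `check_layer_order` is hypothetical, it only recognizes `COPY . .` and `npm ci` / `npm install`, and it resets its state at each `FROM` so that stages are checked independently.

```shell
# check_layer_order DOCKERFILE
# Fail (exit status 1) if, within a build stage, "COPY . ." appears before
# an "npm ci" or "npm install" step. Succeed otherwise.
check_layer_order() {
  awk '
    toupper($1) == "FROM" { copy = 0 }    # new stage: forget earlier COPY
    toupper($1) == "COPY" && NF >= 3 && $(NF - 1) == "." && $NF == "." {
      copy = NR                           # remember where the full copy happened
    }
    toupper($1) == "RUN" && /npm (ci|install)/ {
      if (copy) {
        printf "line %d: dependency install runs after COPY . . on line %d\n", NR, copy
        found = 1
      }
    }
    END { exit (found ? 1 : 0) }
  ' "$1"
}

# Hypothetical usage in CI:
#   check_layer_order Dockerfile || exit 1
```

A check like this catches only the most common mistake; it does not know about other package managers or about `COPY` instructions that copy a subdirectory containing the lockfile.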

Minimal base images

# Base image size comparison:
# node:20                  ~1.1GB  (Debian, full toolchain)
# node:20-slim             ~230MB  (Debian minimal)
# node:20-alpine           ~170MB  (Alpine Linux)
# gcr.io/distroless/nodejs ~50MB   (no shell, no package manager)

# Python:
# python:3.12              ~1.0GB
# python:3.12-slim         ~150MB
# python:3.12-alpine       ~55MB

# For production APIs: use -alpine or distroless
# For ML workloads with native extensions: use -slim (Alpine has musl, not glibc)

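# Distroless: smallest attack surface (no shell or package manager for an attacker to use):
FROM python:3.12-slim AS builder
WORKDIR /app
# requirements.txt must be in the image before pip can read it
COPY requirements.txt .
RUN pip install --target=/app/deps -r requirements.txt
COPY src/ /app/

# The builder's Python version should match the interpreter in the distroless image,
# especially for packages with native extensions
FROM gcr.io/distroless/python3
COPY --from=builder /app /app
# Make the packages installed with --target importable
ENV PYTHONPATH=/app/deps
CMD ["/app/main.py"]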

Security — never run as root

# Creating a non-root user:
FROM node:20-alpine
WORKDIR /app

# Create a dedicated unprivileged system user and group
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
COPY --chown=appuser:appgroup . .
# Switch to the non-root user (USER does not allow a trailing comment)
USER appuser
CMD ["node", "server.js"]

# Why non-root matters:
# If an attacker achieves RCE in the container:
# - Root: can read and modify any file in the container, install tools, and attempt to break out via kernel exploits
# - Non-root: limited privileges, cannot write to most paths

# Verify in CI:
# docker inspect my-image --format='{{.Config.User}}'
# Should not be empty or "root"
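The CI check above can be wrapped in a small helper. This is a minimal sketch: the function name `is_non_root` is hypothetical, and `my-image` in the usage comment is the same placeholder tag used above. It treats an empty user, `root`, and UID `0` as failures, and ignores an optional `:group` suffix.

```shell
# is_non_root USER_FIELD
# Succeed only if the image's configured user is set and is not root.
# USER_FIELD is the value of .Config.User, which may look like
# "name", "uid", "name:group" or "uid:gid".
is_non_root() {
  user=${1%%:*}              # drop an optional ":group" suffix
  case "$user" in
    ""|root|0) return 1 ;;   # unset means root by default
    *)         return 0 ;;
  esac
}

# Hypothetical CI step:
#   user=$(docker inspect my-image --format='{{.Config.User}}')
#   is_non_root "$user" || { echo "image runs as root" >&2; exit 1; }
```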

Secrets — never embed in image layers

# WRONG: secret embedded in image layer (visible in docker history)
RUN API_KEY=secret123 npm run build
# Even if you delete it in the next layer, it remains in the image's layer history

# WRONG: COPY secret file then delete (still in intermediate layer)
COPY .env .
RUN npm run build
RUN rm .env  # Too late: .env is still stored in the earlier COPY layer

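# RIGHT: BuildKit secret mount (available during this RUN only, never stored in the image)
# The syntax directive below must be the very first line of the Dockerfile
# syntax=docker/dockerfile:1
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm install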

# Build with:
docker buildx build --secret id=npmrc,src=.npmrc .

# RIGHT: env var at runtime, not build time
# Pass secrets at container start:
docker run -e DATABASE_URL="$DATABASE_URL" my-app
# Or via: AWS Secrets Manager, Vault, Kubernetes Secrets
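Because build commands stay visible in the image history, a quick scan of that history can catch the first mistake above before an image is pushed. The sketch below is a heuristic only: the function name `find_secret_hints` and the keyword list are assumptions chosen for illustration, and a clean result does not prove an image is free of secrets.

```shell
# find_secret_hints
# Read build commands on stdin and print (with line numbers) any line that
# looks like it assigns a credential, e.g. API_KEY=..., DB_PASSWORD=...
# Exit status 0 means at least one suspicious line was found.
find_secret_hints() {
  grep -n -i -E '(api_?key|secret|password|passwd|token)[A-Za-z0-9_]*='
}

# Hypothetical usage (my-app is a placeholder tag):
#   docker history --no-trunc --format '{{.CreatedBy}}' my-app | find_secret_hints
```

The pattern requires an `=` directly after the keyword, so a BuildKit `--mount=type=secret,id=...` line is not flagged.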

Health checks and .dockerignore

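# Health check: Docker and Docker Swarm use this to mark the container healthy or unhealthy.
# Kubernetes ignores HEALTHCHECK (define liveness/readiness probes instead); ECS uses the health check in the task definition.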
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1

# .dockerignore: critical for build speed and security
# (comments must be on their own line; a trailing "# ..." becomes part of the pattern)
# Never copy; reinstall in container
node_modules/
# Repository history is huge and can leak information
.git/
*.log
# Environment files contain credentials
.env*
# Build output (will rebuild in container)
dist/
# Tests not needed in production image
test/
coverage/
README.md

# Without .dockerignore: the entire directory is sent to the Docker daemon as build context
# With .dockerignore: only sends what the Dockerfile needs
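A repository can guard against an incomplete .dockerignore with a small check. This is a sketch under simple assumptions: the function name `missing_ignore_entries` is hypothetical, and it only looks for exact line matches, so `node_modules` and `node_modules/` count as different entries.

```shell
# missing_ignore_entries FILE PATTERN...
# Print each PATTERN that does not appear as an exact line in FILE.
# Exit status 0 means every pattern is present.
missing_ignore_entries() {
  file=$1
  shift
  missing=0
  for pattern in "$@"; do
    # -x: whole-line match, -F: treat the pattern as a literal string
    if ! grep -q -x -F -e "$pattern" "$file"; then
      echo "$pattern"
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical usage (quote patterns so the shell does not expand them):
#   missing_ignore_entries .dockerignore 'node_modules/' '.git/' '.env*'
```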

Docker production checklist

  • ✅ Multi-stage builds — build artifacts only, not build tools in final image
  • ✅ Layer order: deps first, code second — maximize cache reuse
  • ✅ Alpine or distroless base — 5–10x smaller than the default Debian-based images
  • ✅ Non-root USER — essential for production security
  • ✅ HEALTHCHECK in every production image
  • ✅ .dockerignore — exclude node_modules, .git, .env files
  • ✅ Scan with docker scout cves my-image or Trivy before every push
  • ❌ Never COPY . . before installing dependencies
  • ❌ Never embed secrets, API keys, or passwords in image layers
  • ❌ Never use latest tag in production — pin exact versions
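The last item on the checklist is easy to enforce in CI. The following sketch assumes a conventional Dockerfile layout; the function name `find_unpinned_from` is hypothetical. It flags `FROM` lines whose image has no tag or digest, or uses `:latest`, and skips `scratch` and references to earlier build stages. It does not judge whether a tag such as `20-alpine` is specific enough for your needs.

```shell
# find_unpinned_from DOCKERFILE
# Print FROM lines that use an untagged image or the :latest tag.
# Exit status 1 means at least one such line was found.
find_unpinned_from() {
  awk '
    toupper($1) == "FROM" {
      i = 2
      while (i <= NF && $i ~ /^--/) i++         # skip flags such as --platform=...
      image = $i
      is_stage = (image in stages)              # e.g. "FROM builder"
      if (toupper($(i + 1)) == "AS") stages[$(i + 2)] = 1
      if (image == "scratch" || is_stage) next
      if (image ~ /@sha256:/) next              # pinned by digest
      n = split(image, parts, "/")              # ignore a registry host:port prefix
      if (parts[n] !~ /:/ || parts[n] ~ /:latest$/) {
        printf "line %d: %s\n", NR, image
        found = 1
      }
    }
    END { exit (found ? 1 : 0) }
  ' "$1"
}

# Hypothetical usage in CI:
#   find_unpinned_from Dockerfile || exit 1
```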

Optimized Docker images are especially valuable for Lambda container deployments; the Lambda container images guide shows how ECR layer caching interacts with the multi-stage build strategy. The same practices apply to Kubernetes deployments and to any other container runtime. External reference: Docker official build best practices.
