Node.js Cluster Module: Scale Your API to All CPU Cores Without Kubernetes

By default, a Node.js server runs your JavaScript on a single thread, so on an 8-core machine seven cores sit mostly idle. The built-in cluster module fixes this in under 50 lines of code. No Docker, no Kubernetes, no load balancer config: just Node.js using all the hardware you’re already paying for.

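TL;DR: The cluster module lets you fork your process once per CPU core. The primary process (called "master" before Node.js 16) accepts incoming connections and distributes them across workers, round-robin by default on every platform except Windows. Workers share the same port. If a worker crashes, the primary forks a replacement. You get multi-core scaling on a single machine with zero infrastructure changes.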

Basic Cluster Setup

const cluster = require('cluster');
const http = require('http');
const os = require('os');

const NUM_WORKERS = os.cpus().length; // One per CPU core

if (cluster.isPrimary) {
  console.log(`Master ${process.pid} starting ${NUM_WORKERS} workers`);

  // Fork one worker per CPU
  for (let i = 0; i < NUM_WORKERS; i++) {
    cluster.fork();
  }

  // Restart workers that crash
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died (${signal || code}). Restarting...`);
    cluster.fork();
  });

} else {
  // Workers share the TCP port
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Response from worker ${process.pid}`);
  }).listen(3000);

  console.log(`Worker ${process.pid} started`);
}

// Before cluster: 1 process, so 1 core runs your JavaScript
// After cluster (on an 8-core machine): 8 worker processes
// Throughput: up to ~8x when the CPU is the bottleneck; individual requests do not get faster
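
One worker per core is a sensible default, but os.cpus().length reports the host's cores and can overshoot what a container is actually allowed to use. The sketch below shows one way to make the count overridable. The function name resolveWorkerCount and the WEB_CONCURRENCY environment variable are assumptions for illustration; the cluster module does not read either of them.

```javascript
const os = require('os');

// Hypothetical helper: decide how many workers to fork.
// WEB_CONCURRENCY is an assumed variable name, not something cluster reads itself.
function resolveWorkerCount(env = process.env) {
  // os.availableParallelism() only exists in newer Node.js releases,
  // so fall back to os.cpus().length when it is missing.
  const detected = typeof os.availableParallelism === 'function'
    ? os.availableParallelism()
    : os.cpus().length;

  // An explicit, positive integer override wins over detection.
  const requested = Number.parseInt(env.WEB_CONCURRENCY, 10);
  if (Number.isInteger(requested) && requested > 0) {
    return requested;
  }
  return Math.max(1, detected);
}
```

With this in place, NUM_WORKERS could be set from resolveWorkerCount() instead of os.cpus().length, and an operator could cap it per deployment without touching the code.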

Production-Grade Cluster with Zero-Downtime Restarts

const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) {
  const workers = new Map();

  function spawnWorker() {
    const worker = cluster.fork();
    workers.set(worker.id, worker);

    worker.on('message', msg => {
      if (msg.type === 'ready') {
        console.log(`Worker ${worker.process.pid} ready`);
      }
    });
    return worker;
  }

  // Spawn initial workers
  for (let i = 0; i < os.cpus().length; i++) spawnWorker();

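  // Zero-downtime restart: SIGUSR2 triggers rolling restart
  process.on('SIGUSR2', () => {
    console.log('Rolling restart initiated...');
    const workerIds = [...workers.keys()];

    function restartNext(i) {
      if (i >= workerIds.length) return;
      const worker = workers.get(workerIds[i]);
      if (!worker) return restartNext(i + 1); // Worker already exited
      // Start the replacement first so capacity never drops
      const newWorker = spawnWorker();
      newWorker.on('message', msg => {
        if (msg.type !== 'ready') return;
        worker.disconnect(); // Stop accepting connections, let in-flight requests finish
        restartNext(i + 1);
      });
    }
    restartNext(0);
  });

  cluster.on('exit', (worker) => {
    workers.delete(worker.id);
    // disconnect() sets exitedAfterDisconnect, so only crashed workers are
    // replaced here; otherwise a rolling restart would spawn every worker twice
    if (!worker.exitedAfterDisconnect) spawnWorker();
  });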

} else {
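  require('./app'); // Your Express/Fastify app
  // Signal master we're up. This fires once the module has loaded; for a stricter
  // guarantee, send it from the server's 'listening' callback inside the app instead.
  process.send({ type: 'ready' });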
}
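
Restarting crashed workers unconditionally has one failure mode worth guarding against: a worker that crashes during startup gets respawned in a tight loop. A minimal sketch of a sliding-window guard follows. The name createRestartLimiter and its default limits (5 restarts per 60 seconds) are assumptions for illustration, not part of the cluster API.

```javascript
// Hypothetical helper: allow at most `maxRestarts` respawns per `windowMs`.
function createRestartLimiter(maxRestarts = 5, windowMs = 60000) {
  let timestamps = [];

  // `now` is injectable so the policy can be exercised without waiting.
  return function canRestart(now = Date.now()) {
    // Keep only the restarts that happened inside the sliding window
    timestamps = timestamps.filter(t => now - t < windowMs);
    if (timestamps.length >= maxRestarts) return false;
    timestamps.push(now);
    return true;
  };
}
```

In the primary's 'exit' handler, the respawn would then be wrapped in a check such as `if (canRestart()) spawnWorker();`, with the else branch logging the problem and, for example, exiting so a supervisor can alert on it.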

Worker Communication: Shared State Without Shared Memory

// Workers have separate memory — can't share variables
// Use message passing for coordination

// In worker:
process.send({ type: 'cache_invalidate', key: 'user:123' });

// In master — broadcast to all workers:
cluster.on('message', (sender, msg) => {
  if (msg.type === 'cache_invalidate') {
    Object.values(cluster.workers).forEach(worker => {
      if (worker.id !== sender.id) { // Don't echo back to sender
        worker.send(msg);
      }
    });
  }
});

// Worker receives broadcast:
process.on('message', msg => {
  if (msg.type === 'cache_invalidate') {
    localCache.delete(msg.key);
  }
});

// For true shared state: use Redis — not cluster messages
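
Once a worker handles more than one message type (cache invalidation, health checks, and so on), separate process.on('message') handlers get hard to follow. The sketch below routes messages by their type field in one place. createMessageRouter is a hypothetical helper, and localCache here is a stand-in Map for whatever in-memory cache the worker keeps.

```javascript
// Hypothetical helper: dispatch IPC messages by their `type` field.
// Returns true when a handler ran, false when the message was ignored.
function createMessageRouter(handlers) {
  return function route(msg) {
    if (!msg || typeof msg.type !== 'string') return false;
    // Own-property check so types like 'constructor' never hit inherited members
    if (!Object.prototype.hasOwnProperty.call(handlers, msg.type)) return false;
    handlers[msg.type](msg);
    return true;
  };
}

const localCache = new Map(); // Stand-in for the worker's in-memory cache

const route = createMessageRouter({
  cache_invalidate: msg => localCache.delete(msg.key),
  health_check: () => {
    // process.send only exists when the process was forked with an IPC channel
    if (process.send) {
      process.send({ type: 'health_ok', pid: process.pid, uptime: process.uptime() });
    }
  },
});

// In a cluster worker this receives everything the primary sends
process.on('message', route);
```

Unknown message types are dropped silently in this sketch; logging them instead may be preferable while developing.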

Monitoring Worker Health

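// Health check each worker periodically
if (cluster.isPrimary) {
  setInterval(() => {
    Object.values(cluster.workers).forEach(worker => {
      if (!worker.isConnected()) return; // Skip workers that are shutting down

      // If no response within 5s, kill; the cluster 'exit' handler restarts it
      const timeout = setTimeout(() => {
        worker.off('message', onMessage);
        console.warn(`Worker ${worker.process.pid} unresponsive, killing`);
        // Signal the process directly: a hung worker cannot disconnect gracefully
        worker.process.kill('SIGKILL');
      }, 5000);

      // Use on() rather than once(): an unrelated message (e.g. cache_invalidate)
      // must not consume the listener and leave the timeout armed
      function onMessage(msg) {
        if (msg.type !== 'health_ok') return;
        clearTimeout(timeout);
        worker.off('message', onMessage);
      }
      worker.on('message', onMessage);
      worker.send({ type: 'health_check' });
    });
  }, 30000); // Check every 30s
}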

// In worker:
process.on('message', msg => {
  if (msg.type === 'health_check') {
    process.send({ type: 'health_ok', pid: process.pid, uptime: process.uptime() });
  }
});
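
A health reply can carry more than "I am alive". As a sketch, the worker could also report its memory footprint so the primary can recycle a worker that is leaking. Both function names and the 512 MB default threshold are assumptions for illustration; tune the limit to your workload.

```javascript
// Worker side: hypothetical richer reply to a 'health_check' message.
function buildHealthReport() {
  return {
    type: 'health_ok',
    pid: process.pid,
    uptime: process.uptime(),
    rss: process.memoryUsage().rss, // Resident set size in bytes
  };
}

// Primary side: hypothetical policy deciding whether a worker should be recycled.
// maxRssBytes is an assumed threshold, not a recommended value.
function shouldRecycle(report, maxRssBytes = 512 * 1024 * 1024) {
  return report.type === 'health_ok' && report.rss > maxRssBytes;
}
```

The worker would send buildHealthReport() in place of the inline object, and the primary could call worker.disconnect() when shouldRecycle returns true, letting the worker drain its connections before a replacement takes over.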

Cluster vs PM2 vs Worker Threads

  • ✅ cluster: built-in, full control, good for I/O-bound Express/Fastify servers
  • ✅ PM2: manages cluster mode automatically, built-in health monitoring, recommended for production
  • ✅ Worker threads: CPU-bound tasks inside a single process (image processing, crypto)
  • ❌ cluster does NOT help for CPU-bound work in individual requests — use worker threads for that
  • ❌ Shared memory state across workers requires Redis or external store
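
To make the worker-threads point in the list above concrete, here is a minimal sketch of moving a CPU-heavy step off the event loop inside a single process. The helper name deriveKeyInThread is hypothetical, the iteration count is arbitrary, and key derivation is only a stand-in for any expensive synchronous function. The worker source is passed as a string to keep the example in one file; in a real project it would more likely live in its own module.

```javascript
const { Worker } = require('worker_threads');

// Code that runs inside the worker thread (evaluated as CommonJS).
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const { pbkdf2Sync } = require('crypto');
  // CPU-heavy work happens here, off the main event loop
  const key = pbkdf2Sync(workerData.password, workerData.salt, 100000, 32, 'sha256');
  parentPort.postMessage(key.toString('hex'));
`;

// Hypothetical helper: derive a key in a worker thread, resolve with a hex string.
function deriveKeyInThread(password, salt) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true, // Treat the first argument as source code, not a file path
      workerData: { password, salt },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', code => {
      if (code !== 0) reject(new Error(`Worker thread exited with code ${code}`));
    });
  });
}
```

A request handler would await deriveKeyInThread(...) while the event loop keeps serving other requests. Starting a thread per call has a cost of its own, so a reusable pool of threads is the usual next step for hot paths.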

Cluster works best when your workers are I/O-bound — see the event loop guide to eliminate blocking first. For worker thread patterns within each cluster worker, see the streams backpressure guide for high-throughput data processing. External reference: Node.js cluster module documentation.
