Node.js Cluster Module: Scale Your API to All CPU Cores Without Kubernetes

Your Node.js server runs on a single CPU core by default, leaving the rest of a modern multi-core machine idle. The cluster module fixes this in under 50 lines of code. No Docker, no Kubernetes, no load balancer config — just Node.js using all the hardware you’re already paying for.

TL;DR: Node.js cluster forks your process once per CPU core. The primary process accepts all incoming connections and distributes them across workers (round-robin by default on every platform except Windows), so workers can share the same port. If a worker crashes, the primary can spawn a replacement. This is multi-core scaling with zero infrastructure changes.

Basic Cluster Setup

const cluster = require('cluster');
const http = require('http');
const os = require('os');

const NUM_WORKERS = os.cpus().length; // One per CPU core

if (cluster.isPrimary) { // Node 16+; older versions use cluster.isMaster
  console.log(`Master ${process.pid} starting ${NUM_WORKERS} workers`);

  // Fork one worker per CPU
  for (let i = 0; i < NUM_WORKERS; i++) {
    cluster.fork();
  }

  // Restart workers that crash
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died (${signal || code}). Restarting...`);
    cluster.fork();
  });

} else {
  // Workers share the TCP port
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end(`Response from worker ${process.pid}`);
  }).listen(3000);

  console.log(`Worker ${process.pid} started`);
}

// Before cluster: 1 process on 1 core
// After cluster: 8 worker processes across 8 cores
// Throughput: up to ~8x aggregate improvement; real gains depend on the workload

Production-Grade Cluster with Zero-Downtime Restarts

const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) {
  const workers = new Map();

  function spawnWorker() {
    const worker = cluster.fork();
    workers.set(worker.id, worker);

    worker.on('message', msg => {
      if (msg.type === 'ready') {
        console.log(`Worker ${worker.process.pid} ready`);
      }
    });
    return worker;
  }

  // Spawn initial workers
  for (let i = 0; i < os.cpus().length; i++) spawnWorker();

  // Zero-downtime restart: SIGUSR2 triggers rolling restart
  process.on('SIGUSR2', () => {
    console.log('Rolling restart initiated...');
    const workerIds = [...workers.keys()];

    function restartNext(i) {
      if (i >= workerIds.length) return;
      const worker = workers.get(workerIds[i]);
      // Attach the exit listener BEFORE disconnecting so the event
      // can't fire while no one is listening; use once() so it never
      // runs twice for the same worker
      worker.once('exit', () => {
        const newWorker = spawnWorker();
        newWorker.on('message', msg => {
          if (msg.type === 'ready') restartNext(i + 1);
        });
      });
      worker.disconnect();
    }
    restartNext(0);
  });

  cluster.on('exit', (worker) => {
    workers.delete(worker.id);
    // Only respawn on crashes — workers that exited after a deliberate
    // disconnect() (the rolling restart above) already spawn their own
    // replacement, and respawning here too would double the worker count
    if (!worker.exitedAfterDisconnect) spawnWorker();
  });

} else {
  require('./app'); // Your Express/Fastify app
  // Ideally send this from your server's 'listening' callback instead,
  // so "ready" means "actually accepting connections"
  process.send({ type: 'ready' });
}

Worker Communication: Shared State Without Shared Memory

// Workers have separate memory — can't share variables
// Use message passing for coordination

// In worker:
process.send({ type: 'cache_invalidate', key: 'user:123' });

// In master — broadcast to all workers:
cluster.on('message', (sender, msg) => {
  if (msg.type === 'cache_invalidate') {
    Object.values(cluster.workers).forEach(worker => {
      if (worker.id !== sender.id) { // Don't echo back to sender
        worker.send(msg);
      }
    });
  }
});

// Worker receives broadcast:
process.on('message', msg => {
  if (msg.type === 'cache_invalidate') {
    localCache.delete(msg.key);
  }
});

// For true shared state: use Redis — not cluster messages
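The broadcast loop above can be factored into a small helper; a sketch where `workers` is assumed to be any iterable of objects with an `id` and a `send()` method (like the values of `cluster.workers`):

```javascript
// Send a message to every worker except the one that originated it.
// Returns how many workers it was delivered to.
function broadcast(workers, msg, senderId) {
  let delivered = 0;
  for (const worker of workers) {
    if (worker.id !== senderId) { // don't echo back to the sender
      worker.send(msg);
      delivered += 1;
    }
  }
  return delivered;
}

// Usage in the master's message handler:
// cluster.on('message', (sender, msg) => {
//   if (msg.type === 'cache_invalidate') {
//     broadcast(Object.values(cluster.workers), msg, sender.id);
//   }
// });
```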

Monitoring Worker Health

// Health-check each worker periodically
if (cluster.isPrimary) {
  setInterval(() => {
    Object.values(cluster.workers).forEach(worker => {
      worker.send({ type: 'health_check' });

      // Match only the health reply. A bare worker.once('message') would be
      // consumed by the next message of ANY type, so an unrelated message
      // could leave the timeout armed and kill a healthy worker.
      const onHealthOk = msg => {
        if (msg.type === 'health_ok') {
          clearTimeout(timeout);
          worker.removeListener('message', onHealthOk);
        }
      };
      worker.on('message', onHealthOk);

      // If no response within 5s, kill and restart
      const timeout = setTimeout(() => {
        console.warn(`Worker ${worker.process.pid} unresponsive — killing`);
        worker.removeListener('message', onHealthOk);
        worker.kill('SIGKILL');
      }, 5000);
    });
  }, 30000); // Check every 30s
}

// In worker:
process.on('message', msg => {
  if (msg.type === 'health_check') {
    process.send({ type: 'health_ok', pid: process.pid, uptime: process.uptime() });
  }
});

Cluster vs PM2 vs Worker Threads

  • ✅ cluster: built-in, full control, good for I/O-bound Express/Fastify servers
  • ✅ PM2: manages cluster mode automatically, built-in health monitoring, recommended for production
  • ✅ Worker threads: CPU-bound tasks inside a single process (image processing, crypto)
  • ❌ cluster does NOT help for CPU-bound work in individual requests — use worker threads for that
  • ❌ Shared memory state across workers requires Redis or external store

Cluster works best when your workers are I/O-bound — see the event loop guide to eliminate blocking first. For worker thread patterns within each cluster worker, see the streams backpressure guide for high-throughput data processing. External reference: Node.js cluster module documentation.
