PM2 Cluster Mode in Production – Step‑by‑Step Tutorial
Sat Feb 28 2026 · 8 min · Intermediate


A comprehensive guide to deploying Node.js apps with PM2 cluster mode in production, covering installation, configuration, scaling, monitoring, and architectural considerations.

Tags: pm2, node.js, cluster mode, production deployment, devops, scaling, process management

Introduction

What is PM2 Cluster Mode?

PM2 is a widely‑used process manager for Node.js applications. Its cluster mode leverages the Node.js cluster module to spawn multiple worker processes that share the same server port. This approach maximizes CPU utilization, improves request throughput, and provides fault tolerance by automatically restarting failed workers.

Why Use Cluster Mode in Production?

  • Full CPU Utilization - A single Node.js process runs on one core; cluster mode can spread the load across all available cores.
  • Zero‑Downtime Restarts - PM2 performs graceful reloads, keeping the service alive while new code is deployed.
  • Self‑Healing - If a worker crashes, PM2 instantly respawns it, preserving overall service stability.
  • Built‑in Monitoring - PM2 offers real‑time metrics, log aggregation, and a web dashboard (PM2 Plus).

In this tutorial we will walk through a production‑ready setup, from installation to scaling strategies, with concrete code examples and an architectural overview.

Installing and Configuring PM2

Global Installation

First, install PM2 globally on your deployment host. The global install makes the pm2 command available system‑wide:

npm install -g pm2

Verifying the Installation

Run the following command to confirm the version:

pm2 -v

You should see something like 5.3.0 (or newer).

Initializing a Node.js Project

Create a simple Express application that we will later run in cluster mode:

// app.js
const express = require('express');
const app = express();
const port = process.env.PORT || 3000;

app.get('/', (req, res) => { res.send('Hello from worker ' + process.pid); });

app.listen(port, () => { console.log(`Server listening on port ${port}`); });

Install the dependency:

npm init -y
npm install express

Defining a PM2 Ecosystem File

PM2 can read a JSON or JavaScript ecosystem file that describes how your app should be launched. Create ecosystem.config.js with the following content:

module.exports = {
  apps: [
    {
      name: 'my-api',
      script: './app.js',
      instances: 'max', // will be replaced by a numeric value in the next step
      exec_mode: 'cluster',
      env: {
        NODE_ENV: 'development'
      },
      env_production: {
        NODE_ENV: 'production'
      },
      watch: false,
      max_memory_restart: '300M'
    }
  ]
};
  • instances: 'max' tells PM2 to spawn as many workers as there are CPU cores.
  • exec_mode: 'cluster' enables cluster mode.
  • max_memory_restart ensures a worker that exceeds the memory threshold is automatically restarted.

Starting the Application in Development Mode

pm2 start ecosystem.config.js

PM2 reads the `env` block by default. To use the production environment variables, add the `--env production` flag:

pm2 start ecosystem.config.js --env production

PM2 will now display a table with each worker’s PID, status, and memory usage.

Enabling Cluster Mode in Production

Determining the Desired Number of Instances

While instances: 'max' works for most workloads, some production environments require a fixed number of workers to avoid over‑committing resources. To set a specific count, edit the ecosystem file:

instances: 4, // Example: run four workers on a 4‑core host

Alternatively, you can use an environment variable so the same file works across environments:

instances: Number(process.env.WORKER_COUNT) || 'max', // env vars are strings; coerce to a number

Deploying with a Zero‑Downtime Reload

PM2 supports graceful reloads (pm2 reload) that restart workers one‑by‑one while the others continue serving traffic. This is essential for rolling out new code without service interruption.

# Deploy new version (assumes code is already updated on the server)
npm run build                        # if you have a build step
pm2 reload my-api --env production

During the reload, PM2 first spawns a fresh worker; once it is online, PM2 sends a SIGINT to the old worker, which waits for existing connections to close before exiting.

Managing Logs Efficiently

Production logs should be rotated to prevent disk exhaustion. PM2 integrates with logrotate and also offers built‑in log management:

pm2 logs my-api     # streams live logs
pm2 flush           # clears all log files
pm2 reloadLogs      # forces log files to be reopened (rotation)

For deeper log handling, configure the ecosystem file:

error_file: './logs/my-api-error.log',
out_file:   './logs/my-api-out.log',
log_date_format: 'YYYY-MM-DD HH:mm:ss Z',

Securing the Process with a Startup Script

To guarantee PM2 restarts after a server reboot, generate a startup script for your init system (systemd, upstart, etc.). PM2 will auto‑save the current process list.

pm2 startup systemd   # prints a command; copy-paste it as root
pm2 save              # saves the current process list

Now the systemd unit (e.g., pm2-root.service when PM2 runs as root) will resurrect your cluster automatically on boot.

Example: Full Production Startup Command

pm2 start ecosystem.config.js \
  --env production \
  --max-restarts 10

This command enforces the production environment and limits restart attempts to protect against endless crash loops. File watching, which is costly in production, stays disabled via the watch: false setting in the ecosystem file.

Architecture and Scaling Considerations

High‑Level Architecture Diagram (Textual)

+-------------------+        +----------------------------+
|   Load Balancer   | -----> |    PM2 Cluster (Node.js)   |
|   (e.g., Nginx)   |        |  +----------+-----------+  |
|    round-robin    |        |  | Worker 1 | Worker 2 |...|
+-------------------+        |  +----------+-----------+  |
                             +----------------------------+

  • Load Balancer - A reverse proxy such as Nginx or HAProxy distributes HTTP requests across the front‑facing port (e.g., 80/443). It performs health checks and can terminate TLS.
  • PM2 Cluster - PM2 spawns N worker processes that all serve the same TCP port (e.g., 3000). Under Node's cluster module, the primary process accepts inbound connections and distributes them to the workers round-robin (the default scheduling policy on Linux).
  • Worker Process - Each worker runs the same Node.js event loop, handling requests independently. If a worker crashes, PM2 restarts it without affecting peers.

Benefits of This Architecture

  1. Horizontal Scaling on a Single Host - By matching the number of workers to CPU cores, you achieve near‑linear scaling without additional servers.
  2. Zero‑Downtime Deployments - pm2 reload ensures that at any point, at least N‑1 workers are alive, guaranteeing uninterrupted service.
  3. Resilience - A single worker failure does not bring down the entire application; PM2’s watchdog immediately revives the process.
  4. Observability - PM2 supplies metrics (CPU, memory, event loop latency) that can be exported to monitoring tools like Prometheus or Grafana.

Scaling Beyond a Single Host

When traffic exceeds the capacity of a single machine, combine PM2 clusters with a container orchestration platform (Kubernetes, Docker Swarm) or a cloud‑native auto‑scaler. Each host runs its own PM2 cluster, and an external load balancer distributes traffic across the fleet.

Example: Kubernetes Deployment YAML (Excerpt)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: node
          image: my-registry/my-api:latest
          command: ["pm2-runtime", "ecosystem.config.js", "--env", "production"]
          ports:
            - containerPort: 3000
          resources:
            limits:
              cpu: "500m"
              memory: "256Mi"

In this setup, PM2 Runtime (pm2-runtime) replaces the default node entrypoint, preserving all PM2 features (cluster mode, automatic restarts) inside each pod.

Monitoring Cluster Health

PM2 provides a built‑in monitoring tool (pm2 monit) and a REST API (pm2-web). For production dashboards you can pipe metrics to Prometheus using the pm2-exporter module:

npm install pm2-exporter --save
pm2 start pm2-exporter --name pm2-exporter

Then scrape http://localhost:9209/metrics from Prometheus and visualize latency, request count, and worker restarts.
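On the Prometheus side, a minimal scrape job for that endpoint might look like this (job name and scrape interval are illustrative):

```yaml
scrape_configs:
  - job_name: 'pm2'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9209']
```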

FAQs

1. Is cluster mode the same as running multiple Docker containers?

No. Cluster mode spawns multiple processes on a single OS instance; each worker is a separate process with its own memory, sharing only the listening socket. Docker containers add OS-level isolation (filesystem, network, and resource namespaces) on top of that. For vertical scaling on one host, cluster mode is simpler; for horizontal scaling across hosts, containerization is preferred.

2. What happens to existing connections during a pm2 reload?

PM2 first spawns a replacement worker and waits for it to come online; it then sends a SIGINT to the old worker, which stops accepting new connections but continues processing open sockets until they finish or a timeout occurs. This ordering ensures there is never a moment with zero workers listening on the port.

3. How can I limit the memory usage of each worker?

Use the max_memory_restart option in the ecosystem file (e.g., max_memory_restart: '250M'). When a worker exceeds this threshold, PM2 automatically restarts it. Pair this with a proper --max-restarts strategy to avoid rapid crash loops.

4. Can I use PM2 with TypeScript projects?

Absolutely. Compile TypeScript to JavaScript before launching, or use ts-node with PM2:

pm2 start ts-node -- -r tsconfig-paths/register src/index.ts

Make sure to set exec_mode: 'cluster' and configure the appropriate instances.

5. Is it safe to store environment variables in the ecosystem file?

For development they are fine, but in production you should inject secrets via a secure vault (e.g., AWS Secrets Manager, HashiCorp Vault) or use OS‑level environment variables. PM2 can read them at start‑up without persisting them in the repository.

Conclusion

PM2’s cluster mode transforms a single‑core Node.js process into a robust, multi‑core service capable of handling production traffic with zero‑downtime deployments and automatic self‑healing. By following the steps outlined here (installing PM2, defining an ecosystem file, configuring instances, and integrating monitoring), you can confidently run Node.js applications at scale.

Key takeaways:

  • Match the instances count to your CPU cores or allocate a fixed number based on performance testing.
  • Use pm2 reload for graceful deployments; never rely on pm2 restart in a live environment.
  • Persist the process list with pm2 save and generate a startup script to ensure continuity after reboots.
  • Extend observability with PM2’s built‑in tools or export metrics to external systems for centralized dashboards.
  • When traffic outgrows a single host, combine PM2 clusters with container orchestration or cloud load balancers for true horizontal scaling.

By integrating PM2 cluster mode into your DevOps pipeline, you gain a production‑ready process manager that simplifies scaling, improves reliability, and reduces operational overhead: all essential qualities for modern, high‑performance Node.js services.