Graceful shutdown
When you deploy a new version or scale down, the platform stops
the old instances. It doesn't kill them outright: first it sends them the SIGTERM
signal to ask them to terminate gracefully. A well-built process listens for that
signal and performs a graceful shutdown:
- Stops accepting new requests (closes the HTTP server).
- Finishes the requests already in progress.
- Closes open resources: database connections, queues, files.
- Exits the process (
process.exit(0)).
process.on("SIGTERM", async () => {
console.log("SIGTERM received, shutting down...");
server.close(); // 1 and 2: stop accepting new, drain the active ones
await db.disconnect(); // 3: close the DB
process.exit(0); // 4
});
Without this, a deployment would cut off requests halfway and leave connections hanging. A timeout is usually added: if it hasn't finished in, say, 10 seconds, the exit is forced so as not to get stuck.
Process managers (PM2)
In production you don't run node app.js by hand. A process manager like
PM2 takes care of:
- Restarting the process if it crashes (resilience).
- Running several instances in cluster mode to use all cores.
- Performing zero-downtime reloads (reload) on deployments.
- Centralizing the logs and exposing basic metrics.
Containers (Docker)
A Docker container packages your app with its environment (Node version, dependencies, base system) into an immutable image. The same image runs identically on your laptop and in production ("it works on my machine" is no longer an excuse). The config and secrets are injected from outside as environment variables (consistent with 12-factor), never inside the image.
CI/CD
A CI/CD pipeline (GitHub Actions, GitLab CI...) automates the path from
commit to production: on each push it runs the tests and the lint (Continuous
Integration) and, if they pass, builds the image and deploys it (Continuous
Delivery/Deployment). The goal is for deploying to be a boring, safe event,
not a risky ritual.
Horizontal scaling
To handle more load you have two paths: vertical (a bigger machine, with a physical limit) and horizontal (more instances behind a load balancer). Horizontal scaling is the basis of high availability, but it requires the server to be stateless: it must not keep state in memory (sessions, cache) that only lives in one instance; that state goes to a shared store (Redis, the DB). That way any instance can serve any request, and health checks + graceful shutdown allow adding and removing instances without the user noticing.
Examples
Skeleton of a graceful shutdown on SIGTERM
// Simulation: a server with a close method and a DB with disconnect.
const server = { close() { console.log("Server closed: not accepting new requests"); } };
const db = { async disconnect() { console.log("DB disconnected"); } };
async function shutdown(signal) {
console.log(signal + " received, starting graceful shutdown...");
server.close();
await db.disconnect();
console.log("Shutdown complete");
}
await shutdown("SIGTERM");