:::note[TL;DR]
- Add `restart: unless-stopped` to every service — without it, crashed containers stay dead
- Health checks + `depends_on` with `condition: service_healthy` prevent startup race conditions between services
- Never put secrets in `.env` files on the server — use Docker secrets, CI/CD injection, or a secrets manager
- Set `max-size` and `max-file` on log drivers — default JSON logging fills disk silently
- Use resource limits (`cpus`/`memory`) to prevent one container from taking down the entire host
:::
Docker Compose works great in development. In production, the same file will get you in trouble if you don’t change a few things. The defaults are built for convenience, not resilience.
This guide covers what to update before you point your domain at a Compose-based deployment.
## What’s different in production

In development:

- Containers restart manually
- Secrets are in `.env` files
- Logs go wherever
- Containers share all resources freely
- Health checks don’t matter
In production:
- Containers must restart automatically
- Secrets must not be in files on disk
- Logs need to go to a log driver or external system
- Resource limits prevent one container from killing the host
- Health checks gate your load balancer and deployment logic
## Use a separate production compose file

Don’t use one file for everything. Keep `docker-compose.yml` for shared config and `docker-compose.prod.yml` for production overrides:

```bash
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```
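As a sketch of how that split might look (the service name and image are illustrative, not from a real project): the base file holds what every environment shares, and the production file layers extra keys on top.

```yaml
# docker-compose.yml — shared config
services:
  app:
    build: .
    ports:
      - "3000:3000"

# docker-compose.prod.yml — production overrides (merged on top)
services:
  app:
    image: my-app:latest     # run a prebuilt image instead of building on the server
    restart: unless-stopped
```

Compose merges the files left to right, so keys in the later (prod) file win for the same service.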
## Restart policies

Without this, containers that crash stay dead:

```yaml
services:
  app:
    restart: unless-stopped  # restart on crash, not on manual stop
```

Options:

- `no` — never restart (dev default)
- `always` — always restart, even on manual stop
- `on-failure` — only restart on non-zero exit code
- `unless-stopped` — restart always except when manually stopped (production default)
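If you want crash recovery but a bounded number of attempts, recent versions of the Compose Spec also accept a retry cap on `on-failure` (check support in your Compose version). A minimal sketch, with a hypothetical `worker` service:

```yaml
services:
  worker:
    restart: on-failure:5  # stop retrying after 5 failed restarts
```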
## Health checks

Health checks let Docker know when a container is actually ready, not just running:

```yaml
services:
  app:
    image: my-app:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s  # grace period on startup

  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
```
Use `depends_on` with a condition to wait for a healthy dependency:

```yaml
services:
  app:
    depends_on:
      db:
        condition: service_healthy
```

The scenario: you deploy your app and it starts in 3 seconds, but PostgreSQL takes 8 seconds to be ready for connections. Without health checks, your app crashes on startup trying to connect to a database that isn’t ready yet. With `service_healthy`, the app waits.
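To see what the health check is currently reporting for a running container, you can query Docker directly (the container name below is illustrative; yours depends on your project name):

```shell
# Prints "starting", "healthy", or "unhealthy"
docker inspect --format '{{.State.Health.Status}}' myproject-app-1

# Full health history, including recent check output — useful for debugging
docker inspect --format '{{json .State.Health}}' myproject-app-1
```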
## Resource limits

Without limits, one misbehaving container can take down the host:

```yaml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M
```
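Once limits are in place, you can confirm they are being enforced with a one-shot `docker stats` snapshot:

```shell
# Snapshot of CPU/memory usage per container, including usage vs. configured limit
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"
```

Note that the `deploy.resources` section is honored by Compose v2 (and Swarm); the legacy docker-compose v1 ignored `deploy` unless run with `--compatibility`.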
## Secrets management

:::warning
Never commit `.env` files containing production secrets to your repo — even private repos. If the repo is ever made public, or a team member’s account is compromised, those secrets are exposed. Use `.gitignore` to exclude `.env` and inject secrets at deploy time via CI/CD or a secrets manager.
:::

Never put production secrets in `.env` files committed to the repo or sitting on the server in plaintext.
### Option 1: Docker secrets (Swarm mode)

```yaml
secrets:
  db_password:
    external: true

services:
  db:
    secrets:
      - db_password
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
```
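For the `external` secret above to exist, it has to be created on a Swarm manager node first — for example (the secret value here is obviously illustrative):

```shell
# Create the secret from stdin on a Swarm manager
printf 'supersecret' | docker secret create db_password -

# List existing secrets (values are never displayed)
docker secret ls
```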
### Option 2: Environment variables from a secure source
Use your CI/CD system (GitHub Actions, GitLab CI) to inject secrets at deploy time, or a secrets manager (Vault, AWS Secrets Manager) that your deploy script calls before starting Compose.
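As one hedged sketch of deploy-time injection, assuming AWS Secrets Manager with a secret named `prod/db_password` (both the secret ID and variable name are illustrative):

```shell
# Fetch the secret at deploy time and export it for Compose to interpolate
DB_PASSWORD="$(aws secretsmanager get-secret-value \
  --secret-id prod/db_password \
  --query SecretString --output text)"
export DB_PASSWORD

# The secret lives only in this process's environment, never on disk
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```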
### Option 3: `.env` file with restricted permissions

If you must use a `.env` file on the server, restrict access:

```bash
chmod 600 /app/.env
chown appuser:appuser /app/.env
```

Never commit it. Add it to `.gitignore`.
## Logging

:::tip
Always configure `max-size` and `max-file` on your log driver. The default JSON file driver writes unlimited logs to disk — a verbose service can fill a server disk in hours. Set `max-size: "10m"` and `max-file: "3"` as a minimum for every service in production.
:::

Default JSON file logs fill up your disk. Configure a log driver:

```yaml
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
```
For centralized logging, use `loki`, `fluentd`, or `awslogs`:

```yaml
logging:
  driver: awslogs
  options:
    awslogs-region: ap-south-1
    awslogs-group: /app/production
    awslogs-stream: app
```
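If you run Grafana Loki instead, the shape is similar — assuming the Loki Docker logging driver plugin is installed on the host (the URL below is illustrative):

```yaml
logging:
  driver: loki
  options:
    loki-url: "http://loki.internal:3100/loki/api/v1/push"
```

Unlike `json-file` and `awslogs`, the Loki driver is a plugin you install separately (via `docker plugin install`), not built into the Docker engine.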
## Networking

By default, all services in a Compose file share one network. In production, segment your services:

```yaml
networks:
  frontend:
  backend:

services:
  nginx:
    networks:
      - frontend
  app:
    networks:
      - frontend
      - backend
  db:
    networks:
      - backend  # not reachable from nginx directly
```
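You can go a step further and mark the backend network `internal`, which blocks outbound internet access for containers attached only to it — a sensible default for a database that should never talk to the outside world:

```yaml
networks:
  frontend:
  backend:
    internal: true  # no route to the outside world from this network
```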
## Production compose example

```yaml
# docker-compose.prod.yml
services:
  app:
    image: my-app:${IMAGE_TAG:-latest}
    restart: unless-stopped
    environment:
      NODE_ENV: production
      DATABASE_URL: ${DATABASE_URL}
    ports:
      - "3000:3000"
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
    logging:
      driver: json-file
      options:
        max-size: "20m"
        max-file: "5"
    networks:
      - frontend
      - backend

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          memory: 2G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - backend

volumes:
  pgdata:

networks:
  frontend:
  backend:
```
## Zero-downtime deploys

Compose doesn’t do rolling updates natively. For zero-downtime:

1. Pull the new image: `docker compose pull app`
2. Recreate with minimal downtime: `docker compose up -d --no-deps app`
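Those two steps can be wrapped in a small deploy script that also waits for the new container to come up — a sketch assuming the app exposes `/health` on port 3000 (as in the example above) and the service is named `app`:

```shell
#!/usr/bin/env bash
set -euo pipefail

# 1. Pull the new image
docker compose pull app

# 2. Recreate only the app container, leaving dependencies running
docker compose up -d --no-deps app

# 3. Wait up to ~60s for the health endpoint to respond
for _ in $(seq 1 30); do
  if curl -fsS http://localhost:3000/health > /dev/null; then
    echo "app is healthy"
    exit 0
  fi
  sleep 2
done

echo "app failed to become healthy" >&2
exit 1
```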
Or use a proxy like Nginx/Traefik to route traffic while you swap containers.
For anything requiring true zero-downtime at scale, that’s when Docker Swarm or Kubernetes starts making sense.
Related: Docker Cheat Sheet | Write a Node.js Dockerfile
## Summary

- Change `restart: no` (dev default) to `restart: unless-stopped` for every service in production
- Health checks let Compose and load balancers know when a container is actually ready — use `depends_on` with `condition: service_healthy`
- Never put production secrets in `.env` files on the server — use Docker secrets, CI/CD injection, or a secrets manager
- Add `max-size` and `max-file` to your logging config to prevent filling your disk with JSON logs
- Use resource limits (`cpus`, `memory`) to prevent one misbehaving container from taking down the host
## Frequently Asked Questions

### Should I use Docker Compose or Kubernetes in production?
Compose is the right choice for small-to-medium deployments on a single server or a few servers. Kubernetes is for orchestrating containers across many nodes at scale. If you’re running 2–10 services on a VPS, Compose is simpler and maintainable. If you need auto-scaling, multi-zone redundancy, and rolling deploys across a cluster, that’s Kubernetes territory.
### How do I update a service with zero downtime using Compose?

True zero-downtime with plain Compose requires a reverse proxy (Nginx, Traefik, Caddy) handling traffic. Pull the new image, start a new container on a different port, update the proxy config to point to it, then stop the old container. Traefik automates this with labels. For simpler setups, a brief (~1–2 second) restart with `docker compose up -d --no-deps app` is usually acceptable.
### What’s the difference between `depends_on` and `depends_on` with `condition: service_healthy`?

Plain `depends_on` only waits for the container to start — not for the service inside to be ready. `condition: service_healthy` waits until the container’s health check passes. Always use the health condition for databases, caches, and any service that takes a few seconds to initialize.
## What to Read Next
- Kubernetes Cheat Sheet — the next step when Compose isn’t enough
- GitHub Actions CI/CD Setup — automate your Compose deployments with a CI pipeline