
Docker Compose in Production: Best Practices & Tips

By Vishnu
TL;DR
  • Add restart: unless-stopped to every service — without it, crashed containers stay dead
  • Health checks + depends_on: condition: service_healthy prevent startup race conditions between services
  • Keep production secrets out of the repo and out of plaintext .env files on the server where you can: use Docker secrets, CI/CD injection, or a secrets manager
  • Set max-size and max-file on log drivers — default JSON logging fills disk silently
  • Use resource limits (cpu/memory) to prevent one container from taking down the entire host

Docker Compose works great in development. In production, the same file will get you in trouble if you don’t change a few things. The defaults are built for convenience, not resilience.

This guide covers what to update before you point your domain at a Compose-based deployment.

What’s different in production

In development:

  • You restart containers manually
  • Secrets are in .env files
  • Logs go wherever
  • Containers share all resources freely
  • Health checks don’t matter

In production:

  • Containers must restart automatically
  • Secrets must not be in files on disk
  • Logs need to go to a log driver or external system
  • Resource limits prevent one container from killing the host
  • Health checks gate your load balancer and deployment logic

Use a separate production compose file

Don’t use one file for everything. Keep docker-compose.yml for shared config and docker-compose.prod.yml for production overrides:

bash
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
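Typing both -f flags on every command is easy to get wrong. A minimal sketch of a wrapper, assuming the two file names above; the function names compose_prod and deploy_prod are made up for illustration:

```bash
# Run any Compose subcommand against the same base + production file stack.
compose_prod() {
  docker compose -f docker-compose.yml -f docker-compose.prod.yml "$@"
}

# Validate the merged configuration first, and start the stack only if it parses.
deploy_prod() {
  compose_prod config --quiet && compose_prod up -d
}

# Usage:
#   deploy_prod
#   compose_prod ps
```

Running config --quiet first catches a typo in the override file before any container is touched.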

Restart policies

Without a restart policy, containers that crash stay dead:

yaml
services:
  app:
    restart: unless-stopped    # restart on crash, not on manual stop

Options:

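  • no: never restart (the default)
  • always: restart whenever the container exits; a manually stopped container stays stopped until the Docker daemon restarts, then comes back
  • on-failure: restart only on a non-zero exit code
  • unless-stopped: like always, but a manually stopped container stays stopped even after the daemon restarts (recommended for production)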
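To confirm the policy actually reached the running containers, you can read it back from Docker. This is a rough sketch rather than a polished tool; check_restart_policies is a hypothetical helper name, and it only looks at running containers of the Compose project in the current directory:

```bash
# Report every running container in the current Compose project whose
# restart policy differs from the expected one (default: unless-stopped).
check_restart_policies() {
  expected="${1:-unless-stopped}"
  status=0
  for id in $(docker compose ps -q); do
    policy=$(docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$id")
    if [ "$policy" != "$expected" ]; then
      echo "container $id has restart policy '$policy'"
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_restart_policies || echo "fix the services listed above"
```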

Health checks

Health checks let Docker know when a container is actually ready, not just running:

yaml
services:
  app:
    image: my-app:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]    # curl must exist in the image
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s    # grace period on startup

  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

Use depends_on with condition to wait for a healthy dependency:

yaml
services:
  app:
    depends_on:
      db:
        condition: service_healthy

The scenario: You deploy your app and it starts in 3 seconds, but PostgreSQL takes 8 seconds to be ready for connections. Without health checks, your app crashes on startup trying to connect to a database that isn’t ready yet. With service_healthy, the app waits.
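The same wait is useful outside Compose, for example in a deploy script that should not continue until the database is ready. A minimal sketch, assuming the container defines a health check; wait_healthy and its default values are illustrative, not part of Docker:

```bash
# Poll a container's health status until it reports "healthy" or we give up.
wait_healthy() {
  container="$1"
  attempts="${2:-30}"   # number of polls before giving up
  delay="${3:-2}"       # seconds between polls
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$container did not become healthy (last status: ${status:-unknown})" >&2
  return 1
}

# Usage (the container name is a placeholder):
#   wait_healthy myproject-db-1 30 2 && echo "database is ready"
```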

Resource limits

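Without limits, one misbehaving container can take down the host. Compose v2 (docker compose) applies these limits on a single host; the legacy docker-compose v1 binary only honored the deploy section when run with --compatibility: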

yaml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '0.5'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 256M
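It is worth checking that the limits were applied, because a limit that is silently ignored looks the same as no limit. The sketch below assumes Docker records the memory limit in bytes and the CPU limit as NanoCpus (CPUs multiplied by 10^9), which is how docker run --cpus stores it; show_limits is a hypothetical helper name:

```bash
# Print a container's effective limits in the units used in the Compose file.
# A value of 0 for either field means no limit is in effect.
show_limits() {
  mem_bytes=$(docker inspect --format '{{.HostConfig.Memory}}' "$1")
  nano_cpus=$(docker inspect --format '{{.HostConfig.NanoCpus}}' "$1")
  # bytes -> MiB, and nano-CPUs -> percent of a single core
  echo "memory: $((mem_bytes / 1024 / 1024)) MiB, cpu: $((nano_cpus / 10000000))% of one core"
}

# Usage (the container name is a placeholder):
#   show_limits myproject-app-1
```

With the limits from the example above (cpus: '0.5', memory: 512M), the expected values are 512 MiB and 50% of one core.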

Secrets management

Warning

Never commit .env files containing production secrets to your repo — even private repos. If the repo is ever made public, or a team member’s account is compromised, those secrets are exposed. Use .gitignore to exclude .env and inject secrets at deploy time via CI/CD or a secrets manager.

Never put production secrets in .env files committed to the repo or sitting on the server in plaintext.

Option 1: Docker secrets (Swarm mode)

yaml
secrets:
  db_password:
    external: true

services:
  db:
    secrets:
      - db_password
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
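An external secret has to exist before the stack starts. A minimal sketch of creating it, assuming the host is already a Swarm manager (docker swarm init) and the password arrives in a DB_PASSWORD environment variable; create_db_secret is a made-up function name:

```bash
# Create the db_password secret from an environment variable.
# The value is piped over stdin, so it never lands in a file or in the
# process arguments.
create_db_secret() {
  if [ -z "${DB_PASSWORD:-}" ]; then
    echo "DB_PASSWORD is not set" >&2
    return 1
  fi
  printf '%s' "$DB_PASSWORD" | docker secret create db_password -
}

# Usage:
#   create_db_secret && unset DB_PASSWORD
```

Using printf '%s' instead of echo avoids storing a trailing newline as part of the password.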

Option 2: Environment variables from a secure source

Use your CI/CD system (GitHub Actions, GitLab CI) to inject secrets at deploy time, or a secrets manager (Vault, AWS Secrets Manager) that your deploy script calls before starting Compose.
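When secrets arrive as environment variables, a missing one usually shows up late, as an empty value inside a container. One way to fail earlier is a small guard in the deploy script. This is a sketch under the assumption that your pipeline exports the variables before the script runs; require_env is a hypothetical helper and the variable names are taken from the production example in this article:

```bash
# Return non-zero and name each variable that is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   require_env DATABASE_URL DB_USER DB_PASSWORD DB_NAME \
#     && docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```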

Option 3: .env file with restricted permissions

If you must use a .env file on the server, restrict access:

bash
chmod 600 /app/.env
chown appuser:appuser /app/.env

Never commit it. Add it to .gitignore.

Logging

Pro Tip

Always configure max-size and max-file on your log driver. The default JSON file driver writes unlimited logs to disk — a verbose service can fill a server disk in hours. Set max-size: "10m" and max-file: "3" as a minimum for every service in production.

By default, json-file logs grow without bound and will eventually fill your disk. Configure rotation on the driver:

yaml
services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
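These two options put a rough ceiling on disk usage: each container keeps at most max-file files of about max-size each. A quick way to budget for a whole stack, as a sketch (log_cap_mb is a made-up helper, and rotation is approximate, so treat the result as an estimate):

```bash
# Approximate upper bound, in MB, on json-file log usage for one container.
log_cap_mb() {
  max_size_mb="$1"
  max_file="$2"
  echo $((max_size_mb * max_file))
}

log_cap_mb 10 3   # → 30
```

For a stack with one service at 20m x 5 files and another at 10m x 3 files, the estimate is 100 + 30 = 130 MB.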

For centralized logging, use a driver such as fluentd or awslogs, or loki through the Grafana Loki Docker plugin:

yaml
services:
  app:
    logging:
      driver: awslogs
      options:
        awslogs-region: ap-south-1
        awslogs-group: /app/production
        awslogs-stream: app

Networking

By default, all services in a Compose file share one network. In production, segment your services:

yaml
networks:
  frontend:
  backend:

services:
  nginx:
    networks:
      - frontend

  app:
    networks:
      - frontend
      - backend

  db:
    networks:
      - backend    # not reachable from nginx directly
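You can check the segmentation from the host by listing which containers are attached to each network. A minimal sketch, assuming Compose's default naming where the network is prefixed with the project name (for example myproject_frontend); both function names are hypothetical and the match is a plain substring test:

```bash
# List the names of the containers attached to a network.
network_members() {
  docker network inspect --format '{{range .Containers}}{{.Name}} {{end}}' "$1"
}

# Fail if any container whose name contains the service name is attached.
assert_not_attached() {
  network="$1"
  service="$2"
  case " $(network_members "$network") " in
    *"$service"*)
      echo "$service is attached to $network" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}

# Usage (the project name is a placeholder):
#   assert_not_attached myproject_frontend db
```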

Production compose example

yaml
# docker-compose.prod.yml
services:
  app:
    image: my-app:${IMAGE_TAG:-latest}
    restart: unless-stopped
    environment:
      NODE_ENV: production
      DATABASE_URL: ${DATABASE_URL}
    ports:
      - "3000:3000"
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
    logging:
      driver: json-file
      options:
        max-size: "20m"
        max-file: "5"
    networks:
      - frontend
      - backend

  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: ${DB_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER}"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          memory: 2G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - backend

volumes:
  pgdata:

networks:
  frontend:
  backend:

Zero-downtime deploys

Compose doesn’t do rolling updates natively. To keep downtime to a minimum:

  1. Pull the new image: docker compose pull app
  2. Recreate with minimal downtime: docker compose up -d --no-deps app
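The two steps above can be wrapped in one function so a failed pull never replaces a working container. This is a sketch, not a full deploy tool: redeploy_service is a made-up name, and it assumes a Compose v2 release that supports up --wait and a service that defines a health check:

```bash
# Pull the new image and recreate a single service.
redeploy_service() {
  service="$1"
  # Stop here if the pull fails, leaving the old container running.
  docker compose pull "$service" || return 1
  # --no-deps leaves dependencies such as the database untouched;
  # --wait blocks until the new container is running and healthy.
  docker compose up -d --no-deps --wait "$service"
}

# Usage:
#   redeploy_service app || echo "deploy failed, check: docker compose logs app"
```

There is still a short gap while the old container stops and the new one starts, so this reduces downtime but does not remove it.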

Or use a proxy like Nginx/Traefik to route traffic while you swap containers.

For anything requiring true zero-downtime at scale, that’s when Docker Swarm or Kubernetes starts making sense.



Summary

  • Change restart: no (dev default) to restart: unless-stopped for every service in production
  • Health checks let Compose and load balancers know when a container is actually ready — use depends_on with condition: service_healthy
  • Keep production secrets out of the repo and out of plaintext .env files on the server where you can: use Docker secrets, CI/CD injection, or a secrets manager
  • Add max-size and max-file to your logging config to prevent filling your disk with JSON logs
  • Use resource limits (cpus, memory) to prevent one misbehaving container from taking down the host

Frequently Asked Questions

Should I use Docker Compose or Kubernetes in production?

Compose is the right choice for small-to-medium deployments on a single server or a few servers. Kubernetes is for orchestrating containers across many nodes at scale. If you’re running 2–10 services on a VPS, Compose is simpler and easier to maintain. If you need auto-scaling, multi-zone redundancy, and rolling deploys across a cluster, that’s Kubernetes territory.

How do I update a service with zero downtime using Compose?

True zero-downtime with plain Compose requires a reverse proxy (Nginx, Traefik, Caddy) handling traffic. Pull the new image, start a new container on a different port, update the proxy config to point to it, then stop the old container. Traefik automates this with labels. For simpler setups, a brief (~1-2 second) restart with docker compose up -d --no-deps app is usually acceptable.

What’s the difference between depends_on and depends_on with condition: service_healthy?

Plain depends_on only waits for the container to start — not for the service inside to be ready. condition: service_healthy waits until the container’s health check passes. Always use the health condition for databases, caches, and any service that takes a few seconds to initialize.


Written By

Vishnu

Founder & Principal Architect at MeshWorld. Senior engineer and instructor specializing in AI agent systems, scalable web architecture, and modern development workflows.
