
Fix: Docker Compose depends_on Not Waiting for Service to Be Ready

FixDevs

Quick Answer

How to fix Docker Compose depends_on not working — services start in order but the app still crashes because depends_on only waits for container start, not service readiness. Includes healthcheck solutions.

The Error

You set depends_on in your docker-compose.yml to ensure services start in order, but your application still crashes on startup:

app_1   | Error: connect ECONNREFUSED 127.0.0.1:5432
app_1   | Connection refused — PostgreSQL not ready
db_1    | LOG:  database system is ready to accept connections

Or:

web_1   | redis.exceptions.ConnectionError: Error 111 connecting to redis:6379. Connection refused.

The app container starts before the database or Redis is ready to accept connections, even though depends_on is configured. The dependent service starts, but crashes before the dependency finishes initializing.

Why This Happens

depends_on in Docker Compose only controls container start order — it does not wait for the service inside the container to be ready. It starts containers in dependency order, but immediately moves to the next service as soon as the container process starts (not when the service is accepting connections).

From the Docker Compose documentation:

depends_on does not wait for db and redis to be “ready” before starting web — only until they have been started.

This is a common misconception. The database container may start in seconds, but PostgreSQL, MySQL, or Redis may take several more seconds to initialize, run migrations, or set up data directories before accepting connections.
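A minimal compose file that reproduces the race (the `myapp:latest` image is illustrative):

```yaml
# Naive setup: app starts as soon as the db *container* starts,
# not when PostgreSQL inside it is ready to accept connections
services:
  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: password

  app:
    image: myapp:latest   # hypothetical app image
    depends_on:
      - db   # short form: only orders container start
```

On `docker compose up`, the app's first connection attempt typically lands inside PostgreSQL's initialization window and fails with the errors shown above.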

Fix 1: Use healthcheck with depends_on condition (Compose v2.1+)

Docker Compose v2.1+ supports condition in depends_on combined with healthcheck. This is the correct, built-in solution:

version: "3.8"

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: mydb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
      interval: 5s
      timeout: 5s
      retries: 10
      start_period: 10s

  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy  # Wait until db passes healthcheck
    environment:
      DATABASE_URL: postgres://user:password@db:5432/mydb

With condition: service_healthy, Compose waits until the db service’s healthcheck reports healthy before starting app.

Healthcheck commands for common services:

# PostgreSQL
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
  interval: 5s
  timeout: 5s
  retries: 10

# MySQL / MariaDB
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-p${MYSQL_ROOT_PASSWORD}"]
  interval: 5s
  timeout: 5s
  retries: 10

# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 5s
  timeout: 3s
  retries: 5

# MongoDB
healthcheck:
  test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
  interval: 10s
  timeout: 5s
  retries: 5

# Generic HTTP service
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 15s

Pro Tip: Use start_period when a service takes a long time to initialize. During the start period, failed healthchecks do not count toward the retry limit — preventing false failures during initial startup. Set it slightly longer than the typical startup time of the service.

Fix 2: Add Retry Logic to Your Application

Even with healthchecks, network conditions or race conditions can cause connection failures. Build retry logic directly into your application:

Node.js — retry with exponential backoff:

const { Pool } = require("pg");

async function connectWithRetry(maxRetries = 10, delayMs = 2000) {
  const pool = new Pool({ connectionString: process.env.DATABASE_URL });

  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const client = await pool.connect();
      console.log("Database connected successfully");
      client.release();
      return pool;
    } catch (err) {
      console.error(`Attempt ${attempt}/${maxRetries} failed:`, err.message);
      if (attempt === maxRetries) throw err;
      await new Promise(resolve => setTimeout(resolve, delayMs * attempt));
    }
  }
}

// Note: this exports a Promise<Pool> — callers must await it before querying
module.exports = connectWithRetry();

Python — retry with tenacity:

from tenacity import retry, stop_after_attempt, wait_fixed
import psycopg2
import os

@retry(stop=stop_after_attempt(10), wait=wait_fixed(2))
def connect_to_db():
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    print("Database connected")
    return conn

db = connect_to_db()

Application-level retry is a good practice regardless of depends_on — in production, databases restart, network blips happen, and connections drop. An app that retries gracefully is more resilient than one that crashes on first failure.

Fix 3: Use a Wait Script (Legacy Approach)

Before condition: service_healthy was available, the common pattern was a wait-for-it.sh or dockerize script that polls until a port is open:

Using wait-for-it.sh:

# In your app's Dockerfile
COPY wait-for-it.sh /wait-for-it.sh
RUN chmod +x /wait-for-it.sh

# In docker-compose.yml
services:
  app:
    image: myapp:latest
    command: ["/wait-for-it.sh", "db:5432", "--", "node", "server.js"]
    depends_on:
      - db

Download wait-for-it.sh from https://github.com/vishnubob/wait-for-it.

Using dockerize:

# In your app's Dockerfile
ENV DOCKERIZE_VERSION v0.7.0
RUN wget https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && tar -C /usr/local/bin -xzvf dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz

# In docker-compose.yml
command: dockerize -wait tcp://db:5432 -timeout 60s node server.js
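Both tools boil down to polling a TCP port. If you would rather not vendor a script, a minimal hand-rolled equivalent is a sketch like the following, assuming bash (with its `/dev/tcp` feature) is available in the image:

```shell
#!/usr/bin/env bash
# wait_for HOST PORT [TIMEOUT_SECONDS] — poll until the TCP port accepts connections
wait_for() {
  local host=$1 port=$2 timeout=${3:-30}
  for ((i = 0; i < timeout; i++)); do
    # bash treats /dev/tcp/HOST/PORT as a real TCP connection attempt
    if (exec 3<> "/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  echo "timed out waiting for ${host}:${port}" >&2
  return 1
}

# Example entrypoint usage:
# wait_for db 5432 60 && exec node server.js
```

Like wait-for-it, this only checks that the port is open — a database can accept TCP connections slightly before it is fully ready, which is why the healthcheck approach in Fix 1 is still preferable.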

Note: The healthcheck + condition: service_healthy approach (Fix 1) is preferred over wait scripts — it is cleaner, does not require modifying the Dockerfile, and is officially supported.

Fix 4: Use service_started and service_completed_successfully Conditions

Compose v2.1+ supports three conditions for depends_on:

depends_on:
  db:
    condition: service_started    # Default — just waits for container to start
  migrations:
    condition: service_completed_successfully  # Waits for a one-shot container to exit 0
  cache:
    condition: service_healthy    # Waits for healthcheck to pass

service_completed_successfully is useful for migration containers that run once and exit:

services:
  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 5s
      retries: 10

  migrate:
    image: myapp:latest
    command: ["npm", "run", "db:migrate"]
    depends_on:
      db:
        condition: service_healthy
    restart: "no"  # Don't restart after migration completes

  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy
      migrate:
        condition: service_completed_successfully  # Wait for migrations to finish

This ensures: db starts → db is healthy → migrations run → app starts.

Fix 5: Don't Let a restart Policy Mask the Real Issue

If your app has restart: always or restart: on-failure, Docker restarts it repeatedly when it fails to connect. Eventually the database is ready and the app connects — but the root cause (no readiness check) is hidden:

services:
  app:
    image: myapp:latest
    restart: on-failure   # Hides the depends_on problem
    depends_on:
      - db

This works in practice but is fragile. A restart loop wastes resources and generates misleading error logs. Use condition: service_healthy instead and keep restart: on-failure as a safety net, not the primary solution.

Common Mistake: Setting restart: always and calling it fixed. The application crashes and restarts 5–10 times before the database is ready. Each restart generates confusing error logs. Monitoring systems may alert on the crashes. Use healthchecks for a clean startup.

Fix 6: Debug depends_on Issues

Check if healthchecks are passing:

# Watch service health status
docker compose ps

# Or watch in real time
watch docker compose ps

# Check a specific service's health
docker inspect --format='{{json .State.Health}}' container_name | jq

Check healthcheck logs:

docker inspect container_name | jq '.[0].State.Health.Log'

This shows the last few healthcheck command outputs — useful to see why a healthcheck is failing.
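If you need a shell script to block until a container reports healthy (the container name here is illustrative), you can poll the same field in a loop:

```shell
# Poll .State.Health.Status until it reports "healthy", for up to ~60s
for i in $(seq 1 60); do
  status=$(docker inspect --format='{{.State.Health.Status}}' container_name 2>/dev/null)
  [ "$status" = "healthy" ] && break
  sleep 1
done
echo "final status: ${status:-unknown}"
```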

Force a slow startup to reproduce the issue:

db:
  image: postgres:15
  command: ["sh", "-c", "sleep 10 && docker-entrypoint.sh postgres"]

Adding a sleep artificially delays the database, making the race condition obvious for debugging.

Fix 7: Multi-Service Dependency Chains

For complex dependency graphs:

services:
  postgres:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 5s
      retries: 10

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      retries: 5

  api:
    image: myapi:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  worker:
    image: myworker:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  nginx:
    image: nginx:alpine
    depends_on:
      api:
        condition: service_started  # Just needs api container running, not healthy
    ports:
      - "80:80"

Note: condition: service_healthy requires the dependent service to have a healthcheck defined. If you add condition: service_healthy but forget the healthcheck, Compose raises an error:

service "db" is not healthy because it has no healthcheck defined

Still Not Working?

Check Compose file version. The condition field in depends_on was introduced in Compose file format 2.1 and dropped from the 3.x format; it is supported again under the Compose Specification, which the Compose v2 CLI (docker compose, not the legacy docker-compose) implements regardless of the version field:

# This works with docker compose (v2 CLI)
version: "3.8"
services:
  app:
    depends_on:
      db:
        condition: service_healthy

Run docker compose version to confirm you have Compose v2.
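Compose v2 can also do the waiting at the CLI level: recent releases support a --wait flag on up, which is handy in CI scripts:

```shell
# Start the stack detached and block until healthchecked services are healthy;
# exits non-zero if any service fails to become healthy
docker compose up --wait

# Dependencies are ready at this point, so follow-up commands are safe
docker compose exec app npm test
```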

Check that the healthcheck command exits with 0 on success. A healthcheck command that always exits with a non-zero code keeps the service in an unhealthy state permanently. Test the healthcheck command inside the running container:

docker exec container_name pg_isready -U user
echo $?  # Should print 0 for healthy

Check for network issues between containers. Even when services are healthy, DNS resolution between containers requires them to be on the same Docker network. Make sure all services share a network:

services:
  db:
    networks:
      - app-network
  app:
    networks:
      - app-network

networks:
  app-network:
    driver: bridge
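To confirm the containers can actually resolve each other, inspect the network and look up the dependency by its service name from inside the app container (service and network names follow the example above; this assumes getent is available in the image — fall back to ping db if not):

```shell
# List which containers are attached to the network
docker network inspect app-network --format '{{range .Containers}}{{.Name}} {{end}}'

# From inside the app container, resolve the db service name via Docker's DNS
docker compose exec app getent hosts db
```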

For other Docker startup errors, see Fix: Docker container exited with code 137 (OOMKilled) and Fix: Docker no space left on device.


