Fix: Docker Compose depends_on Not Waiting for Service to Be Ready
Quick Answer
depends_on controls start order only: Compose considers a dependency satisfied when its container starts, not when the service inside it is ready. Your app can therefore still crash connecting to a database that isn't accepting connections yet. The fix is a healthcheck combined with condition: service_healthy.
The Error
You set depends_on in your docker-compose.yml to ensure services start in order, but your application still crashes on startup:
```
app_1 | Error: connect ECONNREFUSED 127.0.0.1:5432
app_1 | Connection refused — PostgreSQL not ready
db_1  | LOG:  database system is ready to accept connections
```

Or:

```
web_1 | redis.exceptions.ConnectionError: Error 111 connecting to redis:6379. Connection refused.
```

The app container starts before the database or Redis is ready to accept connections, even though depends_on is configured. The dependent service starts, but crashes before the dependency finishes initializing.
Why This Happens
depends_on in Docker Compose only controls container start order — it does not wait for the service inside the container to be ready. It starts containers in dependency order, but immediately moves to the next service as soon as the container process starts (not when the service is accepting connections).
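The gap is easy to see outside Docker: a single connection attempt to a port nothing is listening on yet fails immediately with the same ECONNREFUSED the logs above show. An illustrative Python sketch (host and port are placeholders):

```python
import socket

def try_connect(host: str, port: int) -> str:
    """One TCP connection attempt, like an app connecting once at startup."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return "connected"
    except OSError:
        # The container may be running, but nothing is listening on the port yet
        return "refused"
```

depends_on only guarantees that the dependency's container process has started, so a one-shot connect like this can land in the window before the service inside is actually listening.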
From the Docker Compose documentation:

> depends_on does not wait for db and redis to be "ready" before starting web — only until they have been started.
This is a common misconception. The database container may start in seconds, but PostgreSQL, MySQL, or Redis may take several more seconds to initialize, run migrations, or set up data directories before accepting connections.
Fix 1: Use healthcheck with depends_on condition (Compose v2.1+)
Docker Compose supports condition in depends_on combined with a healthcheck (Compose file format 2.1+, and the current Compose Specification). This is the correct, built-in solution:
```yaml
version: "3.8"

services:
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: mydb
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
      interval: 5s
      timeout: 5s
      retries: 10
      start_period: 10s

  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy  # Wait until db passes healthcheck
    environment:
      DATABASE_URL: postgres://user:password@db:5432/mydb
```

With condition: service_healthy, Compose waits until the db service's healthcheck reports healthy before starting app.
Healthcheck commands for common services:
```yaml
# PostgreSQL
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
  interval: 5s
  timeout: 5s
  retries: 10

# MySQL / MariaDB
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-p${MYSQL_ROOT_PASSWORD}"]
  interval: 5s
  timeout: 5s
  retries: 10

# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 5s
  timeout: 3s
  retries: 5

# MongoDB
healthcheck:
  test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
  interval: 10s
  timeout: 5s
  retries: 5

# Generic HTTP service
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 15s
```

Pro Tip: Use start_period when a service takes a long time to initialize. During the start period, failed healthchecks do not count toward the retry limit — preventing false failures during initial startup. Set it slightly longer than the typical startup time of the service.
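As a back-of-the-envelope model of the timing (a sketch of the documented semantics, not Docker's actual scheduler): probes that fail during start_period are ignored, after which `retries` consecutive failures, spaced `interval` apart, mark the container unhealthy:

```python
def worst_case_unhealthy_seconds(interval: int, retries: int, start_period: int = 0) -> int:
    """Rough upper bound on seconds before Compose marks a container unhealthy:
    failures during start_period don't count, then `retries` consecutive
    failed probes spaced `interval` seconds apart tip it over."""
    return start_period + retries * interval

# With the PostgreSQL settings from Fix 1 (interval 5s, retries 10, start_period 10s):
print(worst_case_unhealthy_seconds(interval=5, retries=10, start_period=10))  # 60
```

If a service routinely takes longer than this to come up, raise start_period rather than retries, so that genuine failures after startup are still detected quickly.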
Fix 2: Add Retry Logic to Your Application
Even with healthchecks, network conditions or race conditions can cause connection failures. Build retry logic directly into your application:
Node.js — retry with exponential backoff:

```javascript
const { Pool } = require("pg");

async function connectWithRetry(maxRetries = 10, delayMs = 2000) {
  const pool = new Pool({ connectionString: process.env.DATABASE_URL });
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const client = await pool.connect();
      console.log("Database connected successfully");
      client.release();
      return pool;
    } catch (err) {
      console.error(`Attempt ${attempt}/${maxRetries} failed:`, err.message);
      if (attempt === maxRetries) throw err;
      // Exponential backoff: double the wait after each failed attempt
      await new Promise(resolve => setTimeout(resolve, delayMs * 2 ** (attempt - 1)));
    }
  }
}

module.exports = connectWithRetry();
```

Python — retry with tenacity:
```python
from tenacity import retry, stop_after_attempt, wait_fixed
import psycopg2
import os

@retry(stop=stop_after_attempt(10), wait=wait_fixed(2))
def connect_to_db():
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    print("Database connected")
    return conn

db = connect_to_db()
```

Application-level retry is a good practice regardless of depends_on — in production, databases restart, network blips happen, and connections drop. An app that retries gracefully is more resilient than one that crashes on first failure.
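The same pattern works with no third-party library. A minimal, dependency-free sketch of retry with exponential backoff around an arbitrary connect function (flaky_connect here is a stand-in for a real driver call):

```python
import time

def retry_with_backoff(connect, max_retries=10, base_delay=0.1):
    """Call connect() until it succeeds; sleep base_delay * 2**attempt
    between failures, and re-raise the last error when retries run out."""
    for attempt in range(max_retries):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)

# Simulate a database that only accepts connections on the third attempt
attempts = {"count": 0}

def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("connection refused")
    return "connected"

result = retry_with_backoff(flaky_connect)
print(result, "after", attempts["count"], "attempts")  # connected after 3 attempts
```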
Fix 3: Use a Wait Script (Legacy Approach)
Before condition: service_healthy was available, the common pattern was a wait-for-it.sh or dockerize script that polls until a port is open:
Using wait-for-it.sh:

```dockerfile
# In your app's Dockerfile
COPY wait-for-it.sh /wait-for-it.sh
RUN chmod +x /wait-for-it.sh
```

```yaml
services:
  app:
    image: myapp:latest
    command: ["/wait-for-it.sh", "db:5432", "--", "node", "server.js"]
    depends_on:
      - db
```

Download wait-for-it.sh from https://github.com/vishnubob/wait-for-it.
Using dockerize:

```dockerfile
ENV DOCKERIZE_VERSION v0.7.0
RUN wget https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && tar -C /usr/local/bin -xzvf dockerize-linux-amd64-$DOCKERIZE_VERSION.tar.gz
```

```yaml
command: dockerize -wait tcp://db:5432 -timeout 60s node server.js
```

Note: The healthcheck + condition: service_healthy approach (Fix 1) is preferred over wait scripts — it is cleaner, does not require modifying the Dockerfile, and is officially supported.
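Under the hood, both tools do the same thing: poll until a TCP connect succeeds. A minimal Python equivalent (a sketch of the technique, not a drop-in replacement for either tool):

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # One cheap connect probe per loop — exactly what wait-for-it does
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(0.2)
    return False
```

Inside a Compose network you would call something like `wait_for_port("db", 5432)` before opening the real connection pool, with `db` being the service name that Docker's internal DNS resolves.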
Fix 4: Use the service_started and service_completed_successfully Conditions
Compose v2.1+ supports three conditions for depends_on:
```yaml
depends_on:
  db:
    condition: service_started    # Default — just waits for the container to start
  migrations:
    condition: service_completed_successfully  # Waits for a one-shot container to exit 0
  cache:
    condition: service_healthy    # Waits for the healthcheck to pass
```

service_completed_successfully is useful for migration containers that run once and exit:
```yaml
services:
  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 5s
      retries: 10

  migrate:
    image: myapp:latest
    command: ["npm", "run", "db:migrate"]
    depends_on:
      db:
        condition: service_healthy
    restart: "no"  # Don't restart after the migration completes

  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy
      migrate:
        condition: service_completed_successfully  # Wait for migrations to finish
```

This ensures: db starts → db is healthy → migrations run → app starts.
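The start order here is effectively a topological sort of the depends_on graph. The idea can be sketched with Python's standard library (illustrative only — not Compose's actual implementation):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Map each service to the services it depends on, mirroring the file above
deps = {
    "db": [],
    "migrate": ["db"],
    "app": ["db", "migrate"],
}

order = list(TopologicalSorter(deps).static_order())
print(order)  # ['db', 'migrate', 'app']
```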
Fix 5: Don't Let a restart Policy Mask the Real Issue
If your app has restart: always or restart: on-failure, Docker restarts it repeatedly when it fails to connect. Eventually the database is ready and the app connects — but the root cause (no readiness check) is hidden:
```yaml
services:
  app:
    image: myapp:latest
    restart: on-failure  # Hides the depends_on problem
    depends_on:
      - db
```

This works in practice but is fragile. A restart loop wastes resources and generates misleading error logs. Use condition: service_healthy instead and keep restart: on-failure as a safety net, not the primary solution.
Common Mistake: Setting restart: always and calling it fixed. The application crashes and restarts 5–10 times before the database is ready. Each restart generates confusing error logs. Monitoring systems may alert on the crashes. Use healthchecks for a clean startup.
Fix 6: Debug depends_on Issues
Check if healthchecks are passing:
```bash
# Watch service health status
docker compose ps

# Or watch in real time
watch docker compose ps

# Check a specific service's health
docker inspect --format='{{json .State.Health}}' container_name | jq
```

Check healthcheck logs:

```bash
docker inspect container_name | jq '.[0].State.Health.Log'
```

This shows the last few healthcheck command outputs — useful to see why a healthcheck is failing.
Force a slow startup to reproduce the issue:
```yaml
db:
  image: postgres:15
  command: ["sh", "-c", "sleep 10 && docker-entrypoint.sh postgres"]
```

Adding a sleep artificially delays the database, making the race condition obvious for debugging.
Fix 7: Multi-Service Dependency Chains
For complex dependency graphs:
```yaml
services:
  postgres:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user"]
      interval: 5s
      retries: 10

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      retries: 5

  api:
    image: myapi:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  worker:
    image: myworker:latest
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  nginx:
    image: nginx:alpine
    depends_on:
      api:
        condition: service_started  # Just needs the api container running, not healthy
    ports:
      - "80:80"
```

Note: condition: service_healthy requires the service being waited on to have a healthcheck defined. If you add condition: service_healthy but forget the healthcheck, Compose raises an error:
```
service "db" is not healthy because it has no healthcheck defined
```

Still Not Working?
Check your Compose version. The condition field in depends_on was introduced in Compose file format 2.1, dropped from the 3.x format, and brought back by the Compose Specification. With a version: "3.x" file it works only under docker-compose v1.27+ or the Compose v2 CLI (docker compose, not the legacy docker-compose):
```yaml
# This works with docker compose (v2 CLI)
version: "3.8"
services:
  app:
    depends_on:
      db:
        condition: service_healthy
```

Run docker compose version to confirm you have Compose v2.
Check that the healthcheck command exits with 0 on success. A healthcheck command that always exits with a non-zero code keeps the service in an unhealthy state permanently. Test the healthcheck command inside the running container:
```bash
docker exec container_name pg_isready -U user
echo $?  # Should print 0 for healthy
```

Check for network issues between containers. Even when services are healthy, DNS resolution between containers requires them to be on the same Docker network. Make sure all services share a network:
```yaml
services:
  db:
    networks:
      - app-network
  app:
    networks:
      - app-network

networks:
  app-network:
    driver: bridge
```

For other Docker startup errors, see Fix: Docker container exited with code 137 (OOMKilled) and Fix: Docker no space left on device.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.