Fix: Gunicorn Not Working — Worker Timeout, Boot Errors, and Signal Handling
Quick Answer
How to fix Gunicorn errors — WORKER TIMEOUT killed, ImportError cannot import app, worker class not found, connection refused 502 behind nginx, graceful reload not working, and sync vs async worker selection.
The Error
You start Gunicorn in production and workers keep getting killed:
[CRITICAL] WORKER TIMEOUT (pid:12345)
[ERROR] Worker (pid:12345) was sent SIGKILL! Perhaps out of memory?
[INFO] Booting worker with pid: 12346
Or Gunicorn can’t find your app module:
Failed to find attribute 'app' in 'main'.
Failed to find application object 'app' in 'main'
Or a worker class isn’t recognized:
ImportError: Entry point (gevent) not found
ImportError: No module named 'uvicorn.workers'
Or the app runs fine locally but returns 502 behind nginx:
$ curl https://myapp.com
502 Bad Gateway
# Gunicorn appears to be running
Or SIGHUP for graceful reload doesn’t restart workers:
kill -HUP $(cat /var/run/gunicorn.pid)
# Nothing happens, new code doesn't load
Gunicorn is a common default WSGI server for Flask and Django apps, and a popular process manager for FastAPI/Starlette via UvicornWorker. It’s mature and reliable, but configuration subtleties around workers, timeouts, and signal handling produce production failures that aren’t obvious from the error messages.
Why This Happens
Gunicorn runs a master process that forks N workers. Workers handle individual requests; the master handles lifecycle (restart, graceful reload, resource limits). Each worker must check in with the master at least every --timeout seconds (default 30). A sync worker cannot check in while it is busy with a request, so any request taking longer triggers WORKER TIMEOUT, and the master kills and replaces the worker.
The sync worker (default) handles one request at a time per process, blocking until it finishes. This works for fast APIs but dies on long-running endpoints. Async workers (gevent, eventlet, UvicornWorker) handle multiple requests per worker via coroutines or green threads, but they only help if the app and its libraries yield during IO instead of blocking.
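The timeout rule can be pictured as a staleness check that the master runs over its workers. The sketch below is a simplified model, not Gunicorn's own code; the function name and the dictionary of check-in times are made up for illustration.

```python
import time


def timed_out_workers(last_heartbeat, timeout=30.0, now=None):
    """Return the pids whose last check-in is older than `timeout` seconds.

    `last_heartbeat` maps worker pid -> monotonic timestamp of its last check-in.
    Hypothetical helper: a model of the master's check, not Gunicorn's code.
    """
    now = time.monotonic() if now is None else now
    return [pid for pid, seen in last_heartbeat.items() if now - seen > timeout]


# Worker 101 last checked in 40s ago, worker 102 15s ago (illustrative numbers).
stale = timed_out_workers({101: 0.0, 102: 25.0}, timeout=30.0, now=40.0)
print(stale)  # [101]
```

A worker stuck in a 40-second request looks exactly like worker 101 here: nothing is wrong with the process, it simply had no chance to check in.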
Diagnostic Timeline
Production Gunicorn incidents have a familiar shape. The first instinct is almost always wrong; here is what a real triage looks like.
Minute 0: first guess, more workers. PagerDuty fires with elevated 502 rates. You assume a capacity problem, double workers from 4 to 8, and restart. Latency stays bad and now the box swaps because each worker holds 400MB of model weights. More workers only help if the bottleneck is request concurrency, not memory or downstream IO.
Minute 5: preload_app + database connections. You launched with --preload --workers 8 to save memory via copy-on-write. But the SQLAlchemy engine was created in the master before fork, so every worker inherits the same connection pool with the same TCP sockets. Several processes then read and write the same Postgres connections, and the protocol stream gets corrupted. Symptom: psycopg2.OperationalError: SSL connection has been closed unexpectedly in the first request per worker. Fix: dispose the engine in a post_fork hook so each worker rebuilds its pool.
# gunicorn.conf.py
def post_fork(server, worker):
    # Imported inside the hook so it runs in the worker, after the fork
    from myapp.db import engine
    # Drop the inherited pool; on SQLAlchemy 1.4.33+ dispose(close=False) is the documented form for forked children
    engine.dispose()
Minute 12: sync vs async worker class. You profile a slow endpoint and find it spends 1.2s waiting on an external HTTP call. With sync workers, each call ties up a whole process. Switching the same endpoint to --worker-class gevent (with monkey.patch_all() applied early) lets one worker handle hundreds of concurrent in-flight requests. But your code uses psycopg2, a C extension that gevent's monkey-patching does not make cooperative by default, so queries still block the whole worker: switch to psycopg[binary] v3 or pg8000, or pick gthread instead.
Minute 25: SIGTERM and the graceful timeout. The deploy script sends SIGTERM and waits 30 seconds before SIGKILL. Long requests (file uploads, batch jobs) get cut off mid-flight and customers see 502s. Set --graceful-timeout 60 and give your container orchestrator a terminationGracePeriodSeconds at least that long. On the nginx side, keep proxy_read_timeout and proxy_send_timeout no lower than that, so the proxy doesn’t drop the connection first.
Minute 40: the real fix. Eight workers become four gthread workers with 8 threads each, the engine rebuilds post-fork, the graceful timeout matches request P99, and the incident closes. The original “add workers” reflex would have made things worse.
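Put together, the end state of this incident might look like the config below. It is a sketch under the numbers quoted in the timeline; the graceful timeout value assumes a request P99 somewhat under a minute, and myapp.db is the same illustrative module path used in the hook earlier.

```python
# gunicorn.conf.py (sketch; values taken from the incident described above)
workers = 4                # down from 8, so less memory pressure
worker_class = "gthread"   # threads absorb the 1.2s waits on the external call
threads = 8                # 4 workers x 8 threads = 32 concurrent requests
graceful_timeout = 60      # assumption: covers request P99 with some margin
preload_app = True         # keep the copy-on-write savings from --preload


def post_fork(server, worker):
    # Runs in each worker after fork: drop the pool inherited from the
    # master so this worker opens its own database connections.
    from myapp.db import engine  # illustrative module path

    engine.dispose()
```

Total request concurrency stays high (32 in-flight requests) while the number of processes holding model weights is halved.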
Fix 1: Finding and Fixing WORKER TIMEOUT
[CRITICAL] WORKER TIMEOUT (pid:12345)
A worker didn’t respond to the master’s heartbeat within --timeout (default 30 seconds). The master killed it and spawned a replacement, and your in-flight request was lost.
Identify the cause:
- Long-running synchronous request (DB query, file processing, external API)
- Worker hung on a bug (infinite loop, deadlock)
- Out of memory — worker killed by OS, not Gunicorn
Increase the timeout for legitimate slow endpoints:
gunicorn app:app --timeout 120 # 2 minutes
For endpoints that truly need much longer than that (large file uploads, ML inference), either:
- Move to async workers (gevent, UvicornWorker)
- Offload to a background job queue (Celery, RQ)
- Stream responses to keep the connection alive
Differentiate timeout vs OOM kill:
# Check dmesg for OOM kills
dmesg | grep -i "out of memory"
# Or check if the kernel killed the process
sudo grep -i killed /var/log/syslog | tail
If the OS killed the process for OOM, increasing --timeout won’t help; you need more RAM or fewer workers.
Graceful timeout — give workers time to finish current requests during reload:
gunicorn app:app --timeout 60 --graceful-timeout 30
- --timeout 60: kill the worker if there is no heartbeat for 60s
- --graceful-timeout 30: during reload, give workers 30s to finish before SIGKILL
Common Mistake: Setting --timeout 300 to “fix” mysterious timeouts in production. This masks the real problem (slow query, missing index, blocking call) and delays discovery until customers complain. Investigate why a request is slow before raising the timeout as a band-aid.
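Before touching the timeout, it can help to tally which kind of worker death the error log actually shows. The script below is a rough heuristic, not a diagnosis: it assumes the two log message shapes quoted at the top of this article, and treats a SIGKILL with no preceding WORKER TIMEOUT for the same pid as an OOM suspect that still needs confirming in dmesg. The function name and the second pid are made up for illustration.

```python
import re

TIMEOUT_RE = re.compile(r"WORKER TIMEOUT \(pid:(\d+)\)")
SIGKILL_RE = re.compile(r"Worker \(pid:(\d+)\) was sent SIGKILL")


def classify_worker_deaths(lines):
    """Split worker deaths into timeouts and possible OOM kills."""
    timed_out, sigkilled = set(), set()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timed_out.add(int(match.group(1)))
            continue
        match = SIGKILL_RE.search(line)
        if match:
            sigkilled.add(int(match.group(1)))
    # SIGKILL without a timeout for that pid: something else killed the worker.
    return {"timed_out": timed_out, "oom_suspects": sigkilled - timed_out}


sample_log = [
    "[CRITICAL] WORKER TIMEOUT (pid:12345)",
    "[ERROR] Worker (pid:12345) was sent SIGKILL! Perhaps out of memory?",
    "[ERROR] Worker (pid:222) was sent SIGKILL! Perhaps out of memory?",
]
print(classify_worker_deaths(sample_log))
```

In this sample, pid 12345 is a plain timeout, while pid 222 is worth checking against the kernel log.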
Fix 2: App Module Not Found
Failed to find attribute 'app' in 'main'.
Failed to find application object 'app' in 'main'
Gunicorn’s entry point format is module:variable. It imports module and looks for variable.
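The lookup can be approximated with importlib. This is a simplified sketch rather than Gunicorn's implementation: resolve_target is a made-up name, factory calls such as "main:create_app()" are not handled, and the fallback to a variable named application when none is given reflects my understanding of Gunicorn's default.

```python
import importlib


def resolve_target(target, default_attr="application"):
    """Resolve a 'module:variable' string to the object it names."""
    module_name, _, attr = target.partition(":")
    attr = attr or default_attr
    # Raises ImportError if the module part is wrong (e.g. 'main.py').
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr)
    except AttributeError:
        raise LookupError(
            f"Failed to find attribute {attr!r} in {module_name!r}."
        ) from None


# Demonstrated on a stdlib module so the sketch runs anywhere.
dumps = resolve_target("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```

The two failure modes map onto the two halves of the string: a bad module part fails at import time, a bad variable part fails at attribute lookup.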
# main.py
from flask import Flask
app = Flask(__name__) # Variable name "app"
gunicorn main:app # module=main, variable=app
Common mistakes:
# WRONG — main.py vs main
gunicorn main.py:app # Error: Failed to find app in 'main.py'
# WRONG — wrong variable name
gunicorn main:application # If the variable is named 'app', not 'application'
# CORRECT
gunicorn main:app
# For Django, the WSGI app lives in project/wsgi.py
gunicorn myproject.wsgi:application # Default Django WSGI name
# For factory pattern (Flask with create_app)
gunicorn "main:create_app()" # Call the factory to get the app
Python path issues, if running from a different directory:
# Add to Python path
gunicorn --pythonpath /app main:app
# Or change working directory
gunicorn --chdir /app main:app
Fix 3: Worker Class Selection
Gunicorn supports several worker classes, each with different trade-offs:
| Worker class | Best for | Concurrency model |
|---|---|---|
| sync (default) | CPU-bound, fast endpoints | One request per worker |
| gthread | IO-bound sync apps | Threaded workers |
| gevent | IO-bound sync apps | Coroutines (green threads) |
| eventlet | Similar to gevent | Green threads |
| uvicorn.workers.UvicornWorker | ASGI apps (FastAPI, Starlette) | Async event loop |
| tornado | Tornado apps | Tornado IO loop |
Set via CLI:
# Default sync worker
gunicorn app:app --workers 4
# Threaded worker (for sync apps doing IO)
gunicorn app:app --workers 4 --threads 4 --worker-class gthread
# gevent (must pip install gevent first)
gunicorn app:app --workers 4 --worker-class gevent --worker-connections 1000
# For FastAPI/Starlette (must pip install uvicorn)
gunicorn app:app --workers 4 --worker-class uvicorn.workers.UvicornWorker
Install the right package:
pip install gevent # For --worker-class gevent
pip install eventlet # For --worker-class eventlet
pip install uvicorn # For --worker-class uvicorn.workers.UvicornWorker
When to switch between sync and async:
- Sync worker if your app is Flask/Django with fast endpoints (<100ms). Add workers to scale.
- gthread or gevent if endpoints make blocking external calls (DB, HTTP). Lets one worker handle multiple requests concurrently while waiting on IO.
- UvicornWorker if your app is FastAPI/Starlette. Sync workers won’t work with async ASGI apps.
Pro Tip: Start with sync workers and (2 × CPU_cores) + 1. Only switch to async workers if you see high request queue times and your app has lots of IO-bound operations (external API calls, DB queries). Don’t switch because async sounds faster — sync is simpler and often faster for CPU-bound workloads.
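The sizing rule and the three bullets above can be written down as a small starting-point helper. Both function names are hypothetical, and the output is only a first guess to be checked against real queue times and memory use.

```python
import os


def suggested_workers(cpu_cores=None):
    """Apply the (2 x CPU cores) + 1 starting point."""
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


def suggested_worker_class(is_asgi, io_bound):
    """Mirror the selection bullets: ASGI first, then IO-bound, else sync."""
    if is_asgi:
        return "uvicorn.workers.UvicornWorker"
    return "gthread" if io_bound else "sync"


print(suggested_workers(4))  # 9
print(suggested_worker_class(is_asgi=False, io_bound=True))  # gthread
```

On a memory-heavy app the worker count from this formula can be too high; treat it as an upper starting point, not a target.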
For uvicorn-specific worker configuration that Gunicorn wraps, see Uvicorn not working.
Fix 4: 502 Bad Gateway Behind Nginx
curl https://myapp.com
# 502 Bad Gateway
Nginx returned 502: it couldn’t reach Gunicorn or Gunicorn responded badly.
Check Gunicorn is running and bound correctly:
# Gunicorn listening on 127.0.0.1:8000?
lsof -i :8000
# Or use a Unix socket
gunicorn app:app --bind unix:/tmp/gunicorn.sock
Nginx upstream config:
upstream app {
    server 127.0.0.1:8000;
    # Or via Unix socket
    # server unix:/tmp/gunicorn.sock;
}
server {
    listen 80;
    location / {
        proxy_pass http://app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Timeouts: match or exceed Gunicorn's timeout
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}
Common 502 causes:
- Gunicorn not listening on the port nginx expects
- Unix socket permissions wrong: nginx needs read and write access to the socket to connect
- Worker timeout shorter than nginx read timeout — Gunicorn kills its own worker mid-request, nginx sees a dead upstream
- All workers busy — no worker available to accept the connection
Fix socket permissions for Unix sockets:
# Run Gunicorn with permissions nginx can read
gunicorn app:app --bind unix:/tmp/gunicorn.sock --umask 007 --user www-data --group www-data
For nginx 502 diagnostic patterns, see nginx 502 bad gateway.
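To check the first two causes from the machine nginx runs on, a short probe can attempt the same connection nginx would make. This is a minimal sketch: can_connect is a made-up helper, and the address and socket path are the example values used in this section. Run it as the nginx user to exercise the socket permissions as well.

```python
import socket


def can_connect(address, timeout=2.0):
    """Try to open a connection to a TCP (host, port) pair or a Unix socket path."""
    family = socket.AF_UNIX if isinstance(address, str) else socket.AF_INET
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(address)
        except OSError:
            # Refused, timed out, missing socket file, or permission denied.
            return False
    return True


if __name__ == "__main__":
    print("tcp :", can_connect(("127.0.0.1", 8000)))
    print("unix:", can_connect("/tmp/gunicorn.sock"))
```

A False for the address nginx is configured with points at binding or permissions; a True with 502s still occurring points at timeouts or busy workers instead.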
Fix 5: Graceful Reload — SIGHUP and Blue-Green Deployments
# Reload config and workers without downtime
kill -HUP $(cat /var/run/gunicorn.pid)
On SIGHUP, Gunicorn:
- Re-reads the config file
- Starts new workers
- Gracefully shuts down old workers (force-killing any still running after --graceful-timeout)
Preload app for faster worker restarts:
gunicorn app:app --preload --workers 4
--preload loads the app in the master before forking workers. Workers share the preloaded memory via copy-on-write, reducing memory usage.
Caveat: with --preload, code changes require a full restart (not SIGHUP) because the preloaded code is already in the master.
Zero-downtime deploy pattern:
# 1. Update code
git pull origin main
# 2. Install any new dependencies
pip install -r requirements.txt
# 3. Graceful reload
kill -HUP $(cat /run/gunicorn.pid)
Worker cycling to mitigate memory leaks:
# Restart each worker after 1000 requests
gunicorn app:app --max-requests 1000 --max-requests-jitter 100
--max-requests-jitter adds randomness so not all workers recycle simultaneously (which would briefly drop capacity).
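The effect of the jitter is easiest to see by computing per-worker thresholds. The sketch below assumes each worker adds a random amount between 0 and the jitter to its own limit, which is my understanding of the behavior; the function name is made up.

```python
import random


def per_worker_limits(workers, max_requests=1000, jitter=100, seed=None):
    """Model the request count at which each worker would restart."""
    rng = random.Random(seed)
    return [max_requests + rng.randint(0, jitter) for _ in range(workers)]


limits = per_worker_limits(4, seed=1)
print(limits)  # four values between 1000 and 1100, usually all different
```

With jitter set to 0, every entry would be exactly 1000 and evenly loaded workers would all restart at about the same moment.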
Fix 6: Configuration Files
CLI args get unwieldy. Use a gunicorn.conf.py:
# gunicorn.conf.py
import multiprocessing
bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000
timeout = 60
graceful_timeout = 30
keepalive = 5
# Restart workers periodically to mitigate memory leaks
max_requests = 1000
max_requests_jitter = 100
# Logging
accesslog = "-" # stdout
errorlog = "-" # stderr
loglevel = "info"
# Process naming for ps output
proc_name = "myapp"
# Post-fork hook: runs in each worker right after it is forked
def post_fork(server, worker):
    # Each worker initializes its own resources here (DB pools, HTTP clients)
    # Gunicorn manages worker SIGTERM handling itself, so no handler is installed here
    server.log.info("Worker spawned (pid: %s)", worker.pid)
Use the config file:
gunicorn -c gunicorn.conf.py app:app
Environment-specific configs:
# gunicorn.conf.py
import os
env = os.getenv("ENV", "dev")
if env == "production":
    workers = 8
    loglevel = "warning"
    accesslog = "/var/log/gunicorn/access.log"
    errorlog = "/var/log/gunicorn/error.log"
else:
    workers = 2
    loglevel = "debug"
    reload = True # Auto-reload for dev
Fix 7: Systemd Service for Production
# /etc/systemd/system/gunicorn.service
[Unit]
Description=Gunicorn daemon for myapp
Requires=gunicorn.socket
After=network.target
[Service]
Type=notify
User=www-data
Group=www-data
RuntimeDirectory=gunicorn
WorkingDirectory=/opt/myapp
Environment="PATH=/opt/myapp/venv/bin"
Environment="DATABASE_URL=postgresql://user:pass@localhost/db"
ExecStart=/opt/myapp/venv/bin/gunicorn -c /opt/myapp/gunicorn.conf.py app:app
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=60
PrivateTmp=true
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/gunicorn.socket
[Unit]
Description=Gunicorn socket
[Socket]
ListenStream=/run/gunicorn/socket
SocketUser=www-data
SocketMode=0660
[Install]
WantedBy=sockets.target
sudo systemctl enable --now gunicorn.socket
sudo systemctl status gunicorn
sudo systemctl reload gunicorn # Triggers ExecReload → SIGHUP
Socket activation (optional but elegant): systemd holds the listening socket open and hands it to Gunicorn. While Gunicorn restarts, new connections queue on the socket instead of being refused.
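How the socket reaches the service is a small environment-variable protocol: systemd sets LISTEN_PID and LISTEN_FDS, and the inherited descriptors start at number 3. The sketch below models only that lookup; it is not Gunicorn's code, and inherited_listen_fds is a made-up name.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor in systemd's protocol


def inherited_listen_fds(environ=None, pid=None):
    """Return the file descriptor numbers systemd passed to this process."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The variables only apply to the process they were addressed to.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


# Simulated environment for a service with one activated socket.
print(inherited_listen_fds({"LISTEN_PID": "4242", "LISTEN_FDS": "1"}, pid=4242))  # [3]
```

Because the socket belongs to systemd rather than to the Gunicorn process, it stays open across service restarts.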
Fix 8: Logging and Monitoring
Access log format — customize fields:
gunicorn app:app \
--access-log-format '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s %(D)s' \
--access-logfile /var/log/gunicorn/access.log \
--error-logfile /var/log/gunicorn/error.log
Variables:
| Variable | Meaning |
|---|---|
| %(h)s | Remote IP |
| %(t)s | Timestamp |
| %(r)s | Request line (method, path, HTTP version) |
| %(s)s | Status code |
| %(b)s | Response size in bytes |
| %(D)s | Request duration in microseconds |
| %(f)s | Referer |
| %(a)s | User agent |
| %({X-Request-ID}i)s | Custom header |
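With %(D)s at the end of each line, slow requests can be pulled out of the access log with a short script. The sketch below assumes the exact format string shown above; the sample lines, the one-second threshold, and the function name are made up for illustration.

```python
import re

# Matches: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s %(D)s
LINE_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) (?P<micros>\d+)$'
)


def slow_requests(lines, threshold_seconds=1.0):
    """Return (request line, duration in seconds) for requests over the threshold."""
    slow = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines written in a different format
        seconds = int(match.group("micros")) / 1_000_000
        if seconds > threshold_seconds:
            slow.append((match.group("request"), seconds))
    return slow


sample = [
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /reports HTTP/1.1" 200 5120 1250000',
    '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "GET /health HTTP/1.1" 200 2 850',
]
print(slow_requests(sample))  # [('GET /reports HTTP/1.1', 1.25)]
```

Endpoints that show up here repeatedly are the ones to profile before raising --timeout.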
Structured logging (JSON for log aggregation):
# gunicorn.conf.py
import json
import logging
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })
logconfig_dict = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "json": {"()": JsonFormatter},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "json",
        },
    },
    "loggers": {
        "gunicorn.error": {"level": "INFO", "handlers": ["console"]},
        "gunicorn.access": {"level": "INFO", "handlers": ["console"]},
    },
}
Prometheus metrics via prometheus_client:
pip install prometheus-client
# metrics.py
from prometheus_client import Counter, Histogram, start_http_server
REQUEST_COUNT = Counter("http_requests_total", "Total requests", ["method", "endpoint", "status"])
REQUEST_DURATION = Histogram("http_request_duration_seconds", "Request duration", ["endpoint"])
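def init_metrics():
    # With several workers only one process can bind :9090;
    # prometheus_client's multiprocess mode covers that case
    start_http_server(9090) # Metrics on :9090
# In app or middleware: increment on each request
Still Not Working?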
Gunicorn vs uWSGI vs Uvicorn
- Gunicorn — Python-only, stable, great for Flask/Django. Default choice. See this article.
- uWSGI — Massive feature set (load balancer, cron, queue), steeper learning curve.
- Uvicorn — ASGI, faster for async apps. See Uvicorn not working.
- Bjoern / Meinheld — Minimal, fastest, limited features. Niche.
Testing with Gunicorn
In CI, test that the app starts correctly:
timeout 10 gunicorn app:app --check-config
--check-config loads the configuration and the app, then exits without serving requests; a non-zero exit status means something failed to load. For tests that actually serve requests, use a test client (Flask’s test_client(), Django’s Client).
For pytest fixture patterns that set up test servers, see pytest fixture not found.
Flask-Specific Integration
# wsgi.py
from myapp import create_app
application = create_app()
gunicorn wsgi:application --workers 4
For Flask-specific 404 and routing errors, see Flask 404 not found.
Docker Deployment
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Don't run as root
RUN useradd -m appuser
USER appuser
EXPOSE 8000
CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:app"]
In Kubernetes, run one worker per container and scale with replicas:
CMD ["gunicorn", "app:app", "--workers", "1", "--bind", "0.0.0.0:8000"]
Multiple replicas × 1 worker is more predictable than fewer replicas × many workers for horizontal scaling.
preload_app + Forked CUDA or PyTorch
Loading a PyTorch model in the master process before fork (“save memory!”) corrupts CUDA contexts in every worker. Symptom: RuntimeError: CUDA error: initialization error on the first request. Either drop --preload and pay the per-worker model load cost, or load the model lazily in a post_fork hook so each worker initializes CUDA itself.
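One way to arrange the lazy load is a per-process cache that the post_fork hook warms up. The sketch below uses a placeholder load_model that returns a dictionary instead of a real PyTorch model; all names are illustrative, and the pid check is there so a copy inherited from another process is never reused.

```python
import os

_model = None
_model_pid = None


def load_model():
    # Placeholder for the real loader (for example, reading weights and moving
    # them to the GPU). Returns a stand-in object so the sketch runs anywhere.
    return {"loaded_in_pid": os.getpid()}


def get_model():
    """Load the model once per process, never reusing a parent's copy."""
    global _model, _model_pid
    if _model is None or _model_pid != os.getpid():
        _model = load_model()
        _model_pid = os.getpid()
    return _model


def post_fork(server, worker):
    # gunicorn.conf.py hook: each worker initializes its own model (and CUDA
    # context) after the fork instead of inheriting one from the master.
    get_model()


print(get_model() is get_model())  # True
```

Request handlers would call get_model() rather than touching a module-level model object, so the first request does not pay the load cost when the hook has already run.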
File Descriptor Limits
Under load you see OSError: [Errno 24] Too many open files. The default ulimit on Linux is 1024; busy gunicorn easily exceeds it with keep-alive connections and DB pools. Raise it in your systemd unit (LimitNOFILE=65536) or in /etc/security/limits.conf. Confirm at runtime:
import resource
print(resource.getrlimit(resource.RLIMIT_NOFILE))
If the soft limit is still 1024, the process didn’t inherit the new limit; restart through systemd, not via kill -HUP.
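When only the soft limit is low, a process may also raise it for itself, up to the hard limit, at startup. This is a sketch of that fallback, not a replacement for setting LimitNOFILE; the function name and the 65536 target are illustrative, and it cannot exceed whatever hard limit the process was given.

```python
import resource


def raise_nofile_soft_limit(target=65536):
    """Raise the soft open-files limit toward `target`; return the limit in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot pass the hard limit
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, nothing to change
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


print(raise_nofile_soft_limit())
print(resource.getrlimit(resource.RLIMIT_NOFILE))
```

If the printed soft limit is still too low, the hard limit is the constraint and has to be raised outside the process.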
Zombie Workers After Crashes
A worker that segfaults inside a C extension can leave a zombie process, which ps shows as <defunct>. The Gunicorn master normally reaps its own workers on SIGCHLD, but if PID 1 in the container is a wrapper that never reaps children (some Docker entrypoint scripts), zombies pile up. Use tini or pass --init to docker run so PID 1 reaps children correctly.
Slow Startup Killing Health Checks
If your app takes 30s to load (ML weights, cache warming), Kubernetes liveness probes kill the container before it ever serves a request. Use a separate startupProbe with a generous failureThreshold so the slow boot is allowed once, while regular liveness stays strict.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.