Fix: Gunicorn Not Working — Worker Timeout, Boot Errors, and Signal Handling

FixDevs

Quick Answer

How to fix Gunicorn errors — WORKER TIMEOUT killed, ImportError cannot import app, worker class not found, connection refused 502 behind nginx, graceful reload not working, and sync vs async worker selection.

The Error

You start Gunicorn in production and workers keep getting killed:

[CRITICAL] WORKER TIMEOUT (pid:12345)
[ERROR] Worker (pid:12345) was sent SIGKILL! Perhaps out of memory?
[INFO] Booting worker with pid: 12346

Or Gunicorn can’t find your app module:

Failed to find attribute 'app' in 'main'.
Failed to find application object 'app' in 'main'

Or a worker class isn’t recognized:

ImportError: Entry point (gevent) not found
ImportError: No module named 'uvicorn.workers'

Or the app runs fine locally but returns 502 behind nginx:

$ curl https://myapp.com
502 Bad Gateway
# Gunicorn appears to be running

Or SIGHUP for graceful reload doesn’t restart workers:

kill -HUP $(cat /var/run/gunicorn.pid)
# Nothing happens, new code doesn't load

Gunicorn is one of the most common production WSGI servers for Flask and Django apps, and a popular process manager for FastAPI/Starlette via UvicornWorker. It’s mature and reliable, but configuration subtleties around workers, timeouts, and signal handling produce production failures that aren’t obvious from the error messages.

Why This Happens

Gunicorn runs a master process that forks N workers. Workers handle individual requests; the master handles lifecycle (restart, graceful reload, resource limits). Workers have a default timeout of 30 seconds: with the default sync worker, a request that blocks longer than that triggers WORKER TIMEOUT, and the master kills and replaces the worker.

The sync worker (default) uses one process per worker, blocking on each request. This works for fast APIs but dies on long-running endpoints. Async workers (gevent, eventlet, UvicornWorker) handle multiple requests per worker via coroutines — but require the app to be written with async in mind.
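A rough back-of-the-envelope calculation shows why blocking requests hurt sync workers. The sketch below assumes every request takes the same time and that each worker serves exactly one request at a time; the function name is illustrative and not part of Gunicorn.

```python
def max_requests_per_second(workers: int, seconds_per_request: float) -> float:
    """Upper bound on throughput when each worker serves one request at a time."""
    if workers < 1 or seconds_per_request <= 0:
        raise ValueError("workers must be >= 1 and seconds_per_request > 0")
    return workers / seconds_per_request


# 4 sync workers, 50 ms per request
fast = max_requests_per_second(4, 0.05)
# Same 4 workers, but each request waits 2 s on a slow dependency
slow = max_requests_per_second(4, 2.0)
print(fast, slow)
```

The same four workers go from tens of requests per second to a handful once each request blocks for seconds, which is why long-running endpoints starve a sync deployment.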

Diagnostic Timeline

Production Gunicorn incidents have a familiar shape. The first instinct is almost always wrong; here is what a real triage looks like.

Minute 0 — first guess: more workers. PagerDuty fires with elevated 502 rates. You assume capacity, double workers from 4 to 8, and restart. Latency stays bad and now the box swaps because each worker holds 400MB of model weights. More workers only help if the bottleneck is request concurrency, not memory or downstream IO.

Minute 5 — preload_app + database connections. You launched with --preload --workers 8 to save memory via copy-on-write. But the SQLAlchemy engine was created in the master before fork, so every worker inherits the same connection pool with the same TCP sockets. Several processes then read and write the same connections, which corrupts the protocol stream until the server or driver drops them. Symptom: psycopg2.OperationalError: SSL connection has been closed unexpectedly in the first request per worker. Fix: dispose the engine in a post_fork hook so each worker rebuilds its pool.

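def post_fork(server, worker):
    # Runs in each worker right after fork; myapp.db is this app's own module
    from myapp.db import engine
    # close=False (SQLAlchemy 1.4.33+) discards the inherited pool without
    # closing sockets the parent still owns; older versions: engine.dispose()
    engine.dispose(close=False)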

Minute 12 — sync vs async worker class. You profile a slow endpoint and find it spends 1.2s waiting on an external HTTP call. With sync workers, each call ties up a whole process. Switching the same endpoint to --worker-class gevent (with monkey.patch_all() applied early) lets one worker handle hundreds of concurrent in-flight requests. But your code uses psycopg2, a C extension whose blocking calls gevent’s monkey-patching does not make cooperative, so every query stalls the whole worker. Register a green wait callback with psycogreen, switch to psycopg[binary] v3 or pg8000, or pick gthread instead.

Minute 25 — signal SIGTERM timeout. The deploy script sends SIGTERM and waits 30 seconds before SIGKILL. Long requests (file uploads, batch jobs) get truncated mid-flight, and customers see 502s. Set --graceful-timeout 60 and have your container orchestrator respect a higher terminationGracePeriodSeconds. Match it on the nginx side with proxy_read_timeout (and proxy_send_timeout for slow uploads) so the proxy doesn’t drop the connection first.

Minute 40 — the real fix. Eight workers become four with gthread and 8 threads each, the engine rebuilds post-fork, the graceful timeout matches request P99, and the incident closes. The original “add workers” reflex would have made things worse.
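Pulled together, the end state of that incident might look like the following gunicorn.conf.py. This is a sketch under assumptions: myapp.db is a hypothetical module path for the SQLAlchemy engine, and 60 seconds is only a placeholder for whatever your request P99 calls for.

```python
# gunicorn.conf.py (sketch; values taken from the timeline above)
workers = 4
worker_class = "gthread"
threads = 8

# Assumption: 60 s covers this app's slowest legitimate request
graceful_timeout = 60


def post_fork(server, worker):
    # Hypothetical module path: adjust to wherever your SQLAlchemy engine lives
    from myapp.db import engine

    # Discard any pool inherited from the master so this worker opens its own
    # connections (close=False requires SQLAlchemy 1.4.33 or newer)
    engine.dispose(close=False)
```

The import sits inside the hook so the config file itself loads without touching the database.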

Fix 1: Finding and Fixing WORKER TIMEOUT

[CRITICAL] WORKER TIMEOUT (pid:12345)

A worker didn’t respond to the master’s heartbeat within --timeout (default 30 seconds). The master killed it and spawned a replacement — your in-flight request was lost.

Identify the cause:

  1. Long-running synchronous request (DB query, file processing, external API)
  2. Worker hung on a bug (infinite loop, deadlock)
  3. Out of memory — worker killed by OS, not Gunicorn
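To tell the first cause apart from the others, it helps to log which requests are slow before a worker gets killed. Below is a minimal WSGI middleware sketch using only the standard library; the class name and threshold are illustrative, and it measures the time until the app returns its response iterable, not the time spent streaming the body.

```python
import logging
import time

log = logging.getLogger("slow_requests")


class SlowRequestLogger:
    """WSGI middleware that logs requests slower than a threshold."""

    def __init__(self, app, threshold_seconds=1.0):
        self.app = app
        self.threshold_seconds = threshold_seconds

    def __call__(self, environ, start_response):
        started = time.monotonic()
        try:
            return self.app(environ, start_response)
        finally:
            # Runs whether the app returned normally or raised
            elapsed = time.monotonic() - started
            if elapsed >= self.threshold_seconds:
                log.warning(
                    "slow request: %s %s took %.2fs",
                    environ.get("REQUEST_METHOD"),
                    environ.get("PATH_INFO"),
                    elapsed,
                )
```

With Flask, one way to install it is app.wsgi_app = SlowRequestLogger(app.wsgi_app, threshold_seconds=5). Paths that show up repeatedly just under --timeout are the first candidates to investigate.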

Increase the timeout for legitimate slow endpoints:

gunicorn app:app --timeout 120   # 2 minutes

For endpoints that legitimately run for minutes or longer (large file uploads, ML inference), either:

  • Move to async workers (gevent, UvicornWorker)
  • Offload to a background job queue (Celery, RQ)
  • Stream responses to keep the connection alive

Differentiate timeout vs OOM kill:

# Check dmesg for OOM kills
dmesg | grep -i "out of memory"

# Or check if the kernel killed the process
sudo grep -i killed /var/log/syslog | tail

If the OS killed the process for OOM, increasing --timeout won’t help — you need more RAM or fewer workers.
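If you collect kernel logs centrally, the same check can be scripted. The following sketch extracts pids from OOM-killer messages so they can be compared with the pid in Gunicorn's WORKER TIMEOUT line. The sample log text is illustrative only; the exact wording differs between kernel versions, so treat the regular expression as an assumption to adapt.

```python
import re

# Matches kernel log fragments such as "Killed process 12345 (gunicorn)"
_KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def oom_killed_pids(kernel_log: str) -> list[tuple[int, str]]:
    """Return (pid, process name) pairs for processes the OOM killer ended."""
    return [(int(pid), name) for pid, name in _KILLED.findall(kernel_log)]


# Illustrative log text; real messages vary between kernel versions
sample = (
    "[1234.5] Out of memory: Killed process 12345 (gunicorn) total-vm:812345kB\n"
    "[1240.1] usb 1-1: new high-speed USB device\n"
)
print(oom_killed_pids(sample))
```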

Graceful timeout — give workers time to finish current requests during reload:

gunicorn app:app --timeout 60 --graceful-timeout 30
  • --timeout 60 — kill worker if no heartbeat for 60s
  • --graceful-timeout 30 — during reload, give workers 30s to finish before SIGKILL
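These two settings interact with the timeouts of whatever sits around Gunicorn. A small consistency check can catch combinations that undercut each other. The sketch below is an assumption-laden helper, not a Gunicorn feature: the parameter names are hypothetical, and all values are in seconds.

```python
def check_timeouts(gunicorn_timeout, graceful_timeout, proxy_read_timeout, stop_grace_period):
    """Return warnings about timeout settings that undercut each other."""
    warnings = []
    # The proxy should wait at least as long as Gunicorn allows a request to run
    if proxy_read_timeout < gunicorn_timeout:
        warnings.append("proxy gives up before Gunicorn's --timeout expires")
    # The process manager should outlast Gunicorn's own graceful shutdown window
    if stop_grace_period <= graceful_timeout:
        warnings.append("process manager may SIGKILL before --graceful-timeout ends")
    return warnings


print(check_timeouts(gunicorn_timeout=120, graceful_timeout=30,
                     proxy_read_timeout=60, stop_grace_period=30))
```

Here proxy_read_timeout stands for nginx's setting of the same name and stop_grace_period for systemd's TimeoutStopSec or Kubernetes' terminationGracePeriodSeconds.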

Common Mistake: Setting --timeout 300 to “fix” mysterious timeouts in production. This masks the real problem (slow query, missing index, blocking call) and delays discovery until customers complain. Investigate why a request is slow before raising the timeout as a band-aid.

Fix 2: App Module Not Found

Failed to find attribute 'app' in 'main'.
Failed to find application object 'app' in 'main'

Gunicorn’s entry point format is module:variable. It imports module and looks for variable.

# main.py
from flask import Flask
app = Flask(__name__)   # Variable name "app"

gunicorn main:app   # module=main, variable=app

Common mistakes:

# WRONG — main.py vs main
gunicorn main.py:app   # Fails: Gunicorn tries to import a module named 'main.py'

# WRONG — wrong variable name
gunicorn main:application   # If the variable is named 'app', not 'application'

# CORRECT
gunicorn main:app

# For Django, the WSGI app lives in project/wsgi.py
gunicorn myproject.wsgi:application   # Default Django WSGI name

# For factory pattern (Flask with create_app)
gunicorn "main:create_app()"   # Call the factory to get the app

pythonpath issues — if running from a different directory:

# Add to Python path
gunicorn --pythonpath /app main:app

# Or change working directory
gunicorn --chdir /app main:app
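To check a module:variable reference without starting Gunicorn, you can resolve it by hand from the same directory and virtualenv. This is a simplified illustration of the lookup, not Gunicorn's actual loader: it does not handle factory calls such as "main:create_app()", and the function name is hypothetical.

```python
import importlib


def resolve_app(reference: str):
    """Import 'module:variable' and return the named object."""
    module_name, _, attr = reference.partition(":")
    if not attr:
        attr = "application"  # Gunicorn's default when no variable is given
    module = importlib.import_module(module_name)
    try:
        return getattr(module, attr)
    except AttributeError:
        raise LookupError(f"no attribute {attr!r} in module {module_name!r}") from None
```

If importlib raises ModuleNotFoundError here, the problem is the Python path or working directory; if LookupError is raised, the module imports fine and only the variable name is wrong.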

Fix 3: Worker Class Selection

Gunicorn supports several worker classes, each with different trade-offs:

Worker class | Best for | Concurrency model
sync (default) | CPU-bound, fast endpoints | One request per worker
gthread | IO-bound sync apps | Threaded workers
gevent | IO-bound sync apps | Coroutines (green threads)
eventlet | Similar to gevent | Green threads
uvicorn.workers.UvicornWorker | ASGI apps (FastAPI, Starlette) | Async event loop
tornado | Tornado apps | Tornado IO loop

Set via CLI:

# Default sync worker
gunicorn app:app --workers 4

# Threaded worker (for sync apps doing IO)
gunicorn app:app --workers 4 --threads 4 --worker-class gthread

# gevent (must pip install gevent first)
gunicorn app:app --workers 4 --worker-class gevent --worker-connections 1000

# For FastAPI/Starlette (must pip install uvicorn)
gunicorn app:app --workers 4 --worker-class uvicorn.workers.UvicornWorker

Install the right package:

pip install gevent        # For --worker-class gevent
pip install eventlet       # For --worker-class eventlet
pip install uvicorn        # For --worker-class uvicorn.workers.UvicornWorker

Sync vs async — when to switch:

  • Sync worker if your app is Flask/Django with fast endpoints (<100ms). Add workers to scale.
  • gthread or gevent if endpoints make blocking external calls (DB, HTTP). Lets one worker handle multiple requests concurrently while waiting on IO.
  • UvicornWorker if your app is FastAPI/Starlette. Sync workers won’t work with async ASGI apps.

Pro Tip: Start with sync workers and a worker count of (2 × CPU_cores) + 1. Only switch to async workers if you see high request queue times and your app has lots of IO-bound operations (external API calls, DB queries). Don’t switch because async sounds faster: sync is simpler and often faster for CPU-bound workloads.
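The starting-point formula is easy to compute at config time. A minimal sketch follows; the function name is illustrative, and note that multiprocessing.cpu_count() reports the host's cores, which may be more than a container's CPU limit allows.

```python
import multiprocessing


def suggested_workers(cpu_cores=None):
    """Starting point for sync workers: (2 x cores) + 1."""
    if cpu_cores is None:
        cpu_cores = multiprocessing.cpu_count()
    return 2 * cpu_cores + 1


print(suggested_workers(4))
```

Treat the result as a first guess to validate under load, especially when each worker holds a lot of memory.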

For uvicorn-specific worker configuration that Gunicorn wraps, see Uvicorn not working.

Fix 4: 502 Bad Gateway Behind Nginx

curl https://myapp.com
# 502 Bad Gateway

Nginx returned 502 — it couldn’t reach Gunicorn or Gunicorn responded badly.

Check Gunicorn is running and bound correctly:

# Gunicorn listening on 127.0.0.1:8000?
lsof -i :8000

# Or use a Unix socket
gunicorn app:app --bind unix:/tmp/gunicorn.sock

Nginx upstream config:

upstream app {
    server 127.0.0.1:8000;
    # Or via Unix socket
    # server unix:/tmp/gunicorn.sock;
}

server {
    listen 80;
    location / {
        proxy_pass http://app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts — match or exceed Gunicorn's timeout
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}

Common 502 causes:

  1. Gunicorn not listening on the port nginx expects
  2. Unix socket permissions wrong — nginx can’t read the socket
  3. Worker timeout shorter than nginx read timeout — Gunicorn kills its own worker mid-request, nginx sees a dead upstream
  4. All workers busy — no worker available to accept the connection

Fix socket permissions for Unix sockets:

# Run Gunicorn with permissions nginx can read
gunicorn app:app --bind unix:/tmp/gunicorn.sock --umask 007 --user www-data --group www-data

For nginx 502 diagnostic patterns, see nginx 502 bad gateway.

Fix 5: Graceful Reload — SIGHUP and Blue-Green Deployments

# Reload config and workers without downtime
kill -HUP $(cat /var/run/gunicorn.pid)

On SIGHUP, Gunicorn:

  1. Re-reads the config file
  2. Starts new workers
  3. Gracefully shuts down old workers (forcing them to stop once --graceful-timeout expires)

Preload app for faster worker restarts:

gunicorn app:app --preload --workers 4

--preload loads the app in the master before forking workers. Workers share the preloaded memory via copy-on-write, reducing memory usage.

Caveat: with --preload, code changes require a full restart (not SIGHUP) because the preloaded code is already in the master.

Zero-downtime deploy pattern:

# 1. Update code
git pull origin main

# 2. Install any new dependencies
pip install -r requirements.txt

# 3. Graceful reload (requires Gunicorn to have been started with --pid /var/run/gunicorn.pid)
kill -HUP $(cat /var/run/gunicorn.pid)

Worker cycling to mitigate memory leaks:

# Restart each worker after 1000 requests
gunicorn app:app --max-requests 1000 --max-requests-jitter 100

--max-requests-jitter adds randomness so not all workers recycle simultaneously (which would briefly drop capacity).
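The effect of the jitter is easiest to see by simulating it. The sketch below models the idea that each worker gets its own limit somewhere between max_requests and max_requests + jitter; it is an illustration, not Gunicorn's source, and the function name is hypothetical.

```python
import random


def restart_thresholds(workers, max_requests, jitter, seed=None):
    """Per-worker request counts at which each worker would recycle."""
    rng = random.Random(seed)
    return [max_requests + rng.randint(0, jitter) for _ in range(workers)]


print(restart_thresholds(workers=4, max_requests=1000, jitter=100, seed=1))
```

With a jitter of 0, every worker would hit its limit at about the same time under evenly distributed load; spreading the limits staggers the restarts.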

Fix 6: Configuration Files

CLI args get unwieldy. Use a gunicorn.conf.py:

# gunicorn.conf.py
import multiprocessing

bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
worker_connections = 1000

timeout = 60
graceful_timeout = 30
keepalive = 5

# Restart workers periodically to mitigate memory leaks
max_requests = 1000
max_requests_jitter = 100

# Logging
accesslog = "-"   # stdout
errorlog = "-"    # stderr
loglevel = "info"

# Process naming for ps output
proc_name = "myapp"

# Post-fork hook: runs inside each worker right after it is forked
def post_fork(server, worker):
    # Set up per-worker resources here (DB pools, clients). Avoid overriding
    # SIGTERM: Gunicorn relies on it to shut workers down gracefully.
    server.log.info("Worker spawned (pid: %s)", worker.pid)

Use the config file:

gunicorn -c gunicorn.conf.py app:app

Environment-specific configs:

# gunicorn.conf.py
import os

env = os.getenv("ENV", "dev")

if env == "production":
    workers = 8
    loglevel = "warning"
    accesslog = "/var/log/gunicorn/access.log"
    errorlog = "/var/log/gunicorn/error.log"
else:
    workers = 2
    loglevel = "debug"
    reload = True   # Auto-reload for dev

Fix 7: Systemd Service for Production

# /etc/systemd/system/gunicorn.service
[Unit]
Description=Gunicorn daemon for myapp
Requires=gunicorn.socket
After=network.target

[Service]
Type=notify
User=www-data
Group=www-data
RuntimeDirectory=gunicorn
WorkingDirectory=/opt/myapp
Environment="PATH=/opt/myapp/venv/bin"
Environment="DATABASE_URL=postgresql://user:pass@localhost/db"
ExecStart=/opt/myapp/venv/bin/gunicorn -c /opt/myapp/gunicorn.conf.py app:app
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=mixed
TimeoutStopSec=60
PrivateTmp=true

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/gunicorn.socket
[Unit]
Description=Gunicorn socket

[Socket]
ListenStream=/run/gunicorn/socket
SocketUser=www-data
SocketMode=0660

[Install]
WantedBy=sockets.target

sudo systemctl enable --now gunicorn.socket
sudo systemctl status gunicorn
sudo systemctl reload gunicorn   # Triggers ExecReload → SIGHUP

Socket activation (optional but elegant): systemd owns the listening socket and hands it to Gunicorn on start. Connections that arrive while Gunicorn restarts wait in the socket's queue instead of being refused.

Fix 8: Logging and Monitoring

Access log format — customize fields:

gunicorn app:app \
    --access-log-format '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s %(D)s' \
    --access-logfile /var/log/gunicorn/access.log \
    --error-logfile /var/log/gunicorn/error.log

Variables:

Variable | Meaning
%(h)s | Remote IP
%(t)s | Timestamp
%(r)s | Request line (method, path, HTTP version)
%(s)s | Status code
%(b)s | Response size in bytes
%(D)s | Request duration in microseconds
%(f)s | Referer
%(a)s | User agent
%({X-Request-ID}i)s | Custom header
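With %(D)s at the end of the format string shown above, slow requests can be pulled straight out of the access log. The parser below is a sketch that assumes exactly that format; the sample lines in the usage are made up for illustration.

```python
import re

# Matches: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s %(D)s
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) (?P<micros>\d+)'
)


def slow_requests(lines, threshold_seconds=1.0):
    """Yield (request line, seconds) for entries slower than the threshold."""
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # Skip lines that do not follow the expected format
        seconds = int(match["micros"]) / 1_000_000
        if seconds >= threshold_seconds:
            yield match["request"], seconds


sample = [
    '127.0.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /fast HTTP/1.1" 200 512 12000',
    '127.0.0.1 - - [10/Oct/2024:13:55:40 +0000] "GET /report HTTP/1.1" 200 2048 2500000',
]
print(list(slow_requests(sample)))
```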

Structured logging (JSON for log aggregation):

# gunicorn.conf.py
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

logconfig_dict = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "json": {"()": JsonFormatter},
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "json",
        },
    },
    "loggers": {
        "gunicorn.error": {"level": "INFO", "handlers": ["console"]},
        "gunicorn.access": {"level": "INFO", "handlers": ["console"]},
    },
}

Prometheus metrics via prometheus_client:

pip install prometheus-client

# metrics.py
from prometheus_client import Counter, Histogram, start_http_server

REQUEST_COUNT = Counter("http_requests_total", "Total requests", ["method", "endpoint", "status"])
REQUEST_DURATION = Histogram("http_request_duration_seconds", "Request duration", ["endpoint"])

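def init_metrics():
    # With several workers, every process that calls this tries to bind :9090;
    # call it from one process only, or use prometheus_client's multiprocess mode
    start_http_server(9090)   # Metrics on :9090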

# In app or middleware — increment on each request

Still Not Working?

Gunicorn vs uWSGI vs Uvicorn

  • Gunicorn — Python-only, stable, great for Flask/Django. Default choice.
  • uWSGI — Massive feature set (load balancer, cron, queue), steeper learning curve.
  • Uvicorn — ASGI, faster for async apps. See Uvicorn not working.
  • Bjoern / Meinheld — Minimal, fastest, limited features. Niche.

Testing with Gunicorn

In CI, test that the app starts correctly:

timeout 10 gunicorn app:app --check-config

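--check-config loads the configuration and imports the application, then exits without serving requests, so a bad config value, an import error, or a wrong module:variable reference fails the build. For tests that actually serve requests, use a test client (Flask’s test_client(), Django’s Client).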

For pytest fixture patterns that set up test servers, see pytest fixture not found.

Flask-Specific Integration

# wsgi.py
from myapp import create_app
application = create_app()

gunicorn wsgi:application --workers 4

For Flask-specific 404 and routing errors, see Flask 404 not found.

Docker Deployment

FROM python:3.12-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Don't run as root
RUN useradd -m appuser
USER appuser

EXPOSE 8000
CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:app"]

In Kubernetes, run one worker per container and scale with replicas:

CMD ["gunicorn", "app:app", "--workers", "1", "--bind", "0.0.0.0:8000"]

Multiple replicas × 1 worker is more predictable than fewer replicas × many workers for horizontal scaling.

preload_app + Forked CUDA or PyTorch

Loading a PyTorch model onto the GPU in the master process before fork (“save memory!”) leaves every worker with a CUDA context it cannot use, because CUDA state does not survive a fork. Symptom: RuntimeError: CUDA error: initialization error on the first request. Either drop --preload and pay the per-worker model load cost, or load the model lazily in a post_fork hook so each worker initializes CUDA itself.
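One way to structure the lazy load is a cached accessor that does nothing at import time. In the sketch below, _load_model is a hypothetical placeholder for your real loading code (for example, building the model and moving it to the GPU); only the caching pattern is the point.

```python
import functools
import os


def _load_model():
    # Hypothetical placeholder: replace with your real model loading code
    return {"loaded_in_pid": os.getpid()}


@functools.lru_cache(maxsize=1)
def get_model():
    """Load the model on first use inside the current process, then reuse it."""
    return _load_model()
```

As long as nothing calls get_model() in the master, each worker performs its own load. Calling it once from post_fork moves that cost to worker boot instead of the first request.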

File Descriptor Limits

Under load you see OSError: [Errno 24] Too many open files. The default soft limit for open files on many Linux systems is 1024; a busy Gunicorn easily exceeds it with keep-alive connections and DB pools. Raise it in your systemd unit (LimitNOFILE=65536), or in /etc/security/limits.conf if you start Gunicorn from a login session (systemd services do not read that file). Confirm at runtime:

import resource
print(resource.getrlimit(resource.RLIMIT_NOFILE))

If the soft limit is still 1024, the process didn’t inherit the new limit — restart through systemd, not via kill -HUP.

Zombie Workers After Crashes

A worker that segfaults inside a C extension can leave a zombie process, which ps shows as <defunct>. The master normally reaps its own workers on SIGCHLD, but when Gunicorn sits behind a wrapper script that runs as PID 1 in a container (some Docker entrypoints) and never reaps children, zombies pile up. Use tini or set --init on docker run so PID 1 reaps children correctly.

Slow Startup Killing Health Checks

If your app takes 30s to load (ML weights, cache warming), Kubernetes liveness probes kill the container before it ever serves a request. Use a separate startupProbe with a generous failureThreshold so the slow boot is allowed once, while regular liveness stays strict.
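The startup budget is simple arithmetic: the probe tolerates roughly failureThreshold × periodSeconds of failed checks before the container is restarted. A small sketch to sanity-check your numbers (function names are illustrative):

```python
def startup_budget_seconds(failure_threshold, period_seconds, initial_delay_seconds=0):
    """Approximate longest startup a startupProbe tolerates before a restart."""
    return initial_delay_seconds + failure_threshold * period_seconds


def probe_allows(startup_seconds, failure_threshold, period_seconds):
    """True if the app should finish booting inside the probe's budget."""
    return startup_seconds < startup_budget_seconds(failure_threshold, period_seconds)


print(startup_budget_seconds(30, 10))
print(probe_allows(30, 3, 10))
```

For a 30-second boot, a failureThreshold of 3 with a 10-second period leaves no margin at all; leave generous headroom, because cold nodes and image pulls make startup slower than on your laptop.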

FixDevs

Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
