Operations
#django
#gunicorn

How to Optimize Gunicorn Workers for Django

Problem statement

If you want to optimize Gunicorn workers for Django, the main production risk is getting concurrency wrong.

Too few workers means requests queue behind busy processes, latency rises, and Nginx or Caddy may start returning 502 or 504 errors during traffic spikes. Too many workers means each Gunicorn process competes for CPU and RAM, which can increase context switching, trigger swapping, and make deployments unstable.

There is no universal best Gunicorn worker count for Django. The right settings depend on:

  • available CPU cores
  • available RAM
  • request profile
  • database latency
  • external API calls
  • worker class
  • timeout behavior
  • PostgreSQL connection limits

The goal is to choose a conservative starting point, test under realistic load, and change one variable at a time.

Quick answer

To tune Gunicorn workers for Django production safely:

  1. Start with sync workers and a small worker count based on CPU.
  2. Measure memory usage per worker before increasing concurrency.
  3. Use gthread only if the app is moderately I/O-bound and you understand the thread tradeoffs.
  4. Keep timeout, graceful_timeout, and max_requests aligned with safe reloads and worker recycling.
  5. Test with realistic endpoints, not only /healthz.
  6. Validate the Gunicorn config before reload, and keep a rollback config ready.

For many standard Django apps, a good first test is:

  • worker_class = "sync"
  • workers = 2 * CPU_CORES + 1 as a starting point, not a rule
  • timeout = 30
  • graceful_timeout = 30
  • max_requests = 1000
  • max_requests_jitter = 100
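If you prefer to compute the baseline programmatically inside gunicorn.conf.py, a minimal sketch (treat the result as a first test, not a rule):

```python
# Sketch: derive the common sync-worker baseline from the CPU count.
# This is a starting point to load-test, not a recommendation to ship as-is.
import multiprocessing

workers = 2 * multiprocessing.cpu_count() + 1
```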

Step-by-step solution

1. Understand what Gunicorn workers do

Gunicorn uses worker processes to handle requests. With the default sync class, each worker handles one request at a time.

In a typical Django deployment:

  • Nginx or Caddy accepts client traffic
  • the reverse proxy forwards dynamic requests to Gunicorn
  • a Gunicorn worker runs Django code
  • Django may query PostgreSQL, Redis, or external services

This means worker tuning affects:

  • request concurrency
  • latency under load
  • memory footprint
  • database connection usage
  • behavior during deploy reloads

If requests are mostly CPU-bound, more workers than available CPU can make performance worse. If requests spend meaningful time waiting on DB or network I/O, a threaded model may help.
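One rough way to reason about this is Little's law: the number of in-flight requests is roughly the arrival rate times the average latency, and with sync workers each in-flight request occupies one worker. A sketch (the function name is illustrative):

```python
import math

# Rough sizing via Little's law: in-flight requests ~= arrival rate x latency.
# With sync workers, each in-flight request ties up one worker process.
def needed_concurrency(req_per_sec: float, avg_latency_sec: float) -> int:
    return math.ceil(req_per_sec * avg_latency_sec)

# e.g. 50 req/s at 200 ms average latency keeps about 10 workers busy
```

If the implied concurrency is far above your worker count and CPU is idle, the time is being spent waiting on I/O, not computing.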

2. Collect baseline metrics before changing settings

Before changing Gunicorn workers for Django, capture current behavior.

Check system state:

free -m
top
ss -ltnp | grep 8000
systemctl status gunicorn
journalctl -u gunicorn -n 100 --no-pager
ps aux | grep gunicorn

What to measure:

  • CPU saturation
  • RAM usage before and after traffic
  • worker count actually running
  • p95 latency
  • 502 or 504 rate at the reverse proxy
  • worker timeout or restart messages in logs
  • PostgreSQL connection count

If CPU is low but latency is high, the bottleneck may be database or external I/O rather than Gunicorn itself. Do not raise concurrency until you know where time is being spent.

3. Choose an initial Gunicorn worker count for Django

For sync workers, start with the common baseline:

workers = 2 * CPU_CORES + 1

Use it as a first test only.

In practice, test fewer workers first when:

  • the server has limited RAM
  • each Django worker uses significant memory
  • PostgreSQL connection limits are tight
  • the app does heavy CPU work

A practical process:

  • start with 2 or 3 workers on a 1–2 vCPU VPS
  • test under load
  • measure memory per worker
  • increase gradually only if latency improves without memory pressure

Example starting points:

  • 1 vCPU / 2 GB RAM: workers = 2, worker_class = "sync"
  • 2 vCPU / 4 GB RAM: workers = 3 to 5, test both ends of the range
  • 4 vCPU / 8 GB RAM: workers = 5 to 9, depending on memory and DB behavior

If one worker uses 300 MB under real traffic, eight workers already consume roughly 2.4 GB before accounting for the OS, Nginx, PostgreSQL, Redis, and filesystem cache.
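That arithmetic can be turned into a simple memory cap; a sketch, assuming per-worker RSS is measured under real load and a reserved budget covers everything that is not Gunicorn:

```python
# Sketch: cap worker count by available memory.
# per_worker_mb should come from real measurements under load (e.g. ps RSS);
# reserved_mb covers the OS, Nginx, PostgreSQL, Redis, and cache headroom.
def max_workers_by_memory(total_ram_mb: int, per_worker_mb: int, reserved_mb: int) -> int:
    return max(1, (total_ram_mb - reserved_mb) // per_worker_mb)
```

Use the smaller of this cap and the CPU-based baseline as the upper bound for testing.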

4. Pick the right worker class

sync workers for typical Django apps

For most standard Django request/response apps, sync is the safest default. It is simple, predictable, and usually easier to debug.

gthread workers for moderate I/O-bound workloads

If the app spends noticeable time waiting on the database or external APIs, gthread can increase concurrency with fewer processes.

Example:

worker_class = "gthread"
threads = 4
workers = 2

Do not raise both workers and threads aggressively at the same time. Start small and watch memory, latency, and database pressure.
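When comparing sync and gthread settings, compare effective concurrency, since that is roughly what the database sees in the worst case. A sketch, assuming at most one database connection per handling thread (typical with Django's default per-request connections, but dependent on your setup):

```python
# Sketch: effective request concurrency for a Gunicorn configuration.
# Worst case, each handling thread may hold one database connection,
# so this number should stay well under PostgreSQL's connection limit.
def effective_concurrency(workers: int, threads: int = 1) -> int:
    return workers * threads
```

The gthread example above (workers = 2, threads = 4) has the same effective concurrency as 8 sync workers, with fewer processes but shared-state and thread-safety tradeoffs.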

gevent and eventlet caveats

These models require more care around monkey patching and compatibility. For most Django deployments, they are not the first choice for straightforward production tuning.

When ASGI and Uvicorn are a better fit

If your application depends heavily on async views, websockets, or long-lived connections, moving toward ASGI with Uvicorn may be better than trying to force high concurrency through Gunicorn tuning alone.

5. Tune the Gunicorn settings that matter most

A practical gunicorn.conf.py:

bind = "127.0.0.1:8000"

workers = 3
worker_class = "sync"
threads = 1

timeout = 30
graceful_timeout = 30
keepalive = 2

max_requests = 1000
max_requests_jitter = 100

preload_app = False
backlog = 2048

accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"

What matters:

  • workers: main process concurrency control
  • threads: only meaningful with gthread; note that setting threads above 1 switches sync workers to gthread automatically
  • timeout: kill stuck workers after this many seconds
  • graceful_timeout: time allowed for workers to finish during restart or reload
  • keepalive: low values are usually fine behind Nginx or Caddy
  • max_requests and max_requests_jitter: recycle workers periodically to reduce the impact of gradual memory growth
  • preload_app: can reduce memory via copy-on-write in some cases, but increases startup sensitivity and needs careful testing
  • bind: usually localhost or a Unix socket behind a reverse proxy
  • backlog: pending connection queue size; not the first tuning lever, but it should not be left too small

Do not expose Gunicorn directly on a public interface unless you explicitly intend to.

Also keep secrets out of Gunicorn config. gunicorn.conf.py should contain runtime settings, not Django secrets or credentials.

6. Apply changes safely in production

Use a dedicated config file and a non-root service user.

Example systemd unit excerpt:

[Unit]
Description=Gunicorn for Django
After=network.target

[Service]
User=django
Group=www-data
WorkingDirectory=/srv/myapp/current
Environment="DJANGO_SETTINGS_MODULE=config.settings.production"
EnvironmentFile=/etc/myapp/myapp.env
ExecStart=/srv/myapp/venv/bin/gunicorn config.wsgi:application --config /etc/myapp/gunicorn.conf.py
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
TimeoutStopSec=45
KillMode=mixed

[Install]
WantedBy=multi-user.target

Keep secrets such as Django settings values in the EnvironmentFile, not in the systemd unit itself, in gunicorn.conf.py, or in version-controlled deployment configs.

Validate the config before reload:

sudo /srv/myapp/venv/bin/gunicorn --check-config --config /etc/myapp/gunicorn.conf.py config.wsgi:application

Then reload and verify:

sudo systemctl daemon-reload
sudo systemctl reload gunicorn
sudo systemctl status gunicorn
sudo journalctl -u gunicorn -n 100 --no-pager
ps aux | grep gunicorn

Check the listener:

ss -ltnp | grep 8000

Then confirm the reverse proxy still passes traffic and health checks.

7. Test worker settings under realistic traffic

Use a dynamic endpoint that exercises templates, ORM, middleware, or authentication paths. Do not test only a trivial health endpoint.

Example with wrk:

wrk -t4 -c20 -d30s https://example.com/app/dashboard/

Or with ApacheBench:

ab -n 500 -c 20 https://example.com/app/dashboard/

Watch:

  • p95 latency
  • throughput
  • CPU utilization
  • RAM growth
  • worker restarts
  • Nginx or Caddy 502/504 responses
  • PostgreSQL connections

If throughput rises but p95 latency worsens and memory jumps, the new setting is not necessarily better.
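If your load tool reports only raw latencies, p95 can be computed with the nearest-rank method; a minimal sketch:

```python
import math

# Sketch: p95 latency from a list of samples (nearest-rank method).
def p95(samples):
    s = sorted(samples)
    rank = max(1, math.ceil(0.95 * len(s)))
    return s[rank - 1]
```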

8. Avoid common Gunicorn tuning mistakes

  • Oversubscribing CPU on small servers: more workers can increase waiting, not reduce it.
  • Ignoring database connection limits: higher app concurrency can increase simultaneous database usage.
  • Raising timeout instead of fixing slow queries: long timeouts hide problems and hold resources longer.
  • Using too many threads with blocking code: thread count is not free concurrency.
  • Forgetting memory per worker: worker count must fit alongside PostgreSQL, Redis, Nginx, and the OS.

9. Security and reliability notes

Keep Gunicorn behind Nginx or Caddy. Let the reverse proxy handle TLS, public exposure, buffering, and header normalization.

Also make sure to:

  • run Gunicorn as a non-root user
  • rotate or manage logs so disks do not fill
  • align reverse proxy timeouts and health checks with Gunicorn restart and timeout behavior
  • make sure static and media files are handled by the web server or object storage, not Gunicorn
  • avoid running migrations as part of Gunicorn startup
  • ensure Django production settings are correct as well, including DEBUG = False, valid ALLOWED_HOSTS, and CSRF_TRUSTED_ORIGINS where needed
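The Django-side items in that checklist amount to a small settings fragment; a sketch, with example.com as a placeholder domain:

```python
# Production settings sketch (example.com is a placeholder domain).
DEBUG = False
ALLOWED_HOSTS = ["example.com"]
CSRF_TRUSTED_ORIGINS = ["https://example.com"]
```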

If TLS terminates at Nginx or Caddy, make sure Django is proxy-aware:

SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")

Only set this when the reverse proxy always sets X-Forwarded-Proto itself and strips any client-supplied value; otherwise clients can spoof the header and Django will wrongly treat requests as HTTPS.

10. Rollback and recovery

If a new Gunicorn worker setting causes worse performance:

  1. Restore the previous config.
  2. Validate the restored config.
  3. Reload Gunicorn.
  4. Recheck latency, memory, and logs.

Example:

sudo cp /etc/myapp/gunicorn.conf.py.bak /etc/myapp/gunicorn.conf.py
sudo /srv/myapp/venv/bin/gunicorn --check-config --config /etc/myapp/gunicorn.conf.py config.wsgi:application
sudo systemctl reload gunicorn
sudo systemctl status gunicorn
sudo journalctl -u gunicorn -n 100 --no-pager
free -m

If reload does not recover cleanly, do a controlled restart during a safe window:

sudo systemctl restart gunicorn

Keep one tested last-known-good config file available for emergency restore.

Explanation

This approach works because it treats Gunicorn tuning as a capacity and safety problem, not just a single number to maximize.

sync workers are usually the correct default because they are operationally simple and map cleanly to common Django deployments. gthread can help when requests spend meaningful time waiting on I/O, but it should be introduced carefully and measured. max_requests helps with long-running process growth, while conservative timeout values prevent truly stuck workers from hanging forever.

The main reason Gunicorn performance tuning for Django goes wrong is that teams increase workers before checking memory, CPU, or database saturation. That can shift the bottleneck from request queueing to swapping, connection exhaustion, or noisy reload behavior.

When manual tuning becomes repetitive

If you manage several environments, manual edits to Gunicorn settings become error-prone. At that point, it makes sense to standardize a gunicorn.conf.py template, keep environment-specific values in variables, and script a repeatable validation sequence after each reload. Good first automation targets are config backup, safe reload, metrics capture, and post-change health checks.
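A minimal sketch of that template idea, with hypothetical profile names and a render helper (both are illustrative, not a standard interface):

```python
# Hypothetical sketch: one gunicorn.conf.py template rendered from
# per-environment profiles. Profile names and values are illustrative.
PROFILES = {
    "small":  {"workers": 2, "worker_class": "sync"},
    "medium": {"workers": 3, "worker_class": "sync"},
}

def render_conf(profile: str, timeout: int = 30) -> str:
    p = PROFILES[profile]
    return (
        f"workers = {p['workers']}\n"
        f"worker_class = \"{p['worker_class']}\"\n"
        f"timeout = {timeout}\n"
    )
```

The rendered text can be written to disk, validated with gunicorn --check-config, and only then reloaded.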

Edge cases / notes

  • If your app has long-running requests, investigate query performance, background jobs, or upstream API latency before increasing timeout.
  • If preload_app = True, test DB connections, startup hooks, and memory behavior carefully before using it in production.
  • If you use containers, the same tuning rules apply, but CPU and memory limits from the orchestrator must be included in sizing.
  • If Nginx or Caddy has shorter upstream timeouts than Gunicorn, users may still see failures even when workers are alive.
  • Each additional Gunicorn process can indirectly increase PostgreSQL pressure if request concurrency rises.
  • If your workload is mostly async or connection-heavy, reconsider the stack instead of only increasing Gunicorn worker counts.

FAQ

How many Gunicorn workers should a Django app use?

Start with 2 * CPU + 1 for sync workers, then test. On small servers, fewer workers are often better if memory is tight or the app is CPU-heavy.

Should I use sync or gthread workers for Django?

Use sync for most standard Django apps. Use gthread only when the workload is moderately I/O-bound and you have measured that threads improve latency or throughput.

Why did increasing Gunicorn workers make performance worse?

Common reasons are RAM pressure, CPU oversubscription, higher context switching, more database contention, or connection limits being reached.

Do more Gunicorn workers require more PostgreSQL connections?

Potentially yes. More app concurrency can lead to more simultaneous database activity, so PostgreSQL limits and pooling strategy should be considered during tuning.

2026 · django-deployment.com - Django Deployment knowledge base