By default, the Dograh API container runs a single uvicorn worker. For production traffic — especially with many concurrent voice calls (long-lived WebSockets) — you'll want multiple workers. Dograh ships with built-in support for this: nginx load-balances across N independent uvicorn processes using a least_conn strategy.
This page covers how the multi-worker setup works, how to choose a worker count at install time, and how to change it on a running stack.
How it works
The API container starts FASTAPI_WORKERS separate uvicorn processes, each bound to its own port (8000, 8001, 8002, …). nginx exposes a single upstream dograh_api that includes all worker ports and routes new requests to whichever worker currently has the fewest active connections.
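Concretely, the generated upstream has roughly this shape. The block below is an illustrative sketch for FASTAPI_WORKERS=3, not the verbatim generated file; the proxy settings shown are the standard nginx incantation for WebSocket upgrades and are assumed here, not quoted from Dograh's config:

```nginx
# Sketch for FASTAPI_WORKERS=3. least_conn sends each new connection to the
# worker with the fewest active connections.
upstream dograh_api {
    least_conn;
    server api:8000;
    server api:8001;
    server api:8002;
}

server {
    listen 80;
    location / {
        proxy_pass http://dograh_api;
        # Needed for long-lived WebSocket connections through the proxy.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```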
This is intentionally not uvicorn --workers N (the built-in pre-fork mode). With pre-fork, the Linux kernel distributes new TCP connections across workers via accept() — fine for short HTTP requests, but long-lived WebSockets stick to whichever worker first accepted them. A handful of unlucky workers end up handling most of the streaming traffic while the others idle. Routing at the nginx layer with least_conn knows the actual per-worker connection count and distributes WebSockets evenly.

The ari_manager and campaign_orchestrator processes inside the API container stay as singletons regardless of FASTAPI_WORKERS — they coordinate global state (Asterisk channels, campaign scheduling) and should not be duplicated. ARQ background workers are controlled separately via ARQ_WORKERS.
Choosing a worker count
A safe starting point is one worker per available vCPU, capped at 8 unless you've profiled your workload. The Remote Server Deployment prerequisites ask for a minimum of 4 vCPUs, so:

| vCPUs | Suggested FASTAPI_WORKERS |
|---|---|
| 4 | 4 |
| 8 | 6–8 |
| 16+ | profile first |
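The rule of thumb above can be sketched as a small shell helper (hypothetical, not part of Dograh's tooling):

```shell
# Hypothetical helper: one worker per vCPU, capped at 8.
# Beyond 8, profile before adding more.
suggest_workers() {
  cpus=$1
  if [ "$cpus" -lt 8 ]; then
    echo "$cpus"
  else
    echo 8
  fi
}

# e.g. for the current machine:
suggest_workers "$(nproc)"
```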
Setting the worker count at install time
setup_remote.sh prompts for the worker count alongside the other configuration. Accept the default (4) or enter a different positive integer. Non-interactive callers (cloud-init, CI, Terraform) can set the value via environment variable instead.
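For example, assuming setup_remote.sh honors a pre-set FASTAPI_WORKERS in its environment (the variable name comes from this page; the exact mechanism the script uses may differ):

```
# Non-interactive install from cloud-init/CI; FASTAPI_WORKERS pre-set so the
# script skips the interactive prompt (assumed behavior).
FASTAPI_WORKERS=8 ./setup_remote.sh
```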
The installer writes the chosen value into two places:
- .env — sets FASTAPI_WORKERS=N, which docker-compose.yaml substitutes into the API container's environment.
- nginx.conf — generates an upstream dograh_api block with one server api:800X entry per worker.
Changing the worker count on a running stack
Once Dograh is running, increasing or decreasing the worker count is a two-file edit plus a restart. You'll touch:
- .env — controls how many uvicorn processes the API container spawns.
- nginx.conf — controls which worker ports nginx forwards to.
Steps
All commands run from your dograh/ directory (the one with docker-compose.yaml).
1. Edit .env and change the FASTAPI_WORKERS line:
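For example, to move from the default of 4 to six workers (6 is an illustrative value):

```
# .env
FASTAPI_WORKERS=6
```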
2. Edit nginx.conf and update the upstream dograh_api block so it has exactly one server api:800X line per worker, with ports starting at 8000. Add or remove server lines so the list matches the new FASTAPI_WORKERS value.
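For six workers, the block would look like this (illustrative sketch; keep the rest of your nginx.conf as generated):

```nginx
upstream dograh_api {
    least_conn;
    server api:8000;
    server api:8001;
    server api:8002;
    server api:8003;
    server api:8004;
    server api:8005;
}
```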
3. Recreate the affected containers. The simplest path — brief downtime, no surprises:
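One way to do that, assuming the stock docker-compose.yaml:

```
# Stop the whole stack, then bring it back up with the new configuration.
docker compose down
docker compose up -d
```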
Or limit the restart to the api and nginx containers:
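A sketch, assuming the compose services are named api and nginx:

```
# Recreate only the two affected services, leaving Postgres/Redis/MinIO running.
docker compose up -d --force-recreate api nginx
```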
--force-recreate ensures the api container picks up the new FASTAPI_WORKERS value and nginx re-reads the updated nginx.conf (which is mounted read-only from disk).
4. Verify. Confirm the right number of uvicorn processes are running. The API image is slim and doesn't include ps, so use Docker's host-side view instead:
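For example, using Docker Compose's host-side process listing:

```
# Lists processes inside the api container from the host; expect one uvicorn
# worker per FASTAPI_WORKERS, plus the singleton manager processes.
docker compose top api
```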
Each worker also logs a Uvicorn running on http://0.0.0.0:800X line on boot:
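For example:

```
# One boot line per worker, each announcing its own port.
docker compose logs api | grep "Uvicorn running on"
```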
Why not just re-run setup_remote.sh?
setup_remote.sh refuses to overwrite an existing install by design: re-running it would regenerate OSS_JWT_SECRET (logging everyone out), reset the TURN shared secret (breaking WebRTC auth on connected clients), and regenerate SSL certificates. The two-file edit above is the supported way to change worker count after install.
If you genuinely want a clean reinstall, see the DOGRAH_FORCE_OVERWRITE=1 escape hatch documented in the script.
What this does not scale
Multi-worker mode scales the HTTP/WebSocket API surface. It does not scale:
- ARQ background workers — controlled by ARQ_WORKERS (defaults to 1). Increase this in the API container's environment if your background job queue backs up.
- ari_manager / campaign_orchestrator — singletons by design; they don't benefit from extra processes.
- Postgres, Redis, MinIO — each runs as a single container in the stack. For production-scale Postgres you'd run a managed service and point DATABASE_URL at it; the same applies to Redis and S3-compatible storage.
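A sketch of the relevant environment overrides. The values and hostnames below are hypothetical, and the Redis variable name is an assumption (only DATABASE_URL and ARQ_WORKERS appear on this page):

```
# In .env (or the API service environment) — illustrative values only:
ARQ_WORKERS=4
DATABASE_URL=postgres://dograh:secret@db.example.com:5432/dograh
REDIS_URL=redis://redis.example.com:6379/0
```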
Scaling beyond a single host follows the same pattern: an external load balancer spreading connections across API hosts with a least_conn upstream, just one layer higher.