By default, the Dograh API container runs a single uvicorn worker. For production traffic — especially with many concurrent voice calls (long-lived WebSockets) — you'll want multiple workers. Dograh ships with built-in support for this: nginx load-balances across N independent uvicorn processes using a least_conn strategy.
This page covers how the multi-worker setup works, how to choose a worker count at install time, and how to change it on a running stack.
How it works
The API container starts FASTAPI_WORKERS separate uvicorn processes, each bound to its own port (8000, 8001, 8002, …). nginx exposes a single upstream dograh_api that includes all worker ports and routes new requests to whichever worker currently has the fewest active connections.
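Concretely, the generated upstream has roughly this shape. The block below is an illustrative sketch for FASTAPI_WORKERS=3, not the verbatim generated file; the proxy settings shown are the standard nginx incantation for WebSocket upgrades and are assumed here, not quoted from Dograh's config:

```nginx
# Sketch for FASTAPI_WORKERS=3. least_conn sends each new connection to the
# worker with the fewest active connections.
upstream dograh_api {
    least_conn;
    server api:8000;
    server api:8001;
    server api:8002;
}

server {
    listen 80;
    location / {
        proxy_pass http://dograh_api;
        # Needed for long-lived WebSocket connections through the proxy.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```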
This is intentionally not uvicorn --workers N (the built-in pre-fork mode). With pre-fork, the Linux kernel distributes new TCP connections across workers via accept() — fine for short HTTP requests, but long-lived WebSockets stick to whichever worker first accepted them. A handful of unlucky workers end up handling most of the streaming traffic while the others idle. Routing at the nginx layer with least_conn knows the actual per-worker connection count and distributes WebSockets evenly.

The ari_manager and campaign_orchestrator processes inside the API container stay as singletons regardless of FASTAPI_WORKERS — they coordinate global state (Asterisk channels, campaign scheduling) and should not be duplicated. ARQ background workers are controlled separately via ARQ_WORKERS.
Choosing a worker count
A safe starting point is one worker per available vCPU, capped at 8 unless you've profiled your workload. The Remote Server Deployment prerequisites ask for a minimum of 4 vCPUs, so:

| vCPUs | Suggested FASTAPI_WORKERS |
|---|---|
| 4 | 4 |
| 8 | 6–8 |
| 16+ | profile first |
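The rule of thumb above can be sketched as a small shell helper (hypothetical, not part of Dograh's tooling):

```shell
# Hypothetical helper: one worker per vCPU, capped at 8.
# Beyond 8, profile before adding more.
suggest_workers() {
  cpus=$1
  if [ "$cpus" -lt 8 ]; then
    echo "$cpus"
  else
    echo 8
  fi
}

# e.g. for the current machine:
suggest_workers "$(nproc)"
```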
Setting the worker count at install time
setup_remote.sh prompts for the worker count alongside the other configuration. Accept the default (4) or enter a different positive integer. Non-interactive callers (cloud-init, CI, Terraform) can set the value via environment variable instead.
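For example, assuming setup_remote.sh honors a pre-set FASTAPI_WORKERS in its environment (the variable name comes from this page; the exact mechanism the script uses may differ):

```
# Non-interactive install from cloud-init/CI; FASTAPI_WORKERS pre-set so the
# script skips the interactive prompt (assumed behavior).
FASTAPI_WORKERS=8 ./setup_remote.sh
```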
The installer writes the chosen value into two places:
- .env — sets FASTAPI_WORKERS=N, which docker-compose.yaml substitutes into the API container's environment.
- nginx.conf — generates an upstream dograh_api block with one server api:800X entry per worker.
Changing the worker count on a running stack
Once Dograh is running, increasing or decreasing the worker count is a two-file edit plus a restart. You'll touch:
- .env — controls how many uvicorn processes the API container spawns.
- nginx.conf — controls which worker ports nginx forwards to.
Steps
All commands run from your dograh/ directory (the one with docker-compose.yaml).
1. Edit .env and change the FASTAPI_WORKERS line:
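For example, to move from the default of 4 to six workers (6 is an illustrative value):

```
# .env
FASTAPI_WORKERS=6
```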
2. Edit nginx.conf and update the upstream dograh_api block so it has exactly one server api:800X line per worker, with ports starting at 8000. Add or remove server lines so the list matches the new FASTAPI_WORKERS value.
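For six workers, the block would look like this (illustrative sketch; keep the rest of your nginx.conf as generated):

```nginx
upstream dograh_api {
    least_conn;
    server api:8000;
    server api:8001;
    server api:8002;
    server api:8003;
    server api:8004;
    server api:8005;
}
```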
3. Recreate the affected containers. The simplest path — brief downtime, no surprises:
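One way to do that, assuming the stock docker-compose.yaml:

```
# Stop the whole stack, then bring it back up with the new configuration.
docker compose down
docker compose up -d
```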
Or limit the restart to the api and nginx containers:
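A sketch, assuming the compose services are named api and nginx:

```
# Recreate only the two affected services, leaving Postgres/Redis/MinIO running.
docker compose up -d --force-recreate api nginx
```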
--force-recreate ensures the api container picks up the new FASTAPI_WORKERS value and nginx re-reads the updated nginx.conf (which is mounted read-only from disk).
4. Verify. Confirm the right number of uvicorn processes are running. The API image is slim and doesn't include ps, so use Docker's host-side view instead:
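For example, using Docker Compose's host-side process listing:

```
# Lists processes inside the api container from the host; expect one uvicorn
# worker per FASTAPI_WORKERS, plus the singleton manager processes.
docker compose top api
```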
Each worker also logs a Uvicorn running on http://0.0.0.0:800X line on boot:
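For example:

```
# One boot line per worker, each announcing its own port.
docker compose logs api | grep "Uvicorn running on"
```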
Why not just re-run setup_remote.sh?
setup_remote.sh refuses to overwrite an existing install by design: re-running it would regenerate OSS_JWT_SECRET (logging everyone out), reset the TURN shared secret (breaking WebRTC auth on connected clients), and regenerate SSL certificates. The two-file edit above is the supported way to change worker count after install.
If you genuinely want a clean reinstall, see the DOGRAH_FORCE_OVERWRITE=1 escape hatch documented in the script.
What this does not scale
Multi-worker mode scales the HTTP/WebSocket API surface. It does not scale:
- ARQ background workers — controlled by ARQ_WORKERS (defaults to 1). Increase this in the API container's environment if your background job queue backs up.
- ari_manager / campaign_orchestrator — singletons by design; they don't benefit from extra processes.
- Postgres, Redis, MinIO — each runs as a single container in the stack. For production-scale Postgres you'd run a managed service and point DATABASE_URL at it; the same applies to Redis and S3-compatible storage.
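A sketch of the relevant environment overrides. The values and hostnames below are hypothetical, and the Redis variable name is an assumption (only DATABASE_URL and ARQ_WORKERS appear on this page):

```
# In .env (or the API service environment) — illustrative values only:
ARQ_WORKERS=4
DATABASE_URL=postgres://dograh:secret@db.example.com:5432/dograh
REDIS_URL=redis://redis.example.com:6379/0
```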
Scaling beyond a single host follows the same pattern: an external load balancer spreading connections across API hosts with a least_conn upstream, just one layer higher.