Performance and scaling
This guide helps you size web and job capacity for Strata self-hosted deployments (Docker Compose, ECS, Kamal).
Runtime roles
| Role | Processes | Scales with |
|---|---|---|
| Web | Thruster → Puma (WEB_CONCURRENCY workers × WEB_THREADS threads) | HTTP RPS and latency |
| Job | Solid Queue (JOB_CONCURRENCY processes × JOB_THREADS each in config/queue.yml) | Job backlog and completion time |
Use the same container image for both roles. Only the web role is exposed to users.
HTTP capacity (web)
Approximate concurrent request slots per web container:
slots = WEB_CONCURRENCY × WEB_THREADS
Example: WEB_CONCURRENCY=2, WEB_THREADS=5 → 10 slots per container.
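The slot arithmetic above can be sketched as a small helper. This is only an illustration of the formula; the function name `web_slots` and the `replicas` parameter are hypothetical and not part of Strata.

```python
def web_slots(web_concurrency: int, web_threads: int, replicas: int = 1) -> int:
    """Approximate concurrent request slots across `replicas` web containers.

    Hypothetical helper: slots = WEB_CONCURRENCY x WEB_THREADS per container.
    """
    return web_concurrency * web_threads * replicas


# Installer-style values from the example above.
print(web_slots(2, 5))               # 10 slots in one container
print(web_slots(2, 5, replicas=4))   # 40 slots across four containers
```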
For 50–200 RPS with typical Rails I/O-bound requests:
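- Start with 2–4 web replicas (ECS tasks or Compose scale), each with WEB_CONCURRENCY=2 and WEB_THREADS=5 (10 slots).
- Increase replica count before maxing out threads: the Ruby GVL lets only one thread per process run Ruby code at a time, so extra threads raise tail latency.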
- Load test your workload (Turbo pages, API, exports) and watch p95 latency and CPU.
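One rough way to sanity-check a replica count before load testing is Little's law (busy slots ≈ request rate × average latency). The sketch below is not Strata's sizing method; the function name, the 100 ms latency, and the 70% utilization target are all illustrative assumptions.

```python
import math


def replicas_needed(rps: float, avg_latency_s: float, slots_per_replica: int,
                    target_utilization: float = 0.7) -> int:
    """Estimate web replicas from Little's law (hypothetical helper).

    busy_slots = rps x avg_latency_s; target_utilization leaves spare
    capacity for bursts and is an assumed value, not a Strata setting.
    """
    busy_slots = rps * avg_latency_s
    usable_slots = slots_per_replica * target_utilization
    return max(1, math.ceil(busy_slots / usable_slots))


# Assumed 100 ms average latency, 10 slots per replica (2 x 5).
print(replicas_needed(50, 0.1, 10))    # 1
print(replicas_needed(200, 0.1, 10))   # 3
```

Treat the result as a starting point only; measured p95 latency and CPU under load should decide the final count.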
Postgres connections (web)
Each Puma worker process has its own connection pool per database role. Pool size is set in config/database.yml from DB_POOL_SIZE, falling back to WEB_THREADS and then to a default of 5.
effective_pool = DB_POOL_SIZE ?? WEB_THREADS ?? 5
primary_connections_per_web_task ≈ WEB_CONCURRENCY × effective_pool
Strata uses four databases (primary, queue, cache, cable). Plan max_connections on RDS for:
total ≈ (web_tasks + job_tasks) × pools × active_roles
Leave headroom for migrations, BI tools, and replicas.
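The connection budget can be sketched as a worst-case calculation. This is one reading of the formulas above, with several assumptions: every pool fills completely for every database role (Rails opens connections lazily, so real usage is usually lower), job processes use the same pool size as web, and each job task runs JOB_CONCURRENCY processes. The function names are hypothetical.

```python
from typing import Optional


def effective_pool(db_pool_size: Optional[int] = None,
                   web_threads: Optional[int] = None) -> int:
    """DB_POOL_SIZE if set, else WEB_THREADS, else 5."""
    if db_pool_size is not None:
        return db_pool_size
    if web_threads is not None:
        return web_threads
    return 5


def worst_case_connections(tasks: int, processes_per_task: int,
                           pool: int, active_roles: int) -> int:
    """Upper bound: every process fills its pool for every active role."""
    return tasks * processes_per_task * pool * active_roles


pool = effective_pool(web_threads=5)
# Assumed fleet: 4 web tasks (2 workers each) and 4 job tasks (4 processes each),
# all four database roles active.
web = worst_case_connections(tasks=4, processes_per_task=2, pool=pool, active_roles=4)
job = worst_case_connections(tasks=4, processes_per_task=4, pool=pool, active_roles=4)
print(web, job, web + job)  # 160 320 480
```

Compare the total against max_connections for the instances that host each database, then add the headroom mentioned above.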
Job capacity (job service)
Approximate concurrent job execution slots per job container:
slots = JOB_CONCURRENCY × JOB_THREADS
Default JOB_THREADS is 3 (config/queue.yml). Installer default: JOB_CONCURRENCY=4 → 12 slots.
Completed jobs per second (rough estimate):
throughput ≈ concurrent_slots / average_job_duration_seconds
Example: JOB_CONCURRENCY=4 × JOB_THREADS=3 = 12 slots; 10s average job duration → ~1.2 jobs/sec per container.
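The same estimate as a small function, in case you want to try other values. The name `job_throughput` and the `containers` parameter are hypothetical; the result assumes every slot stays busy.

```python
def job_throughput(job_concurrency: int, job_threads: int,
                   avg_job_duration_s: float, containers: int = 1) -> float:
    """Rough completed jobs per second: concurrent slots / average duration."""
    slots = job_concurrency * job_threads * containers
    return slots / avg_job_duration_s


print(job_throughput(4, 3, 10.0))                 # 1.2 jobs/sec for one container
print(job_throughput(4, 3, 10.0, containers=3))   # 3.6 jobs/sec for three containers
```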
Application limits
Several jobs use limits_concurrency (for example, one active query per result set), so a high enqueue rate does not always translate into parallel execution. Validate capacity with production-like jobs (query_jobs, export_jobs, deploy).
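To see how such a limit caps parallelism, the sketch below assumes the limit applies per concurrency key, as Solid Queue's limits_concurrency does with its `to:` and `key:` options. The function and parameter names are hypothetical, and the key counts are illustrative.

```python
def effective_parallelism(slots: int, limit_per_key: int, distinct_keys: int) -> int:
    """Jobs of one class that can run at once (hypothetical helper).

    Capped both by worker slots and by limit_per_key for each distinct key.
    """
    return min(slots, limit_per_key * distinct_keys)


# 12 slots, one active query per result set.
print(effective_parallelism(12, 1, 3))    # 3: only three result sets have work
print(effective_parallelism(12, 1, 50))   # 12: worker slots become the limit
```

When the first case applies, adding job containers does not raise throughput for that job class.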
Scaling jobs
- Increase the job ECS service desired count (or docker compose up --scale job=N).
- Increase JOB_CONCURRENCY or JOB_THREADS per task if CPU and DB connections allow.
- Consider dedicated job services per queue (advanced) if query jobs dominate.
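To pick a desired count for the first option, the throughput estimate can be inverted. This is a sketch only: the function name is hypothetical, the defaults mirror the installer values quoted earlier, and it ignores concurrency limits and database connection ceilings.

```python
import math


def job_tasks_needed(target_jobs_per_s: float, avg_job_duration_s: float,
                     job_concurrency: int = 4, job_threads: int = 3) -> int:
    """Job tasks needed to sustain a target completion rate (rough estimate)."""
    slots_needed = target_jobs_per_s * avg_job_duration_s
    slots_per_task = job_concurrency * job_threads
    return max(1, math.ceil(slots_needed / slots_per_task))


# 5 jobs/sec at 10 s each needs 50 slots; at 12 slots per task that is 5 tasks.
print(job_tasks_needed(5, 10))
```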
Recommended starting points
| Profile | Web | Job | Notes |
|---|---|---|---|
| Single host (Compose) | 1 container, WEB_CONCURRENCY=2, WEB_THREADS=5 | 1 job service, JOB_CONCURRENCY=4, JOB_THREADS=3 | Installer defaults |
| ECS (50–200 RPS) | 2–4 tasks, 2 vCPU / 4 GB | 2–4 tasks, 2 vCPU / 4 GB | Scale on CPU and queue depth |
| Stretch (high job rate) | Scale web separately | Many job tasks; benchmark before committing | See Application limits above |
Load testing checklist
Before scaling to large fleets:
- Measure average and p95 job duration for query_jobs and export_jobs.
- Measure HTTP p95 under expected concurrent users (not just /up).
- Watch RDS connections, CPU, and Solid Queue ready execution count.
- Confirm STRATA_RUN_DB_PREPARE=false on all job tasks.
- Confirm production web tasks do not process jobs (run a job service; omit HANDLE_JOBS_IN_WEB_SERVER on web).
Example observability queries
Monitor Solid Queue depth in the queue database (solid_queue_ready_executions, solid_queue_jobs) and Solid Queue process heartbeats (solid_queue_processes).
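A minimal sketch of a queue-depth query follows. It assumes the stock Solid Queue schema, where solid_queue_ready_executions has queue_name and created_at columns. To stay runnable anywhere, the example uses an in-memory SQLite table as a stand-in for the Postgres queue database; the sample rows and queue names are illustrative only.

```python
import sqlite3

# Assumed column names from the stock Solid Queue schema.
READY_DEPTH_SQL = """
SELECT queue_name, COUNT(*) AS depth, MIN(created_at) AS oldest_ready_at
FROM solid_queue_ready_executions
GROUP BY queue_name
ORDER BY depth DESC
"""


def ready_depth(conn):
    """Return (queue_name, depth, oldest_ready_at) rows, deepest queue first."""
    return conn.execute(READY_DEPTH_SQL).fetchall()


# Stand-in table with illustrative rows; in production, run the SQL against
# the queue database instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE solid_queue_ready_executions "
    "(job_id INTEGER, queue_name TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO solid_queue_ready_executions VALUES (?, ?, ?)",
    [
        (1, "query_jobs", "2024-01-01 00:00:00"),
        (2, "query_jobs", "2024-01-01 00:00:05"),
        (3, "export_jobs", "2024-01-01 00:00:02"),
    ],
)
rows = ready_depth(conn)
print(rows)
```

A similar query over solid_queue_processes, comparing its last_heartbeat_at column to the current time, can flag stalled workers, again assuming the stock schema.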
Tools
Use your preferred HTTP load generator (k6, Locust, Gatling) against representative UI and API paths. Enqueue jobs at target rates and compare completion latency to backlog growth.
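When comparing enqueue rates to capacity, it helps to know up front whether a backlog should grow or drain. The sketch below reuses the rough throughput estimate (slots / average duration); both function names are hypothetical and the rates are illustrative.

```python
from typing import Optional


def backlog_growth_per_s(enqueue_rate: float, slots: int,
                         avg_job_duration_s: float) -> float:
    """Positive: backlog grows. Negative: backlog drains."""
    return enqueue_rate - slots / avg_job_duration_s


def seconds_to_drain(backlog: int, enqueue_rate: float, slots: int,
                     avg_job_duration_s: float) -> Optional[float]:
    """Time to clear `backlog`, or None if capacity cannot keep up."""
    growth = backlog_growth_per_s(enqueue_rate, slots, avg_job_duration_s)
    if growth >= 0:
        return None
    return backlog / -growth


# 12 slots and 10 s jobs sustain about 1.2 jobs/sec.
print(backlog_growth_per_s(2.0, 12, 10.0))     # about 0.8 jobs/sec of growth
print(seconds_to_drain(600, 0.2, 12, 10.0))    # about 600 seconds
print(seconds_to_drain(600, 2.0, 12, 10.0))    # None: backlog never drains
```

If the measured backlog grows faster than this predicts, check for concurrency limits or longer job durations under load.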