Performance and scaling

This guide helps you size web and job capacity for Strata self-hosted deployments (Docker Compose, ECS, Kamal).

Runtime roles

Role	Processes	Scales with
Web	Thruster → Puma (`WEB_CONCURRENCY` workers × `WEB_THREADS` threads)	HTTP RPS and latency
Job	Solid Queue (`JOB_CONCURRENCY` processes × `JOB_THREADS` threads each, set in `config/queue.yml`)	Job backlog and completion time

Use the same container image for both roles. Only the web role is exposed to users.

HTTP capacity (web)

Approximate concurrent request slots per web container:

slots = WEB_CONCURRENCY × WEB_THREADS

Example: WEB_CONCURRENCY=2, WEB_THREADS=5 → 10 slots per container.
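
The slot formula above can be expressed as a small helper. This is only a sketch: the function name `web_slots` is hypothetical and not part of Strata.

```python
def web_slots(web_concurrency: int, web_threads: int) -> int:
    """Approximate concurrent request slots for one web container."""
    if web_concurrency < 1 or web_threads < 1:
        raise ValueError("worker and thread counts must be positive")
    # One slot per Puma thread, summed over all worker processes.
    return web_concurrency * web_threads


# WEB_CONCURRENCY=2, WEB_THREADS=5, as in the example above.
print(web_slots(2, 5))  # 10
```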

For 50–200 RPS with typical Rails I/O-bound requests:

Start with 2–4 web replicas (ECS tasks or Compose scale) at 2×5 slots each.
Add replicas before raising thread counts: threads inside one Puma worker contend for the Ruby GVL, so more threads per worker tends to raise tail latency.
Load test your workload (Turbo pages, API, exports) and watch p95 latency and CPU.
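
One way to sanity-check a replica count before load testing is Little's law (requests in flight ≈ arrival rate × average latency). The sketch below is an assumption-laden estimate, not a Strata sizing rule: the 100 ms average latency and the 50% target utilization are illustrative values, and `replicas_needed` is a hypothetical helper.

```python
import math


def replicas_needed(rps: int, avg_latency_ms: int, slots_per_replica: int,
                    target_utilization: float = 0.5) -> int:
    """Estimate web replicas from Little's law, keeping slots partly idle."""
    # Requests in flight at steady state: arrival rate × time in system.
    in_flight = rps * avg_latency_ms / 1000
    # Plan to use only a fraction of each replica's slots (assumed 50%).
    usable_slots = slots_per_replica * target_utilization
    return max(1, math.ceil(in_flight / usable_slots))


# 200 RPS at an assumed 100 ms average latency, 2×5 = 10 slots per replica.
print(replicas_needed(200, 100, 10))       # 4 replicas at 50% utilization
print(replicas_needed(200, 100, 10, 1.0))  # 2 replicas with no headroom
```

Treat the result as a starting point for the load test, not a substitute for it. Real latency distributions and GVL contention are not captured by this model.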

Postgres connections (web)

Each Puma worker process has its own connection pool per database role. The pool size is set in config/database.yml: it uses DB_POOL_SIZE if set, otherwise WEB_THREADS, otherwise 5.

effective_pool = DB_POOL_SIZE ?? WEB_THREADS ?? 5
primary_connections_per_web_task ≈ WEB_CONCURRENCY × effective_pool
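
The fallback chain above can be sketched as follows. The helper names are hypothetical, and treating an empty variable as unset is an assumption; check config/database.yml for the exact behavior.

```python
from typing import Mapping


def effective_pool(env: Mapping[str, str]) -> int:
    """Resolve the pool size: DB_POOL_SIZE, else WEB_THREADS, else 5."""
    for name in ("DB_POOL_SIZE", "WEB_THREADS"):
        value = env.get(name)
        if value:  # assumption: unset and empty are both treated as missing
            return int(value)
    return 5


def primary_connections_per_web_task(env: Mapping[str, str]) -> int:
    """Connections one web task may open to the primary database."""
    return int(env["WEB_CONCURRENCY"]) * effective_pool(env)


print(primary_connections_per_web_task(
    {"WEB_CONCURRENCY": "2", "WEB_THREADS": "5"}))  # 10
print(primary_connections_per_web_task(
    {"WEB_CONCURRENCY": "2", "WEB_THREADS": "5", "DB_POOL_SIZE": "8"}))  # 16
```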

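Strata uses four databases (primary, queue, cache, cable), and a process keeps a separate pool for each one it connects to. Size max_connections on RDS for roughly:

total ≈ (web_tasks + job_tasks) × pools × active_roles

Read pools as the connections one task opens per database (for a web task, WEB_CONCURRENCY × effective_pool) and active_roles as how many of the four databases that task actually uses.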

Leave headroom for migrations, BI tools, and replicas.
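
A rough connection budget can be sketched from the formula above. The interpretation of `pools` (connections one task opens per database), the single shared `pools` value for web and job tasks, and the 25% headroom figure are all assumptions made for illustration.

```python
import math


def planned_connections(web_tasks: int, job_tasks: int, pools: int,
                        active_roles: int, headroom: float = 0.25) -> int:
    """Estimate the max_connections needed, including spare capacity."""
    # total ≈ (web_tasks + job_tasks) × pools × active_roles
    total = (web_tasks + job_tasks) * pools * active_roles
    # Reserve extra room for migrations, BI tools, and replicas.
    return math.ceil(total * (1 + headroom))


# 4 web + 4 job tasks, 10 connections per database, all four databases in use.
print(planned_connections(4, 4, pools=10, active_roles=4))  # 400
```

Compare the result with the max_connections value of your RDS instance class. If job tasks use a different pool size than web tasks, compute the two groups separately and add them.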

Job capacity (job service)

Approximate concurrent job execution slots per job container:

slots = JOB_CONCURRENCY × JOB_THREADS

JOB_THREADS defaults to 3 (set in config/queue.yml). The installer sets JOB_CONCURRENCY=4, which gives 12 slots per job container.

Completed jobs per second (rough estimate):

throughput ≈ concurrent_slots / average_job_duration_seconds

Example: JOB_CONCURRENCY=4 × JOB_THREADS=3 = 12 slots; 10s average job duration → ~1.2 jobs/sec per container.
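
The same estimate as a function, for trying other values. `job_throughput` is a hypothetical helper, and the result is an upper bound that assumes every slot stays busy.

```python
def job_throughput(job_concurrency: int, job_threads: int,
                   avg_job_seconds: float) -> float:
    """Rough completed jobs per second for one job container."""
    slots = job_concurrency * job_threads
    # Each slot finishes one job every avg_job_seconds on average.
    return slots / avg_job_seconds


# Installer defaults with a 10 s average job.
print(job_throughput(4, 3, 10))  # 1.2
```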

Application limits

Several jobs use limits_concurrency (for example, one active query per result set). Jobs that share a concurrency key are held back once the limit is reached, even when slots are free, so a high enqueue rate does not always translate into parallel execution. Validate with production-like jobs (query_jobs, export_jobs, deploy).
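
The effect of a concurrency limit on throughput can be approximated with a simple cap. This is a simplified model under stated assumptions: a per-key limit of 1, jobs spread evenly across keys, and hypothetical helper names. It does not reproduce Solid Queue's actual scheduling.

```python
def effective_parallelism(slots: int, distinct_keys: int,
                          limit_per_key: int = 1) -> int:
    """Upper bound on limited jobs that can run at the same time."""
    # Free slots do not help once every key is at its limit.
    return min(slots, distinct_keys * limit_per_key)


def capped_throughput(slots: int, distinct_keys: int, avg_job_seconds: float,
                      limit_per_key: int = 1) -> float:
    """Rough jobs per second when concurrency limits apply."""
    running = effective_parallelism(slots, distinct_keys, limit_per_key)
    return running / avg_job_seconds


# 12 slots, but queries target only 3 result sets: at most 3 run in parallel.
print(effective_parallelism(12, 3))   # 3
print(capped_throughput(12, 3, 10))   # 0.3
```

In this model, adding job containers does not raise throughput until the number of distinct keys exceeds the slot count.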

Scaling jobs

Increase the desired count of the job ECS service (or run docker compose up --scale job=N).
Increase JOB_CONCURRENCY or JOB_THREADS per task if CPU and DB connections allow.
Consider dedicated job services per queue (advanced) if query jobs dominate.
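
To pick a starting job task count for a target completion rate, the throughput estimate can be inverted. This sketch assumes fully utilized slots and no concurrency limits. The target rate and `job_tasks_needed` are illustrative, not Strata defaults.

```python
import math


def job_tasks_needed(target_jobs_per_sec: float, avg_job_seconds: float,
                     slots_per_task: int) -> int:
    """Estimate job tasks required to keep up with a target job rate."""
    # Slots needed = jobs arriving per second × seconds each job occupies a slot.
    slots_needed = target_jobs_per_sec * avg_job_seconds
    return max(1, math.ceil(slots_needed / slots_per_task))


# 5 jobs/sec at 10 s each needs 50 slots; at 12 slots per task that is 5 tasks.
print(job_tasks_needed(5, 10, 12))  # 5
```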

Recommended starting points

Profile	Web	Job	Notes
Single host (Compose)	1 container, `WEB_CONCURRENCY=2`, `WEB_THREADS=5`	1 `job` service, `JOB_CONCURRENCY=4`, `JOB_THREADS=3`	Installer defaults
ECS (50–200 RPS)	2–4 tasks, 2 vCPU / 4 GB	2–4 tasks, 2 vCPU / 4 GB	Scale on CPU and queue depth
Stretch (high job rate)	Scale web separately	Many job tasks; benchmark before committing	See limits above

Load testing checklist

Before scaling to large fleets:

Measure average and p95 job duration for query_jobs and export_jobs.
Measure HTTP p95 under expected concurrent users (not just /up).
Watch RDS connections, CPU, and Solid Queue ready execution count.
Confirm STRATA_RUN_DB_PREPARE=false on all job tasks.
Confirm production web tasks do not process jobs (run a job service; omit HANDLE_JOBS_IN_WEB_SERVER on web).

Example observability queries

Monitor Solid Queue depth in the queue database (solid_queue_ready_executions, solid_queue_jobs) and Solid Queue process heartbeats (solid_queue_processes).
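
The queries below are a sketch of what such monitoring might look like. The column names (queue_name, finished_at, name, kind, last_heartbeat_at) are assumed from the standard Solid Queue schema, so verify them against your queue schema before relying on them. The script runs the SQL against an in-memory SQLite stand-in with made-up rows so that it is self-contained. In production you would run the same SELECT statements against the Postgres queue database.

```python
import sqlite3

# Ready-to-run jobs per queue: backlog that workers have not yet claimed.
READY_DEPTH_SQL = """
SELECT queue_name, COUNT(*) AS ready
FROM solid_queue_ready_executions
GROUP BY queue_name
ORDER BY ready DESC
"""

# Jobs that are enqueued but not yet marked finished.
UNFINISHED_JOBS_SQL = """
SELECT COUNT(*) FROM solid_queue_jobs WHERE finished_at IS NULL
"""

# Oldest heartbeats first, so stalled processes surface at the top.
HEARTBEATS_SQL = """
SELECT name, kind, last_heartbeat_at
FROM solid_queue_processes
ORDER BY last_heartbeat_at ASC
"""


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory stand-in with made-up rows (not real Strata data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE solid_queue_ready_executions (job_id INTEGER, queue_name TEXT);
        CREATE TABLE solid_queue_jobs (id INTEGER, queue_name TEXT, finished_at TEXT);
        CREATE TABLE solid_queue_processes (name TEXT, kind TEXT, last_heartbeat_at TEXT);
        INSERT INTO solid_queue_jobs VALUES
            (1, 'queries', NULL), (2, 'queries', NULL),
            (3, 'exports', NULL), (4, 'exports', '2024-01-01 00:00:05');
        INSERT INTO solid_queue_ready_executions VALUES
            (1, 'queries'), (2, 'queries'), (3, 'exports');
        INSERT INTO solid_queue_processes VALUES
            ('worker-a', 'Worker', '2024-01-01 00:00:10'),
            ('worker-b', 'Worker', '2024-01-01 00:00:40');
        """
    )
    return conn


conn = demo_connection()
ready_depth = conn.execute(READY_DEPTH_SQL).fetchall()
unfinished = conn.execute(UNFINISHED_JOBS_SQL).fetchone()[0]
heartbeats = conn.execute(HEARTBEATS_SQL).fetchall()

print(ready_depth)    # [('queries', 2), ('exports', 1)]
print(unfinished)     # 3
print(heartbeats[0])  # ('worker-a', 'Worker', '2024-01-01 00:00:10')
```

A ready count that keeps growing while heartbeats stay fresh suggests too few slots. Stale heartbeats point to stalled or crashed job processes.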

Tools

Use your preferred HTTP load generator (k6, Locust, Gatling) against representative UI and API paths. Enqueue jobs at target rates and compare completion latency to backlog growth.
