Performance and scaling

This guide helps you size web and job capacity for Strata self-hosted deployments (Docker Compose, ECS, Kamal).

Runtime roles

| Role | Processes | Scales with |
|------|-----------|-------------|
| Web | Thruster → Puma (WEB_CONCURRENCY workers × WEB_THREADS threads) | HTTP RPS and latency |
| Job | Solid Queue (JOB_CONCURRENCY processes × JOB_THREADS each in config/queue.yml) | Job backlog and completion time |

Use the same container image for both roles. Only the web role is exposed to users.

HTTP capacity (web)

Approximate concurrent request slots per web container:

slots = WEB_CONCURRENCY × WEB_THREADS

Example: WEB_CONCURRENCY=2, WEB_THREADS=5 → 10 slots per container.

For 50–200 RPS with typical Rails I/O-bound requests:

  • Start with 2–4 web replicas (ECS tasks or Compose scale) at 2×5 slots each.
  • Add replicas before raising WEB_THREADS: threads in one Puma worker share the Ruby GVL, so extra threads raise tail latency once requests become CPU-bound.
  • Load test your workload (Turbo pages, API, exports) and watch p95 latency and CPU.
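The sizing rule above can be turned into a rough replica estimate. The following is a minimal sketch, not Strata tooling: it assumes Little's law (in-flight requests ≈ arrival rate × average latency) and a hypothetical `target_utilization` headroom factor. The function names and the 100 ms average latency are illustrative assumptions to be replaced with load-test measurements.

```python
import math


def web_slots(web_concurrency: int, web_threads: int) -> int:
    """Concurrent request slots in one web container."""
    return web_concurrency * web_threads


def replicas_needed(target_rps: float, avg_latency_s: float,
                    slots_per_replica: int,
                    target_utilization: float = 0.6) -> int:
    """Estimate replicas so in-flight requests fit in the usable slots.

    target_utilization is an assumed headroom factor, not a Strata setting.
    """
    in_flight = target_rps * avg_latency_s        # Little's law
    usable = slots_per_replica * target_utilization
    return max(1, math.ceil(in_flight / usable))


slots = web_slots(2, 5)                 # WEB_CONCURRENCY=2, WEB_THREADS=5
print(slots)                            # 10
print(replicas_needed(200, 0.1, slots)) # 200 RPS at an assumed 100 ms average
```

Treat the result as a starting point for a load test rather than a capacity guarantee; p95 latency, not the average, decides whether the estimate holds.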

Postgres connections (web)

Each Puma worker process has its own connection pool per database role. Pool size comes from config/database.yml: DB_POOL_SIZE, or else WEB_THREADS (default 5).

effective_pool = DB_POOL_SIZE ?? WEB_THREADS ?? 5
primary_connections_per_web_task ≈ WEB_CONCURRENCY × effective_pool

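Strata uses four databases (primary, queue, cache, cable), and each process holds a separate pool for every role it connects to. Plan max_connections on RDS for:

total ≈ (web_tasks + job_tasks) × pools × active_roles

Here pools is the per-task connection count for one role (for a web task, WEB_CONCURRENCY × effective_pool as above), and active_roles is the number of databases a process actually connects to (at most four).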

Leave headroom for migrations, BI tools, and replicas.
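As a worked example of the connection arithmetic, the sketch below computes an upper bound on connections. It assumes every process fills its pool against every role, which overstates typical usage. The function names, the `headroom_pct` parameter, and the choice to size job pools with the same database.yml fallback are assumptions for illustration.

```python
import math
from typing import Optional


def effective_pool(db_pool_size: Optional[int] = None,
                   web_threads: Optional[int] = None) -> int:
    """Documented fallback: DB_POOL_SIZE, else WEB_THREADS, else 5."""
    if db_pool_size is not None:
        return db_pool_size
    if web_threads is not None:
        return web_threads
    return 5


def connection_budget(web_tasks: int, web_concurrency: int,
                      job_tasks: int, job_concurrency: int,
                      pool: int, active_roles: int = 4,
                      headroom_pct: int = 20) -> int:
    """Upper-bound max_connections estimate with assumed headroom."""
    web = web_tasks * web_concurrency * pool      # per role, all web tasks
    job = job_tasks * job_concurrency * pool      # per role, all job tasks
    worst_case = (web + job) * active_roles
    return worst_case + math.ceil(worst_case * headroom_pct / 100)


pool = effective_pool(web_threads=5)
# 4 web tasks x 2 workers, 4 job tasks x 4 processes, pool of 5, 4 roles
print(connection_budget(4, 2, 4, 4, pool))  # 576
```

Compare the result with the max_connections value of the chosen RDS instance class. If the bound is too high, lowering DB_POOL_SIZE or the per-task process count reduces it linearly.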

Job capacity (job service)

Approximate concurrent job execution slots per job container:

slots = JOB_CONCURRENCY × JOB_THREADS

Default JOB_THREADS is 3 (config/queue.yml). Installer default: JOB_CONCURRENCY=4 → 12 slots.

Completed jobs per second (rough estimate):

throughput ≈ concurrent_slots / average_job_duration_seconds

Example: JOB_CONCURRENCY=4 × JOB_THREADS=3 = 12 slots; 10s average job duration → ~1.2 jobs/sec per container.
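The same estimate can be inverted to ask how many job containers a target completion rate needs. This is a minimal sketch with hypothetical function names; it assumes jobs are independent and that slots stay fully busy, so real throughput will be lower.

```python
import math


def job_slots(job_concurrency: int, job_threads: int) -> int:
    """Concurrent job execution slots in one job container."""
    return job_concurrency * job_threads


def throughput_per_container(slots: int, avg_job_duration_s: float) -> float:
    """Rough completed jobs per second for one container."""
    return slots / avg_job_duration_s


def job_tasks_needed(target_jobs_per_s: float, slots: int,
                     avg_job_duration_s: float) -> int:
    """Containers needed to sustain a target completion rate."""
    per_container = throughput_per_container(slots, avg_job_duration_s)
    return max(1, math.ceil(target_jobs_per_s / per_container))


slots = job_slots(4, 3)                     # installer defaults: 12 slots
print(throughput_per_container(slots, 10))  # 1.2 jobs/sec at 10 s per job
print(job_tasks_needed(5, slots, 10))       # containers for an assumed 5 jobs/sec
```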

Application limits

Several jobs use limits_concurrency (for example, one active query per result set). A high enqueue rate therefore does not always translate into parallel execution: jobs that share a concurrency key run one after another regardless of free slots. Validate with production-like jobs (query_jobs, export_jobs, deploy).
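To illustrate how a concurrency limit caps parallelism, the sketch below assumes each concurrency key (for example, one result set) allows a fixed number of running jobs. The function name and the per-key limit of 1 are assumptions; the real limits are defined on the individual job classes.

```python
def effective_parallelism(slots: int, distinct_keys: int,
                          limit_per_key: int = 1) -> int:
    """Jobs that can run at once given free slots and per-key limits."""
    return min(slots, distinct_keys * limit_per_key)


# 12 slots, but queued jobs target only 3 result sets: 3 run in parallel.
print(effective_parallelism(12, 3))   # 3
# With 40 distinct result sets, the slots become the limit again.
print(effective_parallelism(12, 40))  # 12
```

When the first case applies, adding job containers does not shorten the backlog; only a wider spread of keys or shorter jobs does.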

Scaling jobs

  1. Increase the desired count of the job ECS service (or run docker compose up --scale job=N).
  2. Increase JOB_CONCURRENCY or JOB_THREADS per task if CPU and DB connections allow.
  3. Consider dedicated job services per queue (advanced) if query jobs dominate.

| Profile | Web | Job | Notes |
|---------|-----|-----|-------|
| Single host (Compose) | 1 container, WEB_CONCURRENCY=2, WEB_THREADS=5 | 1 job service, JOB_CONCURRENCY=4, JOB_THREADS=3 | Installer defaults |
| ECS (50–200 RPS) | 2–4 tasks, 2 vCPU / 4 GB | 2–4 tasks, 2 vCPU / 4 GB | Scale on CPU and queue depth |
| Stretch (high job rate) | Scale web separately | Many job tasks; benchmark before committing | See limits above |

Load testing checklist

Before scaling to large fleets:

  1. Measure average and p95 job duration for query_jobs and export_jobs.
  2. Measure HTTP p95 under expected concurrent users (not just /up).
  3. Watch RDS connections, CPU, and Solid Queue ready execution count.
  4. Confirm STRATA_RUN_DB_PREPARE=false on all job tasks.
  5. Confirm production web tasks do not process jobs (run a job service; omit HANDLE_JOBS_IN_WEB_SERVER on web).

Example observability queries

Monitor Solid Queue depth in the queue database (solid_queue_ready_executions, solid_queue_jobs) and Solid Queue process heartbeats (solid_queue_processes).

Tools

Use your preferred HTTP load generator (k6, Locust, Gatling) against representative UI and API paths. Enqueue jobs at target rates and compare completion latency to backlog growth.
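One way to judge backlog growth during such a test is to sample the ready-execution count at intervals and look at its slope. The following is a minimal sketch with hypothetical names; it assumes the samples are (seconds, ready count) pairs collected separately, for example by periodically counting rows in solid_queue_ready_executions.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, ready execution count)


def backlog_growth_per_s(samples: List[Sample]) -> float:
    """Average change in queue depth per second between first and last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (d1 - d0) / (t1 - t0)


def is_keeping_up(samples: List[Sample], tolerance_per_s: float = 0.0) -> bool:
    """True when the backlog is flat or shrinking within the tolerance."""
    return backlog_growth_per_s(samples) <= tolerance_per_s


readings = [(0, 100), (30, 130), (60, 160)]
print(backlog_growth_per_s(readings))  # 1.0 job/sec of net backlog growth
print(is_keeping_up(readings))         # False
```

A sustained positive slope at the target enqueue rate means job capacity is below demand, so add job tasks or slots before increasing load further.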