Datasources

Configure database connections and understand tier behavior.

What are Datasources?

Datasources connect Strata to your databases. They define:

Database adapter (PostgreSQL, Snowflake, MySQL, etc.)
Connection details (host, port, database, schema)
Authentication credentials
Tier (hot, warm, cold) — influences query routing and performance

For CLI commands to add, test, and manage datasources, see Managing Datasources.

Tier Configuration

Strata supports three tiers. They control which datasource is preferred when the same data or compatible tables exist in more than one place, and how query cost is estimated.

hot

Primary datasource for fast, frequently accessed data. Highest priority for query routing.

Use for:

Primary data warehouse
Main OLAP or analytics database
Data that must respond quickly

warehouse:
  adapter: postgres
  tier: hot
  # ... connection details

warm

Secondary datasource for less frequently accessed or moderately slower data. Used when a hot-tier source is unavailable or when the planner chooses it for cost or availability.

Use for:

Replicas or read replicas
Archived but still reasonably fast data
Secondary databases

archive:
  adapter: postgres
  tier: warm
  # ... connection details

cold

Archive or slow-access data. Used as a last resort when hot or warm tiers cannot satisfy the query.

Use for:

S3/Athena or object-store–backed query engines
Historical or infrequently queried data
Data lakes with higher latency

historical:
  adapter: athena
  tier: cold
  # ... connection details

Adding a Datasource

Via CLI

strata datasource add [ADAPTER]

Examples: strata datasource add postgres, strata datasource add snowflake, strata datasource add mysql.

The CLI prompts for: datasource key, display name, adapter-specific settings, credentials, and tier.

Manual Configuration

Edit datasources.yml:

warehouse:
  adapter: postgres
  name: Production Warehouse
  description: Main data warehouse
  tier: hot
  query_timeout: 3600
  host: warehouse.example.com
  port: 5432
  database: analytics
  schema: public
  ssl: true
  username: strata_user
  # password: (set via strata datasource auth)

Setting Credentials

Credentials are stored in .strata (gitignored):

strata datasource auth warehouse

Testing Connections

strata datasource test warehouse

Verifies connection parameters, authentication, and database accessibility.

Supported Adapters

PostgreSQL — postgres
Snowflake — snowflake
MySQL — mysql
SQL Server — sqlserver
Athena — athena
Trino — trino
DuckDB — duckdb
Druid — druid

Semantic model and data source scoping

Each table in your semantic model belongs to exactly one datasource:

Relationships are defined within a datasource; cross-datasource joins are not supported.
Universes and query plans are built per datasource, then optionally merged at the result level (see Multi-Datasource).
Complex/compound measures and automatic data blending operate inside a single datasource – you cannot build one measure that mixes fields from Snowflake and Postgres, for example.

In a single Strata project you can still:

Model different domains in different warehouses (for example, marketing in BigQuery via Trino and finance in Snowflake).
Run multi-datasource queries where compatible measures are computed in each datasource and merged, as long as each individual measure respects the single-datasource rule.

Runtime engines and SQLite

Strata targets modern analytical databases as runtime engines:

Production deployments typically use engines like PostgreSQL, Snowflake, Athena, Trino, or similar warehouses.
Lightweight engines like DuckDB are great for local development, tutorials, and integration tests.

While it is technically possible to point a datasource at SQLite or other single-node engines, this is best reserved for demos or experimentation:

Many advanced SQL features (JSON/JSONB, window functions, date/time helpers) behave differently or are missing.
Performance and concurrency characteristics are usually not suitable for shared production workloads.

For production, prefer a fully supported warehouse adapter and treat SQLite-style engines as development conveniences only.

Query Timeout

warehouse:
  adapter: postgres
  query_timeout: 3600  # seconds (1 hour)
  # ... connection details

Best Practices

Use descriptive keys — e.g. warehouse, not db1
Set tiers to match performance — hot for primary, cold for S3/Athena-style sources
Store credentials in .strata — never commit them
Test connections — run strata datasource test before deploy
Set timeouts — avoid unbounded long-running queries
Use SSL in production

Security

Credentials in .strata (gitignored)
File permissions 0600 (owner only)
API keys not accepted on the command line
Remote credentials stored encrypted on the server

Auth and row-level security (enterprise / forward-looking)

For enterprise deployments, Strata is designed to work alongside warehouse-level security controls:

Per-user database credentials (for example, Snowflake OAuth, SSO, or delegated tokens) can be used so that queries run under each user’s own identity.
Row-level security (RLS) policies defined in the warehouse continue to apply, and semantic measures should be modeled assuming those policies are in effect.

Exact configuration options for per-user auth and advanced security vary by adapter and may evolve over time. When in doubt:

Keep sensitive access rules in the warehouse itself.
Use the semantic model for consistent metric definitions and query routing, not as a replacement for RLS.

Next Steps

Managing Datasources — CLI commands
Adapters — Adapter-specific configuration
multi-datasource — Cross-datasource queries
cost-optimization

What are Datasources?​

Tier Configuration​

hot​

warm​

cold​

Adding a Datasource​

Via CLI​

Manual Configuration​

Setting Credentials​

Testing Connections​

Supported Adapters​

Semantic model and data source scoping​

Runtime engines and SQLite​

Query Timeout​

Best Practices​

Security​

Auth and row-level security (enterprise / forward-looking)​

Next Steps​

What are Datasources?

Tier Configuration

hot

warm

cold

Adding a Datasource

Via CLI

Manual Configuration

Setting Credentials

Testing Connections

Supported Adapters

Semantic model and data source scoping

Runtime engines and SQLite

Query Timeout

Best Practices

Security

Auth and row-level security (enterprise / forward-looking)

Next Steps