Skip to main content

Multi-Datasource Queries

Query across multiple databases seamlessly.

Overview

Strata supports querying across multiple datasources. The planner automatically merges results from different databases.

How It Works

  1. Universe Resolution - Planner finds paths through multiple datasources
  2. Query Routing - Routes queries to appropriate datasources
  3. Result Merging - Combines results from multiple datasources
  4. Tier-Based Selection - Prefers hot tier over cold tier

Tier Configuration

Set tier on each datasource:

# Hot tier (preferred)
hot_warehouse:
adapter: postgres
tier: hot

# Warm tier
warm_warehouse:
adapter: postgres
tier: warm

# Cold tier (last resort)
cold_archive:
adapter: athena
tier: cold

Query Routing

Planner routes queries based on:

  1. Tier - Prefers hot over warm over cold
  2. Cost - Lower cost tables preferred
  3. Partitions - Matches partition constraints
  4. Availability - Uses available datasources

Use Cases

Hot and Cold Tiers

# Recent data in hot tier
recent_sales:
datasource: hot_warehouse
cost: 10

# Historical data in cold tier
historical_sales:
datasource: cold_archive
cost: 200

Query behavior:

  • Recent queries → hot tier
  • Historical queries → cold tier
  • Cross-period queries → both tiers (merged)

Multi-Region

# US region
us_sales:
datasource: us_warehouse
tier: hot

# EU region
eu_sales:
datasource: eu_warehouse
tier: hot

Query behavior:

  • US queries → US warehouse
  • EU queries → EU warehouse
  • Global queries → both warehouses (merged)

Best Practices

  1. Set appropriate tiers - Match performance characteristics
  2. Use partitions - Help planner route queries
  3. Set costs - Lower cost for preferred datasources
  4. Test routing - Verify planner routes queries correctly

Limitations

  • No cross-datasource joins - Relationships are scoped to single datasource
  • Result merging - Results combined at query level, not join level
  • Performance - Multi-datasource queries may be slower

Next Steps