Semantic Model
Deep dive into Strata's semantic modeling capabilities.
Overview
The semantic model is the core of Strata. It transforms raw database tables into business-friendly models that enable self-service analytics.
Critical Rules
Before diving into components, understand these non-negotiable design principles:
Every field name must be unique across your entire semantic layer.
Unlike tools that namespace fields (for example, table.field_name), Strata treats global uniqueness as a fundamental design principle.
Benefits:
- Unambiguous references: "Total Revenue" means exactly one thing
- Compound measures: [Total Revenue] - [Total Cost] works without table prefixes
- AI integration: LLMs can reference fields without disambiguation
- Consistent metrics: Organization-wide definitions
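To illustrate why globally unique names make compound measures simple, here is a minimal Python sketch of how bracketed references could be resolved by name alone. It is not Strata's implementation: the `FIELDS` registry, the `expand` helper, and the SQL expressions are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical registry: globally unique field name -> SQL expression.
# These names and SQL fragments are illustrative, not from a real Strata model.
FIELDS = {
    "Total Revenue": "SUM(sales.revenue)",
    "Total Cost": "SUM(sales.cost)",
}

def expand(expression: str, fields: dict) -> str:
    """Replace each [Field Name] reference with its parenthesized SQL expression."""
    def lookup(match):
        name = match.group(1)
        if name not in fields:
            raise KeyError(f"Unknown field: {name}")
        return f"({fields[name]})"
    return re.sub(r"\[([^\]]+)\]", lookup, expression)

sql = expand("[Total Revenue] - [Total Cost]", FIELDS)
print(sql)
```

Because each name maps to exactly one definition, the lookup needs no table prefix and no disambiguation step.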
Considerations:
- Requires thoughtful naming conventions from the start
- Use prefixes or suffixes to avoid conflicts (e.g., order_revenue, subscription_revenue)
# These cannot coexist in the same project:
# In tbl.store_sales.yml
- name: Total Revenue # ✓ First definition
# In tbl.catalog_sales.yml
- name: Total Revenue # ✗ ERROR: Name already used
- name: Catalog Revenue # ✓ Use a unique name instead
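The uniqueness rule can be pictured as a project-wide check over every table file. The following Python sketch assumes the YAML files have already been parsed into a simple mapping of file name to field names. That mapping, the `find_duplicate_fields` helper, and the "Store Name" field are hypothetical, and the error text is not Strata's actual message.

```python
from collections import defaultdict

# Hypothetical in-memory view of parsed table files: file name -> field names.
tables = {
    "tbl.store_sales.yml": ["Total Revenue", "Store Name"],
    "tbl.catalog_sales.yml": ["Total Revenue", "Catalog Revenue"],
}

def find_duplicate_fields(tables):
    """Return {field name: [files defining it]} for names defined more than once."""
    seen = defaultdict(list)
    for file_name, field_names in tables.items():
        for name in field_names:
            seen[name].append(file_name)
    return {name: files for name, files in seen.items() if len(files) > 1}

duplicates = find_duplicate_fields(tables)
for name, files in duplicates.items():
    print(f"ERROR: '{name}' is defined in {', '.join(files)}")
```

A check like this could run in CI before deployment so that naming conflicts surface early.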
many_to_many cardinality is not supported. This design constraint prevents double-counting bugs and ensures predictable aggregations.
Workaround: Use junction tables with two separate relationships.
# ✗ NOT SUPPORTED
users_roles:
  cardinality: many_to_many
# ✓ USE JUNCTION TABLE
users_user_roles:
  left: Users
  right: User Roles
  cardinality: one_to_many
user_roles_roles:
  left: User Roles
  right: Roles
  cardinality: many_to_one
Why: This approach makes relationships explicit and prevents common aggregation errors where measures get counted multiple times.
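The double-counting problem is easy to reproduce with a few rows of data. The Python sketch below uses made-up users, roles, and a `spend` measure (all hypothetical) to show how joining a per-user measure across a user-to-role mapping inflates the total. It illustrates the general fan-out effect, not Strata's planner.

```python
# Illustrative data only: two users with a per-user measure (spend),
# and a junction table assigning roles to users.
users = [
    {"user_id": 1, "spend": 100},
    {"user_id": 2, "spend": 50},
]
user_roles = [
    {"user_id": 1, "role": "admin"},
    {"user_id": 1, "role": "editor"},
    {"user_id": 2, "role": "editor"},
]

# Naive join: one output row per (user, role) pair, so user 1 appears twice.
joined = [
    {**u, **ur}
    for u in users
    for ur in user_roles
    if u["user_id"] == ur["user_id"]
]

naive_total = sum(row["spend"] for row in joined)  # 100 + 100 + 50
true_total = sum(u["spend"] for u in users)        # 100 + 50

print(naive_total, true_total)
```

With explicit one-to-many and many-to-one relationships, the side on which rows multiply is declared in the model, so this kind of inflation can be detected instead of silently producing a wrong total.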
Key Components
Tables
Tables define semantic metadata for physical database tables:
- Dimensions: Categorical fields for grouping
- Measures: Aggregatable metrics
- Cost: Query optimization hints
- Partitions: Data availability constraints
Fields
Fields are the building blocks of your semantic model:
- Dimensions: Attributes for grouping and filtering
- Measures: Metrics for aggregation
- Data Types: string, integer, decimal, date, etc.
- Field format: Currency, percent, dates, and display formatting
Expressions
Expressions define how fields query the database:
- SQL: Direct SQL expressions
- Lookups: Dimension field indicators
- Arrays: Multi-value fields
- Primary Keys: Unique identifiers
Relationships
Relationships define how tables join:
- Cardinality: one-to-one, one-to-many, many-to-one
- Join Types: inner (default), left, right
- Join Conditions: SQL expressions
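As a rough picture of how a join type and a join condition combine, here is a minimal Python sketch that renders a SQL join clause, defaulting to an inner join as described above. The `join_clause` helper and the table and column names are hypothetical. How Strata actually generates SQL is not specified here.

```python
ALLOWED_JOIN_TYPES = {"inner", "left", "right"}

def join_clause(right_table: str, condition: str, join_type: str = "inner") -> str:
    """Render a SQL JOIN clause; join_type defaults to inner."""
    if join_type not in ALLOWED_JOIN_TYPES:
        raise ValueError(f"Unsupported join type: {join_type}")
    return f"{join_type.upper()} JOIN {right_table} ON {condition}"

# Hypothetical tables and condition, for illustration only.
print(join_clause("orders", "users.id = orders.user_id"))
print(join_clause("orders", "users.id = orders.user_id", join_type="left"))
```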
How It Works
- Model Definition: Data engineers define tables and relationships in YAML
- Deployment: Models are deployed to the Strata server
- Universe Formation: Planner generates all possible query paths
- Query Planning: User queries are converted to optimized SQL
- Execution: SQL runs against datasources
- Results: Formatted results are returned to users
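The universe formation step can be pictured as enumerating join paths over the relationship graph. The Python sketch below is one simple way to do that with a depth-first search, using the table names from the junction-table example. The `build_graph` and `join_paths` helpers are hypothetical, and Strata's actual planner may work quite differently.

```python
# Relationships from the junction-table example, treated as undirected edges.
relationships = [
    ("Users", "User Roles"),
    ("User Roles", "Roles"),
]

def build_graph(edges):
    """Build an adjacency map: table -> set of directly related tables."""
    graph = {}
    for left, right in edges:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph

def join_paths(graph, start, goal, path=None):
    """Yield every simple path (no repeated table) from start to goal."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for neighbor in sorted(graph.get(start, ())):
        if neighbor not in path:
            yield from join_paths(graph, neighbor, goal, path)

graph = build_graph(relationships)
paths = list(join_paths(graph, "Users", "Roles"))
print(paths)
```

A query that touches fields from both Users and Roles would then be planned along one of the enumerated paths, here the single path through User Roles.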
Advanced (in-depth and extra)
The Advanced section covers deeper, optional semantic-model features: exclusions, inclusions, partitions, decorators, extended blending, cost optimization, and multi-datasource. The core concepts (tables, fields, expressions, relationships, imports) are described above; turn to Advanced for more specialized scenarios.
- Imports: Reuse field definitions across tables
- Partitions: Define data availability constraints
- Snapshot Measures: Point-in-time analysis
- Exclusions: Filter-based measure exclusions
- Inclusions: Multi-level calculations
- Decorators: Temporal, window, contribution functions
Next Steps
- Read about tables
- Explore dimensions and measures
- Learn about expressions
- Understand relationships
- Go deeper in Advanced