AML PreAggregate

PreAggregate maps your model's fields to columns in a pre-aggregated table, enabling Holistics to automatically use that table for matching queries instead of scanning the raw data.

PreAggregate objects live inside the pre_aggregates block of a Dataset. They can also be declared standalone (outside the dataset) for reuse with AML Extend.

For a conceptual introduction, see Aggregate Awareness.

Syntax

Dataset ecommerce {
  // ...

  pre_aggregates: {
    <name>: PreAggregate {
      dimension <dim_name> {
        for: r(model.field)
        time_granularity: 'day' // optional, for date/datetime fields
      }
      measure <measure_name> {
        for: r(model.field)
        aggregation_type: 'count'
      }
      persistence: FullPersistence { schema: 'persisted' } // or ExternalPersistence
    }
  }
}

Dimension mapping

Each dimension block maps a model field to a column in the pre-aggregated table. The dimension name must match the column name in the pre-aggregated table.

| Parameter | Requirement | Description |
| --- | --- | --- |
| for | required | Reference to the source model field using r(model.field) syntax. |
| time_granularity | optional | For date/datetime fields, the granularity this pre-aggregate was built at. Holistics uses this table for queries at that granularity or coarser. Accepted values: 'year', 'quarter', 'month', 'week', 'day', 'hour', 'minute'. |
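
As a minimal sketch of the two parameters above, the fragment below shows one date dimension and one non-date dimension. The orders model, its fields, and the column names order_month and country are hypothetical placeholders, not names from a real project.

```
// Fragment of a PreAggregate body. `orders` and its fields are hypothetical.

// Date field built at month granularity. Per the description above, this
// pre-aggregate can serve queries grouped by month, quarter, or year,
// but not by week, day, or anything finer.
dimension order_month {
  for: r(orders.created_at)
  time_granularity: 'month'
}

// Non-date field: time_granularity is omitted.
dimension country {
  for: r(orders.country)
}
```

Here the dimension names (order_month, country) are assumed to be the column names in the pre-aggregated table, while for points back to the source model field.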

Measure mapping

Each measure block maps an aggregation to a column in the pre-aggregated table. The measure name must match the column name in the pre-aggregated table.

| Parameter | Requirement | Description |
| --- | --- | --- |
| for | required | Reference to the source model field using r(model.field) syntax. |
| aggregation_type | required | The aggregation function used when building the pre-aggregated table. Accepted values: 'count', 'count_distinct', 'sum', 'avg', 'max', 'min'. |
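
The fragment below sketches a few measure blocks using different accepted aggregation_type values. The orders model, its fields, and the measure names are hypothetical and only illustrate the shape of the config.

```
// Fragment of a PreAggregate body. `orders` and its fields are hypothetical.

// Row count, stored in a column assumed to be named order_count.
measure order_count {
  for: r(orders.id)
  aggregation_type: 'count'
}

// Distinct customers, stored in a column assumed to be named unique_customers.
measure unique_customers {
  for: r(orders.customer_id)
  aggregation_type: 'count_distinct'
}

// Largest single amount, stored in a column assumed to be named max_amount.
measure max_amount {
  for: r(orders.amount)
  aggregation_type: 'max'
}
```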

Persistence

The persistence parameter tells Holistics where the pre-aggregated table lives.

ExternalPersistence

Use this when the table already exists in your warehouse (built by dbt, Airflow, or any external process).

persistence: ExternalPersistence {
  table_name: 'your_schema.your_aggregated_table'
}

| Parameter | Description |
| --- | --- |
| table_name | Fully qualified table name in your data warehouse. |

Important: Dimension and measure names in your PreAggregate config must exactly match the column names in your external table.
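
To make the name-matching rule concrete, here is a sketch of a complete pre-aggregate backed by an external table. It assumes a hypothetical table analytics.agg_transactions_daily, built outside Holistics, whose columns are named created_at, status, and count_transactions. The table name and its columns are assumptions for illustration only.

```
Dataset ecommerce {
  models: [transactions]

  pre_aggregates: {
    agg_transactions_daily: PreAggregate {
      // Each dimension and measure name below is assumed to be
      // a column of the external table.
      dimension created_at {
        for: r(transactions.created_at)
        time_granularity: 'day'
      }
      dimension status {
        for: r(transactions.status)
      }
      measure count_transactions {
        for: r(transactions.id)
        aggregation_type: 'count'
      }
      persistence: ExternalPersistence {
        // Hypothetical table maintained by dbt, Airflow, or similar.
        table_name: 'analytics.agg_transactions_daily'
      }
    }
  }
}
```

If the external process names a column differently, for example txn_count instead of count_transactions, the measure block would need to be renamed to match.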

FullPersistence and IncrementalPersistence

Use these to let Holistics create and refresh the table automatically. They accept the same parameters as model persistence.

persistence: FullPersistence {
  schema: 'persisted'
}

// or for incremental refresh:
persistence: IncrementalPersistence {
  schema: 'persisted'
  incremental_column: 'created_at'
}

See Pre-aggregate Persistence for setup instructions and scheduling.

Example

Dataset ecommerce {
  models: [transactions]

  pre_aggregates: {
    agg_transactions_daily: PreAggregate {
      dimension created_at {
        for: r(transactions.created_at)
        time_granularity: 'day'
      }
      dimension status {
        for: r(transactions.status)
      }
      dimension country {
        for: r(transactions.country)
      }
      measure count_transactions {
        for: r(transactions.id)
        aggregation_type: 'count'
      }
      measure total_revenue {
        for: r(transactions.revenue)
        aggregation_type: 'sum'
      }
      persistence: FullPersistence {
        schema: 'persisted'
      }
    }
  }
}

Standalone declaration with Extend

PreAggregate can be declared outside of a Dataset as a named object, then extended to create variants. This avoids repeating shared measures and persistence across multiple granularities.

PreAggregate agg_base {
  measure count_transactions {
    for: r(transactions.id)
    aggregation_type: 'count'
  }
  persistence: IncrementalPersistence {
    schema: 'persisted'
    incremental_column: 'created_at'
  }
}

Dataset ecommerce {
  pre_aggregates: {
    agg_daily: agg_base.extend({
      dimension created_at { for: r(transactions.created_at), time_granularity: 'day' }
    }),
    agg_monthly: agg_base.extend({
      dimension created_at { for: r(transactions.created_at), time_granularity: 'month' }
    })
  }
}

See Build multiple pre-aggregates for a full walkthrough.
