Skip to main content

Controlling filter and grouping paths

Introduction

When you explore data in Holistics, using a dimension with a metric means the metric gets grouped or sliced by that dimension. If the dimension and metric come from different models, Holistics uses the relationships between them to generate the proper joins alongside the grouping.

As long as a relationship path exists between two models, Holistics will find a way to join them. This is usually what you want, but sometimes combining a dimension and metric from different models doesn't make analytical sense.

This document explains how to control the filtering and grouping behavior between models.

How to control filter and grouping behavior

Relationships in a dataset have a property called filter_direction that controls which direction filters and groupings can flow between two models.

ecommerce.dataset.aml
Dataset ecommerce {
...

models: [dim_users, fact_orders]

relationships: [
relationship(fact_orders.user_id > dim_users.id, true, 'one_way')
// ^^^^^^^^
// 'one_way' or 'two_way'
// defaults to 'two_way' if not specified
]
}

Available values

ValueBehaviorWhen to use
one_wayFilters and groupings flow only from the "one" side (dimension) to the "many" side (fact).Standard dimension to fact relationships.

Use this as your default for
* Star schema or
* Galaxy schema.
two_wayFilters and groupings can flow in both directions. This is the default if not specified.* Many-to-many relationships
* 1:1 relationships
Default behavior

If you don't specify filter_direction, it defaults to two_way.

How it works

Given a relationship fact_orders.user_id > dim_users.id:

  • With one_way: You can group fact_orders metrics by dim_users dimensions, but not the reverse. The dimension can filter and segment the fact, but the fact cannot reach back to filter or segment the dimension.
  • With two_way (default): You can group in both directions. This allows more flexibility but can create unintended join paths in complex schemas.

Example use cases

When to use one_way

The most common use case for one_way is preventing invalid metric and dimension combinations in multi-fact setups like galaxy schemas. When you have multiple fact tables sharing common dimensions, bidirectional relationships can create unintended join paths that produce misleading results.

For example, in an e-commerce dataset with fact_orders and fact_inventory both connected to dim_products, you want to ensure that inventory metrics can only be grouped by product dimensions, not by user or order dimensions.

ecommerce.dataset.aml
Dataset ecommerce {
...

models: [
dim_users, dim_products, dim_cities,
dim_merchants, dim_categories,
fact_orders, fact_inventory
]

relationships: [
relationship(fact_orders.user_id > dim_users.id, true, 'one_way'),
relationship(fact_orders.product_id > dim_products.id, true, 'one_way'),
relationship(fact_inventory.product_id > dim_products.id, true, 'one_way'),
relationship(dim_users.city_id > dim_cities.id, true, 'one_way'),
relationship(dim_products.merchant_id > dim_merchants.id, true, 'one_way'),
relationship(dim_products.category_id > dim_categories.id, true, 'one_way')
]
}

For a detailed walkthrough of this scenario, see Controlling which dimensions can be used with a metric.

When to use two_way

Use two_way when you genuinely need filters and groupings to flow in both directions between two models. Common scenarios include:

Dimension to dimension analysis

When you need to analyze one dimension filtered by another dimension, traversing through a fact table, you may need two_way to allow the filter to flow in both directions.

Consider this model: dim_users → fact_orders ← dim_products. If you want to answer questions like "How many unique products (from a specific category) were purchased by each user age group?", the query needs to traverse from dim_users through the fact tables to reach dim_products.

If all relationships are set to one_way, the filter flows from dim_users to fact_orders, but stops there. It cannot continue to dim_products because that would require flowing in the reverse direction. To enable this traversal, you would need to set the relationship between fact_orders and dim_products to two_way.

ecommerce.dataset.aml
Dataset ecommerce {
...
models: [dim_users, fact_orders, dim_products]

relationships: [
relationship(fact_orders.user_id > dim_users.id, true, 'one_way'),
relationship(fact_orders.product_id > dim_products.id, true, 'two_way')
]
}

1:1 relationships

When two models have a true one-to-one relationship, there's no risk of data fan-out in either direction, so two_way is appropriate.

For example, if each user can only be the admin of one merchant, and each merchant can only have one admin user, the relationship between dim_users and dim_merchants is 1:1. Grouping merchants by user attributes or users by merchant attributes will never inflate the results.

note

For 1:1 relationships, filter_direction is always two_way and cannot be changed to one_way.

Overriding filter direction at the metric level

Sometimes you want to keep one_way as the default for safety, but allow specific metrics to traverse in the reverse direction. You can do this using with_relationships() to override the filter direction for individual metrics.

For example, say you want to answer: "What's the latest user sign-up date for each order status?" This query needs to go from fact_orders.status to dim_users.sign_up_date, which is the reverse of the normal dimension to fact flow. If the relationship is set to one_way, this query would be blocked.

Instead of changing the dataset relationship to two_way (which would affect all queries), you can override it just for this metric:

ecommerce.dataset.aml
Dataset ecommerce {
// ... models and relationships ...

metric latest_user_signup_by_order_status {
label: "Latest User Sign-up Date"
definition: @aql
max(dim_users.sign_up_date)
| with_relationships(
relationship(fact_orders.user_id > dim_users.id, true, 'two_way')
)
;;
}
}

This approach keeps the default one_way protection for all other queries while allowing this specific metric to use bidirectional traversal.


Let us know what you think about this document :)