Skip to main content

Controlling Which Dimensions Can Be Used With a Metric

Introduction

In a well-designed analytical setup, each metric should only be broken down by dimensions that are semantically relevant to it. While many fields may be technically joinable, only a subset of those combinations preserve the intended meaning of a metric.

This document shows how you can explicitly control which dimensions can be used with each metric in Holistics.

For example, in a typical multi-fact (galaxy schema) setup shown below, order-related metrics and inventory-related metrics each have their own distinct set of compatible dimensions.

Problem: Invalid metric breakdowns

In complex modeling setup, it’s possible to run queries that are technically valid but analytically misleading.

In Holistics, these cases often show up as metric breakdowns that return results successfully, yet no longer represent a meaningful business concept. This typically happens when a metric is segmented by dimensions that are outside the context of the fact it belongs to.

For example, inventory-related metrics are not tied to individual orders or users. Breaking them down by order, user, or city attributes therefore does not make sense from a business perspective. However, in a multi-fact setup where fact tables share common dimensions, such combinations can still be queried and will return results.

Multi-fact model where querying a metric from one fact model and a dimension to other fact model

Imagine a user asking: "What is the total inventory quantity available by user email?". This question doesn't make sense, as inventory exists at the product level, not the user level. Yet with the default relationship setup, they can drag Total Quantity Available (from fact_inventory) and User Email (from dim_users) into a report, and the system will return results.

The most dangerous aspect of these queries is that they fail silently. Because the query executes successfully and produces plausible-looking output, end users may assume the results are correct, even though the breakdown itself is fundamentally invalid.

Why this happens: Bidirectional relationship

In Holistics, a dimension from one model can be combined with a metric from another model as long as there's a path of relationships connecting them, regardless of how many models sit in between.

Take dim_users and fact_inventory as an example. A path exists between them:

dim_users → fact_orders → dim_products → fact_inventory

Because this path exists, the system allows you to break down inventory metrics by any dimensions from dim_users.

This is technically valid, but analytically wrong.

By default, relationships in Holistics are bidirectional, so filters and groupings can flow in either direction. This lets the system traverse from fact_orders through dim_products into fact_inventory, creating an unintentional cross-fact path.

The issue is that inventory isn't tied to users. One product can have multiple inventory records (across warehouses, for example), so grouping inventory by user inflates the results. The query runs successfully and the numbers look plausible, but they're misleading.

The Solution: Single-directional relationships

To prevent invalid cross-fact paths, you can control the direction in which filters and groupings flow between models using the filter_direction property on relationships.

Setting filter_direction to one_way allows dimensions to filter and group metrics in fact models, but blocks the reverse. This means dim_products can filter fact_inventory, but fact_orders cannot traverse through dim_products to reach fact_inventory.

The guiding principle: dimensions describe facts, not the other way around.

Applying This to a Multi-Fact Example

Consider an e-commerce dataset with two fact tables:

  • fact_orders
  • fact_inventory

And shared dimensions like users, products, merchants, and categories.

To fix this, add 'one_way' to all dimension-to-fact and dimension-chain relationships. By default, relationships have no filter_direction specified, which means they default to two_way. Adding 'one_way' restricts the filter flow to one direction only.

ecommerce.dataset.aml
Dataset ecommerce {
...

models: [
dim_users, dim_products, dim_cities,
dim_merchants, dim_categories,
fact_orders, fact_inventory
]

relationships: [
relationship(fact_orders.user_id > dim_users.id, true, 'one_way'),
relationship(fact_orders.product_id > dim_products.id, true, 'one_way'),
relationship(fact_inventory.product_id > dim_products.id, true, 'one_way'),
relationship(dim_users.city_id > dim_cities.id, true, 'one_way'),
relationship(dim_products.merchant_id > dim_merchants.id, true, 'one_way'),
relationship(dim_products.category_id > dim_categories.id, true, 'one_way')
]
}

With 'one_way' set on all relationships:

  • fact_orders metrics can only be grouped by dimensions that directly describe orders
  • fact_inventory metrics can only be grouped by dimensions that directly describe inventory
  • No accidental paths exist between fact tables

As a result:

  • ✅ Valid breakdowns continue to work
  • ❌ Invalid breakdowns are blocked by design
  • ❌ Cross-fact joins are no longer possible

This turns semantic correctness into a default guarantee, not a guideline.


Let us know what you think about this document :)