Order of Operations
Sometimes you expect operations to be carried in a particular order, but the Order Of Operations dictates a different order which may produce unexpected results. We use operations to refer to actions such as applying filters, calculating dimension/metrics that are performed by Holistics as you explore data or build reports.
This article presents the concept of Order Of Operations, walks you through a particular scenario where it may lead to unexpected results, and how you can get the correct results.
Order of Operations
Here's a high-level overview of the Order of Operations. This sequence of operations is in action whenever you retrieve data or perform calculations.
- Create model Common Table Expressions.
- Apply query params (for query models only).
- Execute AQL dim (if AQL dim is used).
- Ignore excluded dimensions from filters.
- Apply filters to dimensions that are not excluded.
- Execute aggregation/metric logic.
- Apply any filters on measures/metrics (created via UI).
Let's walk through an example to demonstrate how this concept instantiates.
Example: total orders per Category for a specific Merchant
Let's say that you run an Ecommerce store with the following data setup:
You want to calculate a Total Orders metric that's grouped by Category and Merchant. You also want to compare it against the Total Orders for each Category across all Merchants.
- Total Orders is a simple
count()
. - Total Orders for a given Category independently of any Merchants: use the
of_all
LOD function to ignore the Merchant dimension when performingcount()
.
// Total Orders
count(orders.id)
// Total Orders Per Category
count(orders.id) | of_all(merchants.name)
// this ignores "Merchant name" dimension being used in the exploration
The Total Orders metric is grouped automatically by Category Name and Merchant Name (the dimensions that we are using in this exploration). The Total Orders Per Category metric is grouped by Category Name, but ignoring Merchant Name as dictated by of_all
.
Problem: Filters on Merchant name dimension still show empty Merchant data
Let's say you want to see data only for a particular Merchant named Abernathy Group
, and you apply a filter on the Merchant Name dimension to do so. You may expect Holistics to only display rows for that particular Merchant, but the result also shows categories with empty Merchant names.
What happened? Here, we see the Order of Operations concept in action.
- Create model CTE.
- Apply query params (for query models only).
- Execute AQL dim (if AQL dim is used).
- Ignore excluded dimensions from filters: This step is ignoring the filter on the dimension Merchant Name, because the Total Order Per Category metric is excluding the Merchant Name dimension (using the
of_all
function). - Apply filters to dimensions that are not excluded.
- Execute aggregation/metric logic.
- Apply any filters on measures/metrics (created via UI).
If an aggregation contains LOD functions that exclude certain dimensions (such as of_all
being used on Merchant Name), then any filter on these dimensions will be ignored as well.
This explains why you still see categories displayed even though there is no Merchant named “Abernathy Group” that sells anything in these categories.
What does the generated query roughly look like? Step 4 explains why we still see categories for which no merchants named Abernathy Group
is present.
-- Step 1: Get orders specifically for "Abernathy Group"
WITH orders_for_abernathy AS (
SELECT orders.id AS order_id
FROM ecommerce.order_items
LEFT JOIN ecommerce.orders ON order_items.order_id = orders.id
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.merchants ON products.merchant_id = merchants.id
WHERE merchants.name = 'Abernathy Group'
GROUP BY orders.id
),
-- Step 2: Get order items and categories specifically for "Abernathy Group"
abernathy_categories AS (
SELECT order_items.order_id,
categories.name AS category_name,
merchants.name AS merchant_name
FROM ecommerce.order_items
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.categories ON products.category_id = categories.id
LEFT JOIN ecommerce.merchants ON products.merchant_id = merchants.id
WHERE merchants.name = 'Abernathy Group'
GROUP BY order_items.order_id, categories.name, merchants.name
),
-- Step 3: Count orders by Category specifically for "Abernathy Group"
category_orders_abernathy AS (
SELECT abernathy_categories.category_name,
abernathy_categories.merchant_name,
COUNT(orders.id) AS order_count
FROM ecommerce.orders
INNER JOIN orders_for_abernathy ON orders.id = orders_for_abernathy.order_id
LEFT JOIN abernathy_categories ON orders.id = abernathy_categories.order_id
GROUP BY abernathy_categories.category_name, abernathy_categories.merchant_name
),
-- Step 4: Count all orders per Category, regardless of Merchant
all_category_orders AS (
SELECT categories.name AS category_name,
COUNT(orders.id) AS total_orders
FROM ecommerce.order_items
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.categories ON products.category_id = categories.id
LEFT JOIN ecommerce.orders ON order_items.order_id = orders.id
GROUP BY categories.name
)
-- Final Step: Combine counts for "Abernathy Group" and all merchants
SELECT COALESCE(category_orders_abernathy.category_name, all_category_orders.category_name) AS category_name,
category_orders_abernathy.merchant_name,
MAX(category_orders_abernathy.order_count) AS total_orders_abernathy,
MAX(all_category_orders.total_orders) AS total_orders_all_merchants
FROM category_orders_abernathy
FULL JOIN all_category_orders ON category_orders_abernathy.category_name = all_category_orders.category_name
GROUP BY category_name, category_orders_abernathy.merchant_name
LIMIT 10000;
Solution 1: Create a metric and apply filters on that metric
How to leverage the Order of Operations concept to achieve what we want? Instead of using a filter on dimensions, we can create a new metric that simply refers to the Merchant Name, and apply a filter on this metric instead.
This produces the correct result because the Order of Operations dictates that filters on measures or metrics get applied to the final result.
- Create model CTE.
- Apply query params (for query models only).
- Execute AQL dim (if AQL dim is used).
- Ignore excluded dimensions from filters.
- Apply filters to dimensions that are not excluded.
- Execute aggregation.
- Apply any filters on measures/metrics (created via UI): This applies filters on the metric "Merchant Name Metric", which filters out rows at which the metric does not evaluate to "Abernathy Group".
Here’s roughly the generated SQL. Notice that the filter on the metric is added at the end of the query.
-- Step 1: List order items with associated categories and merchants
WITH all_order_items AS (
SELECT
order_items.order_id,
categories.name AS category_name,
merchants.name AS merchant_name
FROM
ecommerce.order_items
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.categories ON products.category_id = categories.id
LEFT JOIN ecommerce.merchants ON products.merchant_id = merchants.id
GROUP BY order_items.order_id, categories.name, merchants.name
),
-- Step 2: Count orders by Category and Merchant
category_order_counts AS (
SELECT
all_order_items.category_name,
all_order_items.merchant_name,
COUNT(orders.id) AS order_count
FROM
ecommerce.orders
LEFT JOIN all_order_items ON orders.id = all_order_items.order_id
GROUP BY all_order_items.category_name, all_order_items.merchant_name
),
-- Step 3: Count total orders by Category without Merchant filtering
total_orders_by_category AS (
SELECT
categories.name AS category_name,
COUNT(orders.id) AS total_order_count
FROM
ecommerce.order_items
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.categories ON products.category_id = categories.id
LEFT JOIN ecommerce.orders ON order_items.order_id = orders.id
GROUP BY categories.name
),
-- Step 4: Combine "Abernathy Group" specific data with all merchants data
combined_counts AS (
SELECT
COALESCE(category_order_counts.category_name, total_orders_by_category.category_name) AS category_name,
category_order_counts.merchant_name,
MAX(category_order_counts.order_count) AS total_orders,
MAX(total_orders_by_category.total_order_count) AS total_users_per_category,
category_order_counts.merchant_name AS merchant_name_metric
FROM
category_order_counts
FULL JOIN total_orders_by_category ON category_order_counts.category_name = total_orders_by_category.category_name
GROUP BY category_name, category_order_counts.merchant_name
)
-- Final selection filtered for "Abernathy Group" in Merchant Name Metric
SELECT
combined_counts.category_name,
combined_counts.merchant_name AS merchant_name,
combined_counts.total_orders,
combined_counts.total_users_per_category,
combined_counts.merchant_name_metric
FROM
combined_counts
WHERE
combined_counts.merchant_name_metric = 'Abernathy Group'
LIMIT 10000;
Solution 2: Incorporate filter into the metric
If you don't want to create a separate metric, then you can rewrite the Total Orders Per Category as followed:
case(
when: merchants.name == 'Abernathy Group'
, then: count(orders.id) | of_all(merchants.name)
, else: null)
This AQL expression calculates total orders only for Merchant with the specified name. For Merchants with other names, it'll return null. Finally, you can apply a filter on this metric to specify that it must be not null.
Here’s the concept of Order of Operations in action again:
- Create model CTE.
- Apply query params (for query models only).
- Execute AQL dim (if AQL dim is used).
- Ignore excluded dimensions from filters.
- Apply filters to dimensions that are not excluded.
- Execute aggregation/metric. This step calculates the metric Total Orders Per Category, and returns a number only for merchants with name "Abernathy Group". Otherwise, it will return null.
- Apply any filters on measures/metrics (created via UI): This step filters out rows in which the metric is evaluated to null.
Similarly, here's roughly the generated query.
-- Step 1: List order items with associated categories and merchants
WITH all_order_items AS (
SELECT
order_items.order_id,
categories.name AS category_name,
merchants.name AS merchant_name
FROM
ecommerce.order_items
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.categories ON products.category_id = categories.id
LEFT JOIN ecommerce.merchants ON products.merchant_id = merchants.id
GROUP BY order_items.order_id, categories.name, merchants.name
),
-- Step 2: Count orders by Category and Merchant
category_order_counts AS (
SELECT
all_order_items.category_name,
all_order_items.merchant_name,
COUNT(orders.id) AS order_count
FROM
ecommerce.orders
LEFT JOIN all_order_items ON orders.id = all_order_items.order_id
GROUP BY all_order_items.category_name, all_order_items.merchant_name
),
-- Step 3: Count total orders by Category without Merchant filtering
total_orders_by_category AS (
SELECT
categories.name AS category_name,
COUNT(orders.id) AS total_order_count
FROM
ecommerce.order_items
LEFT JOIN ecommerce.products ON order_items.product_id = products.id
LEFT JOIN ecommerce.categories ON products.category_id = categories.id
LEFT JOIN ecommerce.orders ON order_items.order_id = orders.id
GROUP BY categories.name
),
-- Step 4: Combine counts for "Abernathy Group" and all merchants, setting total_users_per_cate only if Merchant is "Abernathy Group"
combined_counts AS (
SELECT
COALESCE(category_order_counts.category_name, total_orders_by_category.category_name) AS category_name,
category_order_counts.merchant_name,
MAX(category_order_counts.order_count) AS total_orders,
CASE
WHEN category_order_counts.merchant_name = 'Abernathy Group' THEN MAX(total_orders_by_category.total_order_count)
ELSE NULL
END AS total_users_per_cate
FROM
category_order_counts
FULL JOIN total_orders_by_category ON category_order_counts.category_name = total_orders_by_category.category_name
GROUP BY category_name, category_order_counts.merchant_name
)
-- Final selection filtered to show only rows where "Abernathy Group" has total_users_per_cate
SELECT
combined_counts.category_name,
combined_counts.merchant_name,
combined_counts.total_orders,
combined_counts.total_users_per_cate
FROM
combined_counts
WHERE
combined_counts.total_users_per_cate IS NOT NULL
LIMIT 10000;
This query first calculates the Total Order metric and returns a number only for the specified Merchants, otherwise it returns null. The condition at the final query filters out rows where Total Order metric is null.
FAQ
Q: Are filters on dimensions and metrics executed at the same time because they're defined in the same place on UI?
No, filters on dimensions are applied before aggregation is calculated, and filters on metrics or measures are applied within the aggregation itself, and on the final result. While they are defined in the same place in UI, they are executed at different times.
Q: Is there a way to filter data before any dimension is computed?
Yes, you can use Query Model and set up Query Params to filter data before any dimension or metric is computed.