Skip to main content

where vs. filter

Knowledge Checkpoint

This documentation assumes that you are familiar with the following functions:

In AQL, we have the where() and the filter() functions that perform roughly the same function (apply a filtering condition onto an object), so you may wonder which one to use in a particular situation. In this document, we will explain when and where you should use which function.

Quick comparison

A quick rule-of-thumb to follow when deciding between where() and filter():

Function
Target
Valid filtering conditionWhen to use
where()Measure
  • dimension operator value. E.g. orders.status == 'delivered', orders.created_at matches @(last 7 days)
  • dimension in [value1, value2, ...]. E.g. orders.status in ['delivered', 'cancelled']
  • dimension in table. E.g. orders.status in unique(orders.status)
  • dimension operator measure. E.g. users.age > avg(users.age)
Apply filters to a Measure
filter()TableAny expression that returns truefalse and is valid in the context of the table rowFilter a Table

Examples

Suppose that you have an Ecommerce dataset with the following models:

Model orders {
dimension id {}
dimension status {}
dimension country {}
dimension user_id {}
dimension value {}
}

In the examples below, we will demonstrate the difference on how where() and filter() are used.

Only where() can be used

Suppose you have defined the total_orders measure inside orders model:

Model orders {
...

measure total_orders {
definition: @aql count(orders.id)
}
}

Now you want to calculate “the total orders which are delivered.” You can use where() to apply filter to the measure:

orders.total_orders | where(orders.status == 'delivered') // valid expression

orders.total_orders | filter(orders.status == 'delivered') // invalid expression

By definition, filter() cannot be used here.

Only filter() can be used

filter() can be used when you want to filter by arbitrary expression that haven't been defined as a dimension yet. For example, you want to know which country has at least 100,000 users who have placed at least 1 order:

orders
| group(orders.country)
| select(orders.country, users_count: count_distinct(orders.user_id))
| filter(users_count >= 100000) // where here will be invalid

In this case, where() cannot be used in place of filter(), because by definition, where() have to filter on an existing dimension of orders, while the field users_count is an aggregation that has not been defined and calculated before.


Let us know what you think about this document :)