Moving Average
A grasp of these concepts will help you understand this documentation better:
Introduction
Moving Average is a powerful analytical tool used to help identifying trends in time series data. By averaging the values of a variable over a rolling window, we can smooth out fluctuations and help the actual trend stands out better.
In this tutorial, we will create a report that visualizes both the monthly revenue and the 3-month-moving average revenue of an E-commerce business using the trailing_period()
and window_avg()
functions in Holistics.
Setup
In this guide, we will use the familiar e_commerce
dataset:
- Initial Setup
- Final Setup
// orders.model.aml
Model orders {
...
dimension id {...}
dimension user_id {...}
dimension created_at {...}
}
// order_items.model.aml
Model order_items.model.aml {
...
dimension order_id {...}
dimension product_id {...}
dimension quantity {...}
}
// products.model.aml
Model products {
...
dimension id {...}
dimension price {...}
dimension categor_id {...}
}
// e_commerce.dataset.aml
Dataset e_commerce {
...
models: [orders, order_items, products]
relationships: [
relationship(order_items.order_id > orders.id, true),
relationship(order_items.product_id > products.id, true)
]
}
// orders.model.aml
Model orders {
...
dimension id {...}
dimension user_id {...}
dimension created_at {...}
}
// order_items.model.aml
Model order_items.model.aml {
...
dimension order_id {...}
dimension product_id {...}
dimension quantity {...}
}
// products.model.aml
Model products {
...
dimension id {...}
dimension price {...}
dimension categor_id {...}
}
// e_commerce.dataset.aml
Dataset e_commerce {
...
models: [orders, order_items, products]
relationships: [
relationship(order_items.order_id > orders.id, true),
relationship(order_items.product_id > products.id, true)
]
metric revenue {
label: 'Revenue'
type: 'number'
definition: @aql coalesce(order_items | sum(order_items.quantity * ecommerce_products.price), 0);;
}
// Use trailing_period
metric moving_avg_revenue_3m_tp {
label: 'trailing_period - 3M Moving Avg. Revenue'
type: 'number'
definition: @aql trailing_period(revenue, ecommerce_orders.created_at, interval(3 months)) / 3;;
}
// Use window_avg
metric moving_avg_revenue_3m_wa {
label: 'window_avg - 3M Moving Avg. Revenue'
type: 'number'
definition: @aql window_avg(revenue, -2..0, order: ecommerce_orders.created_at | month());;
}
}
High-level Flow
- Create a normal aggregation metric: We will create a
revenue
metric in the dataset which will be used later in the definition of the moving metric. - Create a moving average metric:
-
Using the
trailing_period()
Function: This function turns therevenue
metric into a moving aggregation that sums revenues over intervals of 3 months. The sum is then divided by 3 to obtain the average value. -
Using the
window_avg()
Function: This function performs a secondary average over the calculatedrevenue
metric. We must ensures that therevenue
metric is sliced (grouped) and ordered by the Created Month dimension, with a frame of previous 2 rows and the current row.
- Visualize
The key difference between trailing_period()
and window_avg()
is that, trailing_period()
works at the time-series level, and ensures that the aggregation is done over a specified time window, regardless the table rows. You can see more details about how it works here.
On the other hand, window_avg()
works at the table rows level, and we must ensure that the table rows contain the correct data for the secondary aggregation to be meaningful. You can see more details about how it works here.
Thus,trailing_period
can handle gap in data. A trailing period measure of 2 months will use the calendar months to calculate the moving average. On the other hand, the window_avg function will use the previous row and the current row to calculate the moving average.
For example, in the following table with gaps in data, the trailing_period
function will return different results than the window_avg
function:
Month | Revenue | trailing_period | window_avg |
---|---|---|---|
Jan | 100 | 100 | 100 |
Mar | 100 | 50 | 100 |
Jun | 100 | 50 | 100 |
Another difference is that trailing_period
can handle the case where the time dimension are in different grains (e.g. month vs day), while window_avg
requires the time dimension to be in the same grain for the calculation to be meaningful.
trailing_period
are also calculated before visual filters are applied, meaning if you data was filtered to only show data from 2022, the average for the first 2 month would still be calculated over the full 3 months if available. window_avg
on the other hand will only calculate the average over the rows that are visible in the table.