Skip to main content

Create Cohort Retention

Knowledge Checkpoint

A grasp of these concepts will help you understand this documentation better:

Introduction

Cohort Retention Analysis analyzes the continued engagement of specific user groups over time, helping businesses understand and improve customer loyalty and long-term user activity.

We will provide a step-by-step walkthrough on how to build a report in Holistics that demonstrates the classic cohort retention.

Video Tutorial

Setup

In order to gain insights into the lifetime value of groups of users who were acquired for each particular period of time, our objective is to define and analyze Acquisition Cohorts.

For this tutorial, we will be using an e-commerce dataset that consists of two models: orders and users.

Before diving into the implementation, let's take a quick look at our dataset setup.

// orders.model.aml
Model orders {
...
dimension id {...}
dimension user_id {}
dimension created_at {}
}
// users.model.aml
Model users {
...
dimension id {...}
dimension name {...}
}

// e_commerce.dataset.aml
Dataset e_commerce {
...
models: [orders, users]
relationships: [
relationship(orders.user_id > users.id, true)
]

}

Sample data

High-level flow

  1. Define Acquisition Cohort: Determine the period and timeframe when each customer was acquired. This period could be daily, monthly, yearly, or any other suitable duration based on your business needs.
  2. Define the metric: Define the metric you want to observe for each cohort
  3. Define the retention logic: How many users of the cohort are still active in subsequent months

Implementation

1. Define Acquisition Cohort

Let’s define the acquisition cohort as the month that users made their first purchase.

Since 1 user can make multiple purchases on different days, to find each user's first order date, obtain the earliest order created date (in the Orders model) and dimensionalize it by associating it with the User ID in the Users model. This can be done by using min() function and dimensionalize().

Dataset e_commerce {
(...)

dimension acquisition_month_cohort {
model: users
type: 'date'
label: 'Acquisition Month Cohort'
definition: @aql min(orders.created_at | month()) | dimensionalize(users.id);;
}
}

Acquisition Cohort

2. Define the Metrics (or the Cohort Size)

You will then be able to find out how many users in the Acquisition Cohort.

First, calculate the Total Number of Users by simply using count() function.

Dataset e_commerce {
(...)

dimension acquisition_month_cohort {...}

metric total_users {
label: 'Total Users'
type: 'number'
definition: @aql count(users.id);;
}

}

And then, to find out how many users are in this Acquisition Cohort, you will use the combination of dimension Acquisition Month Cohort with the metric Total Users

Total Users Cohort

3. Define the retention logic

info

What is considered retention:

  • If first-time user A made an order on Week 1, and returns to buy more stuffs the next weeks, she is a returned user.
  • If user B also made an order on Week 1 and does not return the following weeks, she's a bounced user, that basically means you lose her as a user.

First, calculate the Month since Acquisition by using the date_diff() function.

This dimension is necessary to determine how many users from a specific acquisition cohort return in subsequent months.

Dataset e_commerce {
(...)

dimension acquisition_month_cohort {...}

metric total_users {...}

dimension month_no {
model: orders
label: 'Month Number'
type: 'number'
definition: @aql date_diff('month', users.acquisition_month_cohort, (orders.created_at | month()));;
}

}

This can be best illustrated by using Pivot Table where

  • The Row is the Dimension Acquisition Month Cohort
  • The Column is the Dimension Month Number
  • The Values is the Metric Total Users

Retained Users

Month Number 0 represents the Cohort Size, which calculates the total number of users who made their first purchase in the Acquisition Month Cohort.

Month Numbers 1, 2, and 3 show how many users from that cohort continue to make purchases in the subsequent months 1, 2, and 3, respectively.

Finally, create a retention metric to measure the percentage of total users retained in each acquisition cohort for each month.

Dataset e_commerce {
(...)

dimension acquisition_month_cohort {...}

metric total_users {...}

dimension month_no {...}

metric retention {
label: 'Retention'
type: 'number'
definition: @aql (total_users*1.0) / (total_users | of_all(orders.month_no));;
}

}

Percent Cohort Retention


We have covered all the foundational concepts required to calculate metrics for the acquisition cohort, and you are now equipped to create even more powerful cohort analysis reports to present to your stakeholders in Holistics.


Let us know what you think about this document :)