
How to Calculate Dynamic Point-In-Time Metrics in Holistics

What is a Point-in-Time Metric?

A Point-in-Time (PIT) metric lets users understand a value as it stood at a particular time. When you run a Point-in-Time report for a specific date, you get a "snapshot" of what your data looked like on that date. The point in time can be the start of a period, the end of a period, or any point in between.

The Business Scenario

Imagine you track a metric value for entities that persist over a period of time. Each change in the value is recorded as a separate row, tagged with the start and end dates of the period during which that record was current.

| ID | Metric value | Valid from | Valid to |

Business requirement

Now, your business users want to summarize the metric value at a point in time (which can be the end of a month, a week, a day...). They also want PIT metrics to play nicely with dynamic date periods and ranges on interactive dashboards.

Here is a specific example:

In a reporting model for bank information, we have a table of credit contracts. Each contract has multiple components and has a start and maturity date.

Most commonly, banks want to see the outstanding balances of their customers at month end. In this case, the bank would run an End-of-month report that sums the outstanding balances of all customers on the last day of each month.

Getting into the details

Assume you work with this simple table pit_raw, which records each customer_id and its pipeline_value during its active period (between start_date and end_date).

Requirement:

  • Your business users want to see the total pipeline_value across all customer_id on a single date (the end of whatever period is selected), where that date falls between start_date and end_date.
  • They want to interact with a dynamic axis that can be set to week, month, quarter, or year, so the end of the period changes depending on the selection.

For example, you want something that looks like this, called "End of the period pipeline value". With the Metrics Sheet visualization, you can use the "Show metrics by" function, so the end of the period changes depending on what is selected there.

High-level Mechanism

The solution needs to be dynamic so you can aggregate and report the total value across all user_id on the last day of whatever period is selected (week, month, year...), without summing the values of every day within the range.

With all that in mind, here is one way to fit these pieces together using our SQL transformation models and some of Holistics' out-of-the-box visualization features:

  1. First, transform the original data from records with 2 dates into a time-dimensioned format from which the user can select a particular date.

    For each unique (user_id, pipeline_value) combination, you'll have one row for each day the entity was valid (between the combination's start_date and end_date); then you can simply aggregate over the date to get your answer.

  2. Set target logic using Transform model and Business Calculation in your dataset exploration.

    • Basically, you need a simple daily_sum model that aggregates the total value across all customer_id for each day. The model also needs additional columns that let you identify the value of the last day in whatever period is selected, by making that value the maximum possible within the period.
    • Create a dataset and use Business Calculations with aggregation of max to calculate the End-of-period sum of the total values.
  3. Use the Metrics Sheet visualization or other charts that support the Date-drill feature. These let you quickly change the time granularity (year, quarter, month, date) of reports.

Step-by-step Instructions

High-level Transformation Diagram

As this is not a simple transformation, we break it down into multiple steps with interim charts. The diagram below puts the steps together:

Step 1: A time spine for the metric

In this step, we'll focus on expanding each valid start-end date range into multiple rows, one for each date within the range, and storing the result in pit_prep (user_id, pipeline_value, valid_date).

A sample output table would look like this:

| user_id | pipeline_value | valid_date |
|---------|----------------|------------|
| A       | 500            | 2021-02-14 |
| A       | 500            | 2021-02-15 |
| A       | 500            | ...        |
| A       | 500            | 2021-06-04 |

To do that, we need to follow 2 main steps:

  • Generate a date series that contains all possible dates (between the minimum start_date and the maximum end_date in your table), using PostgreSQL's generate_series function.
  • Take the entities of the original table (user_id, pipeline_value) and join them to that date series on the condition that the date falls between the entity's start_date and end_date.

The query:

-- date range covering [min start_date, max end_date]
with date_range as (
  select
    generate_series(min({{ #a.start_date }}), max({{ #a.end_date }}), '1 day')::date as valid_date
  from
    {{ #pit_raw a }}
)

select
  {{ #a.customer_id }},
  {{ #a.pipeline_value }},
  date_range.valid_date
from
  date_range
  left join {{ #pit_raw a }}
    on date_range.valid_date >= {{ #a.start_date }}
    and date_range.valid_date <= {{ #a.end_date }}
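To make the expansion concrete, here is a minimal Python sketch of the same logic. The sample records and the helper function are hypothetical, purely for illustration; in Holistics this step is done by the SQL model above.

```python
from datetime import date, timedelta

# Hypothetical raw records: (customer_id, pipeline_value, start_date, end_date)
pit_raw = [
    ("A", 500, date(2021, 2, 14), date(2021, 2, 16)),
    ("B", 300, date(2021, 2, 15), date(2021, 2, 15)),
]

def expand_to_days(records):
    """Emit one (customer_id, pipeline_value, valid_date) row per active day."""
    rows = []
    for customer_id, value, start, end in records:
        day = start
        while day <= end:
            rows.append((customer_id, value, day))
            day += timedelta(days=1)
    return rows

pit_prep = expand_to_days(pit_raw)
# "A" is active for 3 days and "B" for 1 day, so pit_prep holds 4 rows
```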

Step 2: Create Daily Sum model

2.1. Create a SQL transform model daily_sum (valid_date, sum_value, row_number) which simply sums the total pipeline_value of all user_id for each day.

  • The additional column row_number enumerates the rows in ascending valid_date order. With this field, the max aggregate function gets us the row of the last day in each period.
select
  {{ #a.valid_date }},
  sum({{ #a.pipeline_value }}) as sum_value,
  row_number() over (order by {{ #a.valid_date }}) as row_number
from
  {{ #pit_prep a }}
group by 1
order by 1
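The same aggregation and row numbering can be sketched in plain Python (hypothetical sample rows; the SQL window function is mimicked by enumerating the sorted dates):

```python
from collections import defaultdict
from datetime import date

# Hypothetical pit_prep rows: (user_id, pipeline_value, valid_date)
pit_prep = [
    ("A", 500, date(2021, 2, 14)),
    ("A", 500, date(2021, 2, 15)),
    ("B", 300, date(2021, 2, 15)),
]

# Sum pipeline_value per day, mirroring the daily_sum model
totals = defaultdict(int)
for _, value, valid_date in pit_prep:
    totals[valid_date] += value

# row_number enumerates days in ascending date order,
# like row_number() over (order by valid_date) in SQL
daily_sum = [
    {"valid_date": d, "sum_value": totals[d], "row_number": i + 1}
    for i, d in enumerate(sorted(totals))
]
# Feb 14 sums to 500 (row 1); Feb 15 sums to 500 + 300 = 800 (row 2)
```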

2.2. In daily_sum model, create 2 custom dimensions (sum_plus_rnx10bn, rnx10bn) as below:

Model daily_sum {

  -- Base fields
  field valid_date date
  field sum_value integer
  field row_number integer

  -- Custom fields
  rnx10bn = "(row_number * 10000000000)"
  sum_plus_rnx10bn = "(row_number * 10000000000 + sum_value)"
}

Since you want to be able to swap in different levels of granularity for the End-of-period metric in your report, you need to make the last value in whatever period is selected the maximum possible value within that period.

Based on these 2 custom dimensions, create a new field with an aggregation of max in your report, which reduces the result to the last day's value in the selected period.

  • Note: 10 billion is an arbitrarily large multiplier chosen so that the row-number component always dominates the daily sum value; it works as long as sum_value never reaches 10 billion.

Step 3: Create Business Calculations for End-of-period metrics

With your data modeled properly, add the daily_sum model to the Point_in_time dataset and it will be ready to be explored.

Create a business calculation with an aggregation of max, called End of period value, which reduces the result to the total value of the last day in the period selected in the visualization.

End_of_period_value = max(sum_plus_rnx10bn) - max(rnx10bn)
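A quick sketch in Python, with made-up numbers, shows why this max-and-subtract trick recovers the last day's value: both maxes land on the row with the highest row_number in the period, so subtracting them cancels the offset and leaves that day's sum.

```python
OFFSET = 10_000_000_000  # must exceed any possible daily sum_value

# Hypothetical daily_sum rows for one selected period (e.g. the last days of a month)
period_rows = [
    {"valid_date": "2021-02-26", "sum_value": 700, "row_number": 57},
    {"valid_date": "2021-02-27", "sum_value": 650, "row_number": 58},
    {"valid_date": "2021-02-28", "sum_value": 800, "row_number": 59},
]

# Compute the two custom dimensions from Step 2.2
for r in period_rows:
    r["rnx10bn"] = r["row_number"] * OFFSET
    r["sum_plus_rnx10bn"] = r["row_number"] * OFFSET + r["sum_value"]

# max(sum_plus_rnx10bn) and max(rnx10bn) both come from the last day
# of the period, so their difference is that day's sum_value
end_of_period_value = (
    max(r["sum_plus_rnx10bn"] for r in period_rows)
    - max(r["rnx10bn"] for r in period_rows)
)
# -> 800, the sum on 2021-02-28 (the last day of the period)
```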

Final step: Building the chart

Now you can throw any of the End-of-period metrics and other aggregated metrics together in a report using the valid_date field as an axis.

To get them to play nicely with dynamic date periods and ranges for interactive dashboards, you can apply Metrics sheet visual or other charts that are enabled for Date-drill features.

  • With the Metrics sheet visual, you can use Show Metric by (month, quarter, year...) and Number of Columns in the visualization settings to change the aggregation period and the time range shown in the chart.
  • Alternatively, you can apply other charts that support the Date-drill feature, such as Line, Bar, Area, or Column charts.





Let us know what you think about this document :)