Holistics Expression is our propriety language that allows you to specify data definition, transformation, calculations,... in a database-agnostic way. In other words, it abstracts away the little differences between SQL flavors and lets you focus on what really matters: your data logic.
Why Holistics Expression is necessary
Firstly, in comparison to SQL, Holistics Expression serves more complex business use cases. For example, you can:
- Create a period-over-period growth percentage
- Analyze cumulative sum from the selected range
- Create custom fields combining data from multiple models instead of just the current model
- Apply Column Level Permission easily
- And so much more
Next, if you are working with multiple databases with different SQL dialects, Holistics Expression will help you to create dimension/measure flexibly with unified syntax that functions properly with all of your databases (and of course, with our system as well).
Where to use Holistics Expression
Holistics Expression aims to support both Data Analysts and Explorers to efficiently create calculations thus fulfilling their analytics needs. However, while Analysts work primarily in the modeling layer and prepare reusable measures/calculations, Explorers, on the other hand, view prepared reports and only explore or create calculations on-demand, so that Holistics has introduced two places where Holistics Expression can be created and used.
Business Calculations can be created when exploring Dataset or Reports and Explorers are its target users. You can imagine it's like calculating 2 or multiple fields in an Excel Sheet. Its formula can be adjusted easily right in Dataset Exploration View.
Model Field Expression
In contrast, Model Field Expression can only be created in a specific model and used when exploring the related model in Dataset. The formula of Model Field Expression cannot be modified in Dataset Exploration View.
How Holistics Expression works
Holistics Expression provides a unified syntax against different SQL dialects for defining any measure (metric) on top of a set of models inside a dataset. For a simple measure which only involves a single model, a measure expression work just like an aggregate function in SQL. Behind the scene, when you use a measure in a dataset, it's translated to a simple SQL expression and put into the final query.
With more complicated measures that involve more models, since we know what you are computing, which models and relationships are involved, we can dynamically prepare these dependencies in different parts of the final query with respect to correctness and performance.
Aggregators are functions that group values of multiple rows into a single summary value. They are equivalent to aggregation functions that SQL supports (SUM, COUNT, AVG, MAX, MIN,...).
Currently available functions:
|count(field)||Whole Number||Counts the total number of items in a group, not including NULL values.||COUNT(orders.id)|
|count_distinct(field)||Whole Number||Counts the total number of distinct items in a group, not including NULL values.||COUNTD(orders.id)|
|average(field)||Vary||Averages the values of items in a group, not including NULL values.||AVG(orders.id)|
|min(field)||Vary||Computes the item in the group with the smallest numeric value.||MIN(order_item.quantity)|
|max(field)||Vary||Computes the item in the group with the largest numeric value.||MAX(order_item.quantity)|
|sum(field)||Number||Sums the total number of items in a group, not including NULL values.||SUM(order_item.quantity)|
|median(field)||Number||Computes the median of an expression, which is the value that the values in the expression are below 50% of the time.||MEDIAN(order_item.quantity)|
|stdev(field)||Number||Returns the standard deviation (sample) of the column created by expression unless expression defines a column of lists, in which case returns the standard deviation (sample) of each list.|
|stdevp(field)||Number||Returns the standard deviation (population) of the column created by expression unless expression defines a column of lists, in which case returns the standard deviation (population) of each list.|
|var(field)||Number||Returns the variance (sample) of the column created by expression unless expression defines a column of lists, in which case returns the variance (sample) of each list.|
|varp(field)||Number||Returns the variance (population) of the column created by expression unless expression defines a column of lists, in which case returns the variance (population) of each list.|
Logical functions return value based on some logical conditions.
Currently available functions:
|case when||vary||The CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement).|
|and()||boolean||Logical AND compares between two Booleans as expression and returns true when both expressions are true...|
|or()||boolean||Logical OR compares two Booleans as expression and returns true when one of the expressions is true...|
|not()||boolean||not takes a single Boolean as an argument and invert it.|
|in()||boolean||in takes a field expression and a list of values. Return true if that list of values contains the value of that field expression.|
The CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.
AND compares between two Booleans as expression and returns true when both expressions are true...
Logical OR compares two Booleans as expression and returns true when one of the expressions is true...
not takes a single Boolean as an argument and invert it.
in takes a field expression and a list of values. Return true if that list of values contains the value of that field expression.
Filter Functions (Coming Soon)
The filter function is a list of filter expressions that are applied to a measure calculation. It only works with measure types that perform aggregation.
Time intelligence functions (Coming Soon)
This type of function is used to process date/time value and support both approaches:
- Truncate a TIMESTAMP or an INTERVAL value based on a specified date part e.g., hour, week, or month
- Extracts a subfield from a date or time value
|date_trunc(datetime, time_col: 'year', 'quarter', 'month', 'week', 'day', 'hour', 'minute')||datetime||Truncate a TIMESTAMP on secific date part|
|now()||datetime||Function that returns current time|
|day()||number||Extract the day from the same timestamp|
|month()||number||Extract the month from the same timestamp|
|year()||number||Extract the year from the same timestamp|
|quarter()||number||Extract the quarter from the same timestamp|
|week()||number||Extract the week from the same timestamp|
|hour()||number||Extract the hour from the same timestamp|
|minute()||number||Extract the minute from the same timestamp|
|week_day()||number||Return the day number of a specific date (within a week)|
|week_num()||number||Return the week number of a specific date (within a year)|
Holistics Expression is case sensitive
Question: Can I use
Answer: No, Since Holistics Expression is case sensitive and we don't support capitalized letters in our Expression so the exact needs to be followed
How to create calculation with only a subset of my current data
Question: How can I create a calculation with only a subset of my current data (using condition inside an aggregate function).
For example, from the eCommerce dataset, what if I want to calculate the total value from the delivered orders only (exclude all other cancelled and refunded orders)
Filter function inside an
Aggregate function is currently not supported, we recommend that at this moment, you can combine
measure function with
case when inside to calculate the data with any specific condition being applied.
How to handle error Divide by 0
Question: When doing division in Business Calculation (field_a/field_b), sometimes I encounter
division by zero error which is obviously because my Divisor = 0. How should I handle this case?
Answer: In this case, you should add conditional expression in your divisor to return NULL whenever it has the value of 0