Holistics Expression

Definition

Holistics Expression is our propriety language that allows you to specify data definition, transformation, calculations,... in a database-agnostic way. In other words, it abstracts away the little differences between SQL flavors and lets you focus on what really matters: your data logic.

Why Holistics Expression is necessary

Firstly, in comparison to SQL, Holistics Expression serves more complex business use cases. For example, you can:

  • Create a period-over-period growth percentage
  • Analyze cumulative sum from the selected range
  • Create custom fields combining data from multiple models instead of just the current model
  • Apply Column Level Permission easily
  • And so much more

Next, if you are working with multiple databases with different SQL dialects, Holistics Expression will help you to create dimension/measure flexibly with unified syntax that functions properly with all of your databases (and of course, with our system as well).

Where to use Holistics Expression

Holistics Expression aims to support both Data Analysts and Explorers to efficiently create calculations thus fulfilling their analytics needs. However, while Analysts work primarily in the modeling layer and prepare reusable measures/calculations, Explorers, on the other hand, view prepared reports and only explore or create calculations on-demand, so that Holistics has introduced two places where Holistics Expression can be created and used.

Business Calculation

Business Calculations can be created when exploring Dataset or Reports and Explorers are its target users. You can imagine it's like calculating 2 or multiple fields in an Excel Sheet. Its formula can be adjusted easily right in Dataset Exploration View.

Model Field Expression

In contrast, Model Field Expression can only be created in a specific model and used when exploring the related model in Dataset. The formula of Model Field Expression cannot be modified in Dataset Exploration View.

How Holistics Expression works

Holistics Expression provides a unified syntax against different SQL dialects for defining any measure (metric) on top of a set of models inside a dataset. For a simple measure which only involves a single model, a measure expression work just like an aggregate function in SQL. Behind the scene, when you use a measure in a dataset, it's translated to a simple SQL expression and put into the final query.

With more complicated measures that involve more models, since we know what you are computing, which models and relationships are involved, we can dynamically prepare these dependencies in different parts of the final query with respect to correctness and performance.

Supported Functions

Aggregators

Aggregators are functions that group values of multiple rows into a single summary value. They are equivalent to aggregation functions that SQL supports (SUM, COUNT, AVG, MAX, MIN,...).

Currently available functions:

ExpressionReturn TypeDescriptionExample
count(field)Whole NumberCounts the total number of items in a group, not including NULL values.COUNT(orders.id)
count_distinct(field)Whole NumberCounts the total number of distinct items in a group, not including NULL values.COUNTD(orders.id)
average(field)VaryAverages the values of items in a group, not including NULL values.AVG(orders.id)
min(field)VaryComputes the item in the group with the smallest numeric value.MIN(order_item.quantity)
max(field)VaryComputes the item in the group with the largest numeric value.MAX(order_item.quantity)
sum(field)NumberSums the total number of items in a group, not including NULL values.SUM(order_item.quantity)
median(field)NumberComputes the median of an expression, which is the value that the values in the expression are below 50% of the time.MEDIAN(order_item.quantity)
stdev(field)NumberReturns the standard deviation (sample) of the column created by expression unless expression defines a column of lists, in which case returns the standard deviation (sample) of each list.
stdevp(field)NumberReturns the standard deviation (population) of the column created by expression unless expression defines a column of lists, in which case returns the standard deviation (population) of each list.
var(field)NumberReturns the variance (sample) of the column created by expression unless expression defines a column of lists, in which case returns the variance (sample) of each list.
varp(field)NumberReturns the variance (population) of the column created by expression unless expression defines a column of lists, in which case returns the variance (population) of each list.

Logical Function

Logical functions return value based on some logical conditions.

Currently available functions:

ExpressionReturn typeDescription
case whenvaryThe CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement).
and()booleanLogical AND compares between two Booleans as expression and returns true when both expressions are true...
or()booleanLogical OR compares two Booleans as expression and returns true when one of the expressions is true...
not()booleannot takes a single Boolean as an argument and invert it.
in()booleanin takes a field expression and a list of values. Return true if that list of values contains the value of that field expression.

case when

The CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.

Syntax

case(when: condition_expression, then: value_expression, else: value_expression)

Example

case(
when: users.gender == 'm', then: 'male',
when: users.gender == 'f', then: 'female',
else: 'others'
)

SQL output

Case
When users.gender = 'm' then 'male'
When users.gender = 'f' then 'female'
Else 'others'
End

Return

gendercase
mmale
ffemale
mmale

and()

Logical AND compares between two Booleans as expression and returns true when both expressions are true...

Syntax

and(condition_expression, ...)

Example

and(
products.id >= 2,
products.id <= 8
)

SQL output

(
(products.id >= 2.0) AND (products.id <= 8.0)
)

Return

idand
1false
2true
8true
9false

or()

Logical OR compares two Booleans as expression and returns true when one of the expressions is true...

Syntax

or(condition_expression, ...)

Example

or(
products.id <= 2,
products.id >= 8
)

SQL output

(
(products.id <= 2.0) OR (products.id >= 8.0)
)

Return

idor
1true
4false
7false
9true

not()

not takes a single Boolean as an argument and invert it.

Syntax

not(field_expression)

Example

not(is(products.id, null))

SQL output

NOT (products.id IS NULL)

Return

idnot
1true
false
3true
4true

in()

in takes a field expression and a list of values. Return true if that list of values contains the value of that field expression.

Syntax

in(field_expression, value_expression, value...)

Example

in(users.name, 'bob', 'alice', 'jack)

SQL output

user.name in ('bob', 'alice', 'jack)

Return

namein
bobtrue
alicetrue
peterfalse

Filter Functions (Coming Soon)

The filter function is a list of filter expressions that are applied to a measure calculation. It only works with measure types that perform aggregation.

Time intelligence functions (Coming Soon)

This type of function is used to process date/time value and support both approaches:

  • Truncate a TIMESTAMP or an INTERVAL value based on a specified date part e.g., hour, week, or month
  • Extracts a subfield from a date or time value
ExpressionReturn TypeDescription
date_trunc(datetime, time_col: 'year', 'quarter', 'month', 'week', 'day', 'hour', 'minute')datetimeTruncate a TIMESTAMP on secific date part
now()datetimeFunction that returns current time
day()numberExtract the day from the same timestamp
month()numberExtract the month from the same timestamp
year()numberExtract the year from the same timestamp
quarter()numberExtract the quarter from the same timestamp
week()numberExtract the week from the same timestamp
hour()numberExtract the hour from the same timestamp
minute()numberExtract the minute from the same timestamp
week_day()numberReturn the day number of a specific date (within a week)
week_num()numberReturn the week number of a specific date (within a year)

FAQs

Holistics Expression is case sensitive

Question: Can I use CASE(WHEN:...,THEN:...,ELSE:...), AND(), OR(),...

Answer: No, Since Holistics Expression is case sensitive and we don't support capitalized letters in our Expression so the exact needs to be followed

  • case(when:...,then:...,else:...)
  • and()
  • or()
  • ...

How to create calculation with only a subset of my current data

Question: How can I create a calculation with only a subset of my current data (using condition inside an aggregate function).
For example, from the eCommerce dataset, what if I want to calculate the total value from the delivered orders only (exclude all other cancelled and refunded orders)

Answer: Since Filter function inside an Aggregate function is currently not supported, we recommend that at this moment, you can combine measure function with case when inside to calculate the data with any specific condition being applied.

sum(
case(
when: order_derived.order_status == 'delivered'
, then: order_derived.item_value
, else: null
)
)

How to handle error Divide by 0

Question: When doing division in Business Calculation (field_a/field_b), sometimes I encounter division by zero error which is obviously because my Divisor = 0. How should I handle this case?

Answer: In this case, you should add conditional expression in your divisor to return NULL whenever it has the value of 0

sum(model.field_a)
/
case(
when: sum(model.field_b) == 0
, then: null
, else: sum(model.field_b)
)