Handling Nested JSON Data in Holistics
Often times when you load data from NoSQL source into Holistics, the data will be stored as JSON/nested object and nested array. This document outlines different approaches for working with them effectively.
If you use MongoDB and haven't loaded the data into a SQL database, please check out this MongoDB Reporting guide.
High-level Approach
JSON objects typically have 2 types of data structures that we need to deal with:
- Non-array JSON
- JSON array
If you are dealing with non-array JSON, the solution can be either:
- Create Custom Field that accesses the specific value inside the nested object, or
- Flatten the nested object into a separate model using Holistics SQL Query Model
If you are dealing with a JSON array, the solution is to unnest the array into a separate table, and join them along with the original table. We can do this using Holistics SQL Query Model feature as well.
Handling non-array JSON objects
With with non-array JSON, we can either:
- Create Custom Field that extracts specific values from the JSON object, or
- Flatten the nested object into a separate model using Holistics SQL Query Model.

Creating Custom Field for a specific JSON property
Using Holistics Modeling, you can create a Custom Dimension that references inside the non-array JSON column to the property you want. This field will then be exposed to others to use.
- 3.0
- 4.0
In the example below, we extract the height
field out from field size
.
{"h":14,"w":21,"uom":"cm"}
Formula for field height: ({{ #THIS.size }} ->> 'h')::numeric

Here is a step-by-step guide to create custom fields from JSON objects in Holistics 4.0.
Given the column info that contains JSON objects storing order information. Your data may look like this:
{ "customer": "Raju Kumar", "items": {"product": "coffee", "qty": 6}}
- (Optional) Create a Table Model from your table.

- In your model, extract JSON properties into Custom Fields using this syntax.
// Extract the first field, customer's name
dimension name {
label: 'Name'
type: 'text'
hidden: false
definition: @sql ({{ #SOURCE.info }} ->> 'customer') ;;
}
// This is used to access nested item
dimension order {
label: 'Order'
type: 'text'
hidden: false
definition: @sql ({{ #SOURCE.info }} -> 'items' ->> 'product') ;;
}
// This syntax is used to cast your data into numeric values
dimension quantity {
label: 'Quantity'
type: 'text'
hidden: false
definition: @sql ({{ #SOURCE.info }} -> 'items' ->> 'qty')::numeric ;;
}
- Your result should look like this.

Click to see full code example
// public_orders.model.aml
Model public_orders {
type: 'table'
label: 'Orders'
description: ''
data_source_name: 'demo'
dimension name {
label: 'Name'
type: 'text'
hidden: false
definition: @sql ({{ #SOURCE.info }} ->> 'customer') ;;
}
dimension order {
label: 'Order'
type: 'text'
hidden: false
definition: @sql ({{ #SOURCE.info }} -> 'items' ->> 'product') ;;
}
dimension quantity {
label: 'Quantity'
type: 'text'
hidden: false
definition: @sql ({{ #SOURCE.info }} -> 'items' ->> 'qty')::numeric ;;
}
table_name: '"public"."orders"'
}
Flattening your nested JSON into a separate table
A second approach would be writing a SQL model to hold your flattening logic, this basically creates a new, derived table from the original table with the flatten fields.
- 3.0
- 4.0
select
{{#inv.id}},
({{#inv.size}} ->> 'w')::numeric as width,
({{#inv.size}} ->> 'h')::numeric as height,
({{#inv.size}} ->> 'uom') as unit
from {{ #inventories_proper inv }}

Given that your orders table contains a column info that is of JSON type storing order information. Let's flatten order_info column into a separate Query Model (Query Model), order_info.

- Create a Table Model from your table if you haven't already.

- (Important) By default, the auto-generated JSON field will have the type composite. As Holistics 4.0 does not support this data type at the moment, change the type of your JSON field into text.

- Let's build a Query Model (Query Model) from this Table Model. The syntax for this model is as below.
select
({{ #public_orders1.info }} ->> 'customer') as name,
({{ #public_orders1.info }} -> 'items' ->> 'product')as product,
({{ #public_orders1.info }} -> 'items' ->> 'qty')::numeric as quantity
from {{ #public_orders1 }};;

- You should be able to successfully create the
order_info
Query Model that contains the flattened JSON object.

Click to see full code example
// order_info.model.aml
// This is a Query Model
import 'path/to/table/model/in/step/one' { public_orders as model__public_orders }
Model order_info {
type: 'query'
label: 'Order Info'
description: ''
data_source_name: 'elephant'
dimension name {
label: 'Name'
type: 'text'
hidden: false
definition: @sql {{ #SOURCE.name }};;
}
dimension product {
label: 'Product'
type: 'text'
hidden: false
definition: @sql {{ #SOURCE.product }};;
}
dimension quantity {
label: 'Quantity'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.quantity }};;
}
owner: '[email protected]'
query: @sql
select
({{ #public_orders.info }} ->> 'customer') as name,
({{ #public_orders.info }} -> 'items' ->> 'product')as product,
({{ #public_orders.info }} -> 'items' ->> 'qty')::numeric as quantity
from {{ #public_orders }};;
models: [
model__public_orders
]
}
Working with array JSON field
When you have an array JSON field, what we should do is to normalize the data, by bringing the JSON array into a proper relational table.
- 3.0
- 4.0
For example, you have a table model containing blog posts and the comments made to it. The tags
field is a JSON array:

Let's say you want to know "How many blog posts are the for each tag?".
In this case, you will need to write a SQL model (let's call it posts_tags_unnested
) to unnest the array so that each tag value is placed on a separate row. Post ID values will be repeated:
select
{{#p.id}} as post_id,
tags_unnested #>>'{}' as tags
from
{{ #posts p}},
json_array_elements({{#p.tags}}) as tags_unnested
Add relationships, and then create a dataset out of the original model and the newly created SQL model. Note that the relationship between posts.id
and posts_tags_unnested.post_id
is many to one:

To answer the question: "How many blog posts there are for each tag?", simply drag in the unnested Tags field and count distinct [Post ID] at the Dataset Explore screen.
Here's the full animated video of the steps:
Given the column info that contains JSON objects storing information of people's posts. Your data may look like this:
'{ "author": "Helen", "tags": ["home-grooming","beauty"]}’
'{ "author": "Mary", "tags": ["healthy","lifestyle"]}’
'{ "author": "Bob", "tags": ["healthy","lifestyle"]}’
Note that this JSON has a tags
field that contains a JSON array. Let's transform the data so that we can answer the question: "How many blog posts there are for each tag?".
- Create a Table Model from your table if you haven't already.

- (Important) By default, the auto-generated JSON field will have the type composite. As Holistics 4.0 does not support this data type at the moment, change the type of your JSON field into text.

- Next, extract the JSON array from the tags field into a Custom Dimension using this syntax below.
dimension tags {
label: 'Tags'
type: 'text'
hidden: false
definition: @sql
select
({{ #SOURCE.info }} -> 'tags') as tags;;
}
- Let's build a Query Model (Query Model) to unnested the values in the JSON array of the newly-created dimension above. The syntax for this model is as below.
select
{{ #public_posts.id }} as post_id,
tags_unnested #>>'{}' as tags
from
{{ #public_posts }},
json_array_elements({{ #public_posts.tags }} :: json) as tags_unnested

- Finally, create a Dataset that contains the Query Model created in step 5
post_info_unnested.model.aml
and the original Table Model in step 1public_posts.model.aml
.
Then, set-up the relationship between posts.id
and posts_info_unnested.post_id
to be many to one by adding this RelationshipConfig into the Dataset definition.

relationships: [
RelationshipConfig {
rel: Relationship {
type: 'many_to_one'
from: r(public_posts.id)
to: r(post_info_unnested.post_id)
}
active: true
}
]
- Let's see this in action. To answer the question: "How many blog posts there are for each tag?", simply drag in the unnested Tags field and count distinct [Post ID] at the Dataset Explore screen.

Click to see full code example for the Table Model used in the tutorial
// public_post.model.aml
// This is the Table Model
Model public_posts {
type: 'table'
label: 'Posts'
description: ''
data_source_name: 'elephant'
dimension id {
label: 'Id'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.id }};;
}
dimension info {
label: 'Info'
type: 'text'
hidden: false
definition: @sql {{ #SOURCE.info }};;
}
dimension tags {
label: 'Tags'
type: 'text'
hidden: false
definition: @sql
select
({{ #SOURCE.info }} -> 'tags') as tags;;
}
owner: '[email protected]'
table_name: '"public"."posts"'
}
Click to see full code example for the Query Model used in the tutorial
// post_info_unnested.model.aml
// This is the Query Model
import '/path/to/public_posts.model.aml' { public_posts as model__public_posts }
Model post_info_unnested {
type: 'query'
label: 'Post Info Unnested'
description: ''
data_source_name: 'elephant'
dimension post_id {
label: 'Post Id'
type: 'number'
hidden: false
definition: @sql {{ #SOURCE.post_id }};;
}
dimension tags {
label: 'Tags'
type: 'text'
hidden: false
definition: @sql {{ #SOURCE.tags }};;
}
owner: '[email protected]'
query: @sql
select
{{#p.id}} as post_id,
tags_unnested #>>'{}' as tags
from
{{ #public_posts p }},
json_array_elements({{#p.tags}}) as tags_unnested;;
models: [
model__public_posts
]
}
Click to see full code example for the Dataset used in the tutorial
// posts.dataset.aml
// This is the Dataset
import '/path/to/public_posts.model.aml' { public_posts }
import '/path/to/post_info_unnested.model.aml' { post_info_unnested }
Dataset posts {
label: 'Posts'
description: ''
data_source_name: 'elephant'
models: [
post_info_unnested,
public_posts
]
relationships: [
RelationshipConfig {
rel: Relationship {
type: 'many_to_one'
from: r(public_posts.id)
to: r(post_info_unnested.post_id)
}
active: true
}
]
owner: '[email protected]'
}