Dataset
Before reading this documentation, make sure you have a clear understanding of these concepts:
Introduction
In Holistics, a Dataset is a "container" that holds several data models so they can be explored together, and that dictates which join path is used in a particular analytics use case.
In other words, a Dataset is like a mini data mart that enables two things:
- Data Exploration: A Dataset can be shared with Explorers (non-technical users) for self-service exploration of the data.
- Creating Charts: All Charts in Holistics have to be created from a Dataset. This is done by either the Analyst or the Explorer.
UI of a dataset exploration
Creating a dataset
Creating 3.0 Dataset
To create a Dataset, go to the Datasets section under Reporting. From here you can either create Datasets directly or create a folder to organize them.
First, select all the Data Models that you want to add to your Dataset. Then, activate or deactivate relationships to enable the join paths you need.
In this example, since the models Users, Orders, and Order Items have relationships with each other, choosing Order Items as the root model lets you select Users and Orders as peripheral models.
You also have the option to create a Dataset directly from Data Models in order to explore them.
Creating 4.0 Dataset
Please refer to the AML Dataset Reference to learn more about all available parameters and their example usage.
Go to the Modeling 4.0 tab and create a new dataset in your AML project.
Your AML Dataset file will typically have the format my_dataset_name.dataset.aml.
Dataset syntax
The syntax of Dataset includes 4 main components:
- Dataset metadata: dataset labels, descriptions, owners
- Data Source reference: users' exploration activities will use this source
- Data models included
- Relationship
AML 1.0
// 1. Import relevant AML model files to your dataset
import 'path/to/model_a.model.aml' { model_a }
import 'path/to/model_b.model.aml' { model_b }

// 2. Define your Dataset's metadata
Dataset dataset_name {
  label: 'Dataset Label'
  description: "Dataset description here"
  owner: '[email protected]'
  data_source_name: 'data_source_name'

  // 3. Define your Dataset's models
  models: [
    model_a,
    model_b
  ]

  // 4. Define your Dataset's relationships
  relationships: [
    // Define many-to-one relationship
    relationship(model_a.field_name > model_b.field_name, true),
    // Define one-to-one relationship
    relationship(model_a.field_name - model_b.field_name, true)
  ]
}
AML 2.0
// You do not need import statements in AML 2.0

// 1. Define your Dataset's metadata
Dataset dataset_name {
  label: 'Dataset Label'
  description: "Dataset description here"
  owner: '[email protected]'
  data_source_name: 'data_source_name'

  // 2. Define your Dataset's models
  models: [
    model_a,
    model_b
  ]

  // 3. Define your Dataset's relationships
  relationships: [
    // Define many-to-one relationship
    relationship(model_a.field_name > model_b.field_name, true),
    // Define one-to-one relationship
    relationship(model_a.field_name - model_b.field_name, true)
  ]
}
For more information about relationship syntax, please refer to the relationship documentation.
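As an illustration, the Users, Orders, and Order Items example from the 3.0 section might look roughly like this in AML 2.0. This is only a sketch built from the template above: the dataset name, the data source name, and all field names (`id`, `user_id`, `order_id`) are hypothetical, and the second argument of `relationship(...)` is assumed to be the flag that marks the relationship as active.

```aml
// Hypothetical example: all names below are assumptions for illustration,
// not taken from a real project.
Dataset ecommerce {
  label: 'Ecommerce'
  description: "Order items with the orders and users they belong to"
  data_source_name: 'demo_db'

  // Models that Explorers can combine in one exploration
  models: [
    users,
    orders,
    order_items
  ]

  relationships: [
    // Many order items belong to one order
    relationship(order_items.order_id > orders.id, true),
    // Many orders belong to one user
    relationship(orders.user_id > users.id, true)
  ]
}
```

Following the naming convention mentioned earlier, such a file would typically be saved as ecommerce.dataset.aml.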
Explore a dataset
With Dataset, business users can ask and answer questions themselves without relying on the analyst to write SQL at every step in their exploration.
If you want to start fresh, go to the Datasets section on the Reporting page and click on a Dataset. You will be presented with the Explore screen, where you can drag in fields and measures, apply conditions, and tweak the visualizations.
When you are satisfied with the exploration result, you can save it as a widget in a dashboard.
You can also explore the result of a report or dashboard widget by clicking on it and choosing "Explore Data". For more details, please refer to Explore Data.
Share a Dataset
Just like your reports, dashboards, or KPI metric sheets, Datasets can be shared with specific users or groups. Sharing a folder automatically shares all items inside it.