Dataset

Introduction

In Holistics, a Dataset is a "container" that holds several data models so they can be explored together, and that determines which join path is used in a particular analytics use case.

In other words, a Dataset is like a mini data mart that enables two things:

  • Data Exploration: A Dataset can be shared with Explorers (non-technical users) so they can do self-service exploration of the data.
  • Creating Charts: Every Chart in Holistics has to be created from a Dataset, either by an Analyst or by an Explorer.

UI of a dataset exploration

Create a dataset

Go to the Modeling 4.0 tab, click the + symbol, and select Add Dataset.

A new screen will appear where you can add details about your dataset and select the models to include.

After you click Create Dataset, the new dataset will appear in your folder tree as a file with the following format: my_dataset_name.dataset.aml.

Dataset view modes

When working on a dataset, you will be presented with three view modes:

  • List mode: This mode displays information about the dataset visually and provides a graphical UI for basic actions like adding or removing models and relationships.

  • Code mode: In this mode, you will see the code representation of all the elements shown in List mode. Through this mode, you can define more complex features of the dataset, such as Dataset Views and Dataset Metrics. All changes made in List mode are reflected in Code mode, and vice versa.

  • Data mode: In this mode, you can explore the data with the Data Exploration UI to quickly check whether your dataset definitions work as expected.
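
To illustrate how List mode and Code mode mirror each other, here is a minimal sketch of what Code mode might show after two models and one relationship have been added in List mode. The names sales, orders, users, user_id, and id are hypothetical, and the second argument of relationship() is simply copied from the syntax template on this page.

```aml
// Hypothetical dataset: all names below are assumptions for illustration
Dataset sales {
  label: 'Sales'

  // Each model added in List mode appears as an entry here
  models: [
    orders,
    users
  ]

  // Each relationship added in List mode appears as an entry here
  relationships: [
    // Many-to-one: many orders point to one user
    relationship(orders.user_id > users.id, true)
  ]
}
```

Removing the users entry from the models list in Code mode would, by the same logic, remove that model from List mode.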

Dataset definition syntax

Please refer to the AML Dataset Reference to learn more about all available parameters and their example usage.

In general, a dataset has the following components:

  1. Dataset metadata: dataset labels, descriptions, owners, and the data source that the dataset will query from.
  2. Data models included in the dataset
  3. Relationships between data models
  4. Metrics: definition of cross-model aggregations

Putting it all together, a dataset definition has the following form:

// 1. Define your Dataset's metadata
Dataset dataset_name {
  label: 'Dataset Label'
  description: "Dataset description here"
  owner: '[email protected]'
  data_source_name: 'data_source_name'

  // 2. Define your Dataset's models
  models: [
    model_a,
    model_b
  ]

  // 3. Define your Dataset's relationships
  relationships: [
    // Define many-to-one relationship
    relationship(model_a.field_name > model_b.field_name, true),

    // Define one-to-one relationship
    relationship(model_a.field_name - model_b.field_name, true)
  ]

  // 4. Define AQL Metrics here
  metric aql_metric1 {
    label: 'AQL Metric Label'
    // Choose one of the following types
    type: 'text' | 'number' | 'datetime' | 'date' | 'truefalse'
    description: 'AQL Metric description'
    definition: @aql
      // AQL definition here
    ;;
  }

  metric aql_metric2 {
    // Metric definition here
  }
}
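
As a concrete illustration of the template above, the following sketch fills in each of the four components. It assumes two hypothetical models, orders and users, a hypothetical data source named demo_db, and a simple count as the AQL expression. None of these names come from a real project, so adapt them to your own models and check the AML Dataset Reference for the exact parameters.

```aml
// Hypothetical example: model, field, and data source names are assumptions
Dataset ecommerce {
  // 1. Metadata
  label: 'E-commerce'
  description: "Orders joined to the users who placed them"
  data_source_name: 'demo_db'

  // 2. Models
  models: [
    orders,
    users
  ]

  // 3. Relationships
  relationships: [
    // Many-to-one: many orders belong to one user
    relationship(orders.user_id > users.id, true)
  ]

  // 4. Metrics
  metric total_orders {
    label: 'Total Orders'
    type: 'number'
    description: 'Number of orders'
    definition: @aql
      count(orders.id)
    ;;
  }
}
```

Because the metric is defined at the dataset level rather than on a single model, it can be combined with fields from either orders or users during exploration.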

Explore a dataset

With a Dataset, business users can ask and answer their own questions without relying on an analyst to write SQL at every step of their exploration.

If you want to start fresh, go to the Dataset section on the Reporting page and click on a Dataset. You will be presented with the Explore screen, where you can drag in fields and measures, apply conditions, and tweak the visualizations.

When you are satisfied with the exploration result, you can save it as a widget in a dashboard.

You can also explore the result of a report or dashboard widget by clicking on it and choosing "Explore Data". For more details, please refer to Explore Data.

Share a Dataset

Just like your reports, dashboards, or KPI metric sheets, Datasets can be shared with specific users or groups. Sharing a folder automatically shares all items inside it.

