AML Dataset

Introduction

In Holistics, datasets are defined in .dataset.aml files. The full dataset file name has the form dataset_name.dataset.aml. The dataset definition typically contains the following information:

  • Dataset metadata: label, description, and owner
  • Data source reference: the data source that queries generated from users' explorations run against
  • Models included in the dataset
  • Relationships between those models
  • Cross-model dimensions and metrics
  • Dataset view definition

The following section lists all current dataset parameters.

Parameter definition

| Parameter name | Description |
| --- | --- |
| import | Add other files to the current dataset file (deprecated in AML 2.0) |
| dataset | Specify the dataset's unique name in the workspace |
| label | Specify how the dataset will appear in the Ready-to-explore Dataset |
| description | Add a description of the dataset |
| owner | Define who is in charge of managing the current dataset |
| data_source_name | Specify the database that Holistics will execute the dataset's generated queries against |
| relationships | Specify the relationships among the added models and their configuration |
| models | Specify which models will be used in the dataset |
| view | Define how models and fields are displayed in Preview / Dataset Exploration |
| dimension | Define cross-model dimensions in the dataset |
| metric | Define metrics to be used in the dataset |
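
As a starting point, the sketch below uses only a handful of the parameters listed above. It is a minimal illustration, not a statement of which parameters are required. The dataset name minimal_ecommerce and its label are hypothetical. The models, data source, and relationship are borrowed from the full sample in this document.

```aml
// Hypothetical minimal dataset: two models joined by one relationship
Dataset minimal_ecommerce {
  label: 'Minimal Ecommerce'
  description: 'Two models and one relationship'
  data_source_name: 'demodb'

  models: [
    ecommerce_orders,
    ecommerce_users
  ]

  relationships: [
    // Same form as the full sample: orders.user_id points at users.id
    relationship(ecommerce_orders.user_id > ecommerce_users.id, true)
  ]
}
```

Following the naming rule from the introduction, this definition would live in a file named minimal_ecommerce.dataset.aml.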

Dataset syntax example

Below is a sample dataset with all parameters except the deprecated import filled out:

Dataset simple_dataset {
  label: '[Demo] Ecommerce'
  description: 'Demo dataset for E-commerce use cases test'
  owner: '[email protected]'
  data_source_name: 'demodb'

  models: [
    ecommerce_orders,
    ecommerce_order_items,
    ecommerce_users,
    ecommerce_products,
    ecommerce_categories
  ]

  relationships: [
    relationship(ecommerce_orders.user_id > ecommerce_users.id, true),
    relationship(ecommerce_order_items.order_id > ecommerce_orders.id, true),
    relationship(ecommerce_order_items.product_id > ecommerce_products.id, true),
    relationship(ecommerce_products.category_id > ecommerce_categories.id, true)
  ]

  // Dimensions definition
  dimension full_name {
    model: ecommerce_users
    type: 'text'
    label: 'Full name'
    definition: @aql concat(ecommerce_users.first_name, ' ', ecommerce_users.last_name);;
  }

  dimension age_by_year {
    model: ecommerce_users
    type: 'number'
    label: 'Age by year'
    definition: @aql date_diff('day', ecommerce_users.birth_date, @now) / 365;;
  }

  // Metrics definition
  metric count_orders {
    label: 'Count Orders'
    type: 'number'
    definition: @aql ecommerce_orders.id | count() ;;
  }

  metric sum_order_value {
    label: 'Sum Order Values'
    type: 'number'
    definition: @aql ecommerce_order_items | sum(ecommerce_order_items.quantity * ecommerce_products.price) ;;
  }

  // Metrics can reference other metrics by name
  metric average_order_value {
    label: 'Average Order Value'
    type: 'number'
    definition: @aql sum_order_value / count_orders;;
  }

  // View definition
  view {
    model ecommerce_orders { }
    model ecommerce_users { }

    // Create group "relevant_models" containing the Products and Categories models
    group relevant_models {
      model ecommerce_products { }
      model ecommerce_categories { }
    }

    // Create group "business_metrics" containing two of the metrics defined above
    group business_metrics {
      metric sum_order_value
      metric average_order_value
    }
  }
}
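
The metric pattern in the sample can be extended in the same way. The fragment below is a sketch that assumes the models and the count_orders metric from the sample dataset. The names count_users and orders_per_user are hypothetical and do not appear in the original sample. The definitions reuse only the AQL forms already shown there: a count over a model's id field, and division of one metric by another.

```aml
// Hypothetical additions; place inside the Dataset block next to the other metrics
metric count_users {
  label: 'Count Users'
  type: 'number'
  definition: @aql ecommerce_users.id | count() ;;
}

// Combines two metrics by name, as average_order_value does in the sample
metric orders_per_user {
  label: 'Orders per User'
  type: 'number'
  definition: @aql count_orders / count_users ;;
}
```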

Example dataset UI

[Screenshot: dataset-aml]
