AML Dataset
Introduction
In Holistics, datasets are defined in .dataset.aml files. The full dataset file name has the form dataset_name.dataset.aml. The dataset definition typically contains the following information:
- Dataset metadata: dataset labels, descriptions, owners
- Data source reference: the data source that users' exploration queries will run against
- Data models included
- Relationships
- Metrics
- Dataset view definition
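As a rough sketch of how these pieces fit together, a small dataset file might look like the one below. The file name, dataset name, and the orders and users models are hypothetical and chosen only for illustration. This page does not state which parameters are strictly required, so treat this as a minimal illustration rather than a definitive template.

```aml
// ecommerce.dataset.aml (hypothetical file name)
Dataset ecommerce {
  // Dataset metadata
  label: 'Ecommerce'
  description: 'Orders and the users who placed them'

  // Data source reference
  data_source_name: 'demodb'

  // Data models included (hypothetical model names)
  models: [
    orders,
    users
  ]

  // Relationships among the included models
  relationships: [
    relationship(orders.user_id > users.id, true)
  ]
}
```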
The following section lists all current dataset parameters.
Parameter definition
Parameter name | Description |
---|---|
import | Add other files to the current dataset file (Deprecated in AML 2.0) |
dataset | Specify the dataset's unique name in the workspace |
label | Specify how the dataset will appear in the Ready-to-explore Dataset |
description | Add dataset description |
owner | Define who should be in charge of managing the current dataset |
data_source_name | Specify the data source that Holistics will execute the dataset's generated queries against |
relationships | Specify the relationships among the added models, along with their configuration |
models | Specify which models will be used in the dataset |
view | Define how models and fields are displayed in Preview / Dataset Exploration |
dimension | Define cross-model dimensions in the dataset |
metric | Define metrics to be used in the dataset |
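To make the relationships parameter more concrete, here is a hedged sketch of a single entry. It assumes that `>` expresses a many-to-one link from the field on the left to the field on the right, and that the second argument marks the relationship as active. Neither point is spelled out on this page, so verify both against the relationship documentation before relying on them.

```aml
relationships: [
  // Assumed reading: many orders point to one user, and the
  // relationship is active (second argument is true)
  relationship(ecommerce_orders.user_id > ecommerce_users.id, true)
]
```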
Dataset syntax example
Below is a sample dataset with all parameters filled out:
Dataset simple_dataset {
  label: '[Demo] Ecommerce'
  description: 'Demo dataset for E-commerce use cases test'
  owner: '[email protected]'
  data_source_name: 'demodb'

  models: [
    ecommerce_orders,
    ecommerce_order_items,
    ecommerce_users,
    ecommerce_products,
    ecommerce_categories
  ]

  relationships: [
    relationship(ecommerce_orders.user_id > ecommerce_users.id, true),
    relationship(ecommerce_order_items.order_id > ecommerce_orders.id, true),
    relationship(ecommerce_order_items.product_id > ecommerce_products.id, true),
    relationship(ecommerce_products.category_id > ecommerce_categories.id, true)
  ]

  // Dimensions definition
  dimension full_name {
    model: ecommerce_users
    type: 'text'
    label: 'Full name'
    definition: @aql concat(ecommerce_users.first_name, ' ', ecommerce_users.last_name);;
  }
  dimension age_by_year {
    model: ecommerce_users
    type: 'number'
    label: 'Age by year'
    definition: @aql date_diff('day', ecommerce_users.birth_date, @now) / 365;;
  }

  // Metrics definition
  metric count_orders {
    label: 'Count Orders'
    type: 'number'
    definition: @aql ecommerce_orders.id | count() ;;
  }
  metric sum_order_value {
    label: 'Sum Order Values'
    type: 'number'
    definition: @aql ecommerce_order_items | sum(ecommerce_order_items.quantity * ecommerce_products.price) ;;
  }
  metric average_order_value {
    label: 'Average Order Value'
    type: 'number'
    definition: @aql sum_order_value / count_orders;;
  }

  // View definition
  view {
    model ecommerce_orders { }
    model ecommerce_users { }

    // Create group relevant_models containing the Products and Categories models
    group relevant_models {
      model ecommerce_products { }
      model ecommerce_categories { }
    }
    group business_metrics {
      metric sum_order_value
      metric average_order_value
    }
  }
}
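In the sample above, average_order_value is built from two other metrics, sum_order_value and count_orders. The same pattern could be extended with further metrics, as in the sketch below, which would sit inside the same Dataset block. The metric names count_order_items and items_per_order are hypothetical, and the sketch assumes the ecommerce_order_items model exposes an id field, which the sample does not confirm.

```aml
// Hypothetical metric: counts rows of ecommerce_order_items
// (assumes the model has an `id` field)
metric count_order_items {
  label: 'Count Order Items'
  type: 'number'
  definition: @aql ecommerce_order_items.id | count() ;;
}

// Hypothetical metric composed from two other metrics,
// mirroring how average_order_value is defined
metric items_per_order {
  label: 'Items per Order'
  type: 'number'
  definition: @aql count_order_items / count_orders ;;
}
```

Composing metrics this way keeps each aggregation defined in one place, so a change to count_orders would carry through to every metric that references it.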
Example dataset UI