AML Dataset
Introduction
In Holistics, datasets are defined in .dataset.aml files. The full dataset file name has the form dataset_name.dataset.aml. The dataset definition typically contains the following information:
- Dataset metadata: dataset labels, descriptions, owners
- Data source reference: the data source that users' explorations run against
- Data models included
- Relationships
- Metrics
- Dataset view definition
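As a rough sketch of how these pieces map onto a file, the skeleton below uses only parameters that appear in the examples later on this page. The file name `ecommerce.dataset.aml`, the label, and the description text are illustrative assumptions, not a required layout.

```aml
// File: ecommerce.dataset.aml (hypothetical name following dataset_name.dataset.aml)
Dataset ecommerce {
  // Dataset metadata
  label: 'Ecommerce'
  description: 'Orders and users for exploration'

  // Data source reference
  data_source_name: 'demodb'

  // Data models included
  models: [
    ecommerce_orders,
    ecommerce_users
  ]

  // Relationships between the included models
  relationships: [
    relationship(ecommerce_orders.user_id > ecommerce_users.id, true)
  ]
}
```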
The following section lists all current dataset parameters.
Parameter definition
| Parameter name | Description |
|---|---|
| dataset | Specify the dataset's unique name in the workspace |
| label | Specify how the dataset appears in the Ready-to-explore Dataset |
| description | Add a description of the dataset |
| owner | Define who is in charge of managing the dataset |
| data_source_name | Specify the database against which Holistics executes the queries generated from the dataset |
| relationships | Specify the relationships among the included models and their configuration |
| models | Specify which models are used in the dataset |
| view | Define how models and fields are displayed in Preview / Dataset Exploration |
| dimension | Define cross-model dimensions in the dataset |
| metric | Define metrics to be used in the dataset |
| context | Configure analysis interactions for the dataset, including breakdown dimension lists and underlying data views |
| settings | Configure dataset-level settings, such as enabling or disabling analysis interactions |
| pre_aggregates | Define pre-aggregated tables for Aggregate Awareness, with built-in or external persistence |
| permission | Define row-level permission rules that filter data based on user attributes (coming soon) |
Dataset syntax examples
Core: metadata, models, and relationships
Every dataset starts with metadata and declares which models and relationships to include:
Dataset ecommerce {
  label: '[Demo] Ecommerce'
  description: 'Demo dataset for E-commerce use cases'
  owner: '[email protected]'
  data_source_name: 'demodb'
  models: [
    ecommerce_orders,
    ecommerce_order_items,
    ecommerce_users,
    ecommerce_products,
    ecommerce_categories
  ]
  relationships: [
    relationship(ecommerce_orders.user_id > ecommerce_users.id, true),
    relationship(ecommerce_order_items.order_id > ecommerce_orders.id, true),
    relationship(ecommerce_order_items.product_id > ecommerce_products.id, true),
    relationship(ecommerce_products.category_id > ecommerce_categories.id, true)
  ]
}
Dimensions and metrics
You can define cross-model dimensions and metrics directly in the dataset using AQL expressions. For full details, see Dataset Fields.
Dataset ecommerce {
  // ... models and relationships omitted
  // Cross-model dimension
  dimension full_name {
    model: ecommerce_users
    type: 'text'
    label: 'Full Name'
    definition: @aql concat(ecommerce_users.first_name, ' ', ecommerce_users.last_name) ;;
  }
  // Simple aggregation
  metric count_orders {
    label: 'Count Orders'
    type: 'number'
    definition: @aql count(ecommerce_orders.id) ;;
  }
  // Cross-model aggregation
  metric sum_order_value {
    label: 'Sum Order Values'
    type: 'number'
    definition: @aql sum(ecommerce_order_items, ecommerce_order_items.quantity * ecommerce_products.price) ;;
  }
  // Derived metric referencing other metrics
  metric average_order_value {
    label: 'Average Order Value'
    type: 'number'
    definition: @aql sum_order_value / count_orders ;;
  }
}
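The derived-metric pattern extends to other ratios. The sketch below assumes two hypothetical metrics, `count_users` and `orders_per_user`, that are not part of the example above. It reuses `count_orders` and only the AQL `count` function already shown.

```aml
Dataset ecommerce {
  // ... models, relationships, and count_orders omitted

  // Hypothetical metric: number of users
  metric count_users {
    label: 'Count Users'
    type: 'number'
    definition: @aql count(ecommerce_users.id) ;;
  }

  // Hypothetical derived metric that divides one dataset metric by another
  metric orders_per_user {
    label: 'Orders per User'
    type: 'number'
    definition: @aql count_orders / count_users ;;
  }
}
```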
View, context, and settings
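Use view to organize how models and fields appear in the exploration UI, context to configure the breakdown dimension lists used for drill-down and the underlying data views, and settings to enable or disable analysis interactions. The example below also includes a pre_aggregates definition for Aggregate Awareness and a permission rule (coming soon).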
Dataset ecommerce {
  // ... models, relationships, dimensions, metrics omitted
  // Organize the exploration UI
  view {
    model ecommerce_orders { }
    model ecommerce_users { }
    group relevant_models {
      model ecommerce_products { }
      model ecommerce_categories { }
    }
    group business_metrics {
      metric sum_order_value
      metric average_order_value
    }
  }
  // Configure analysis interactions
  context {
    analysis {
      // Breakdown dimension groups for drill-down
      breakdown {
        group location {
          label: 'Locations'
          fields: [
            r(ecommerce_users.country),
            r(ecommerce_users.city),
          ]
        }
        group product {
          label: 'Products'
          fields: [
            r(ecommerce_products.category),
            r(ecommerce_products.name),
          ]
        }
      }
      // Underlying data views
      underlying_data {
        metric count_orders {
          view list_of_orders {
            label: 'List of Orders'
            fields: [
              r(ecommerce_orders.id),
              r(ecommerce_orders.created_date),
              r(ecommerce_orders.status),
              r(ecommerce_users.full_name),
            ]
          }
        }
      }
    }
  }
  // Dataset-level settings
  settings {
    analysis_interactions {
      breakdown {
        enabled: true
      }
      view_underlying_data {
        enabled: true
      }
    }
  }
  // Pre-aggregated table for Aggregate Awareness
  pre_aggregates: {
    agg_orders: PreAggregate {
      dimension created_at_day {
        for: r(ecommerce_orders.created_at)
        time_granularity: "day"
      }
      dimension status {
        for: r(ecommerce_orders.status)
      }
      measure count_orders {
        for: r(ecommerce_orders.id)
        aggregation_type: 'count'
      }
      persistence: FullPersistence {
        schema: 'persisted'
      }
    }
  }
  // Row-level permission (coming soon)
  permission regional_access {
    field: r(ecommerce_orders.region)
    operator: 'matches_user_attribute'
    value: 'region'
  }
}
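Because each analysis interaction carries its own `enabled` flag, one can presumably be switched off while the other stays on. This is a minimal sketch, assuming that `enabled: false` is how an interaction is disabled (the example above only shows `true`):

```aml
Dataset ecommerce {
  // ... other dataset parameters omitted
  settings {
    analysis_interactions {
      breakdown {
        enabled: true
      }
      // Assumption: setting enabled to false turns this interaction off
      view_underlying_data {
        enabled: false
      }
    }
  }
}
```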