DataHub integration

Early access

This feature is currently in development and not yet available. Contact Holistics to sign up for early access.

Introduction

DataHub is an open-source metadata platform that helps organizations discover, understand, and govern their data assets. By integrating Holistics with DataHub, you can catalog your BI layer alongside your data warehouse tables, creating a unified view of your entire data stack.

This integration automatically ingests Holistics models, datasets, dashboards, and charts into DataHub, along with the lineage relationships between them. This enables data teams to answer questions like "which dashboards will be affected if I change this database table?" or "where does this metric come from?"

The connector consumes the canonical output of the `holistics aml lineage` command: a graph of AML-native concepts (models, datasets, dashboards, viz blocks, and source tables) with typed edges between them. The connector then maps the subset relevant to DataHub into DataHub entities.

How it works

The integration uses a git-based approach similar to DataHub's LookML connector. Since Holistics is fully as-code with all assets defined in AML files, the connector reads your AML project directly without needing API access.

Here's how the ingestion process works:

1. Get your AML project - Either clone from a git repository or use a local directory
2. Run the Holistics CLI - The `holistics aml lineage` command compiles your AML files and outputs a canonical lineage graph
3. Parse and transform - The connector parses the canonical nodes and edges and maps Holistics entities to DataHub entities
4. Emit to DataHub - Metadata is pushed to your DataHub instance via the standard ingestion framework

This approach ensures that the metadata in DataHub always reflects what's defined in your AML code, while keeping the CLI output aligned to AML concepts instead of a DataHub-specific schema.
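To make the "parse and transform" step more concrete, here is a minimal Python sketch of reading a lineage graph and grouping its nodes by type. The JSON shape (`nodes`, `edges`, `id`, `type`, `source`, `target`) and all identifiers in the sample are assumptions made for illustration; the actual output schema of `holistics aml lineage` is not described on this page and may differ.

```python
import json
from collections import defaultdict

# Hypothetical lineage output. The real schema may differ.
SAMPLE_LINEAGE = """
{
  "nodes": [
    {"id": "table:sales.orders", "type": "source_table"},
    {"id": "model:orders", "type": "model"},
    {"id": "dataset:ecommerce", "type": "dataset"},
    {"id": "dashboard:revenue", "type": "dashboard"},
    {"id": "viz:revenue/monthly", "type": "viz_block"}
  ],
  "edges": [
    {"source": "table:sales.orders", "target": "model:orders", "type": "model_source"},
    {"source": "model:orders", "target": "dataset:ecommerce", "type": "dataset_model"},
    {"source": "model:orders", "target": "viz:revenue/monthly", "type": "viz_model"},
    {"source": "viz:revenue/monthly", "target": "dashboard:revenue", "type": "dashboard_viz"}
  ]
}
"""


def parse_lineage(raw):
    """Return (node ids grouped by type, list of (source, target, type) edges)."""
    graph = json.loads(raw)
    nodes_by_type = defaultdict(list)
    for node in graph["nodes"]:
        nodes_by_type[node["type"]].append(node["id"])
    edges = [(e["source"], e["target"], e["type"]) for e in graph["edges"]]
    return dict(nodes_by_type), edges


nodes_by_type, edges = parse_lineage(SAMPLE_LINEAGE)
```

Keeping the parsed graph in AML terms at this stage mirrors the design described above: translation into DataHub entities happens in a separate step, so the CLI output does not need to know about DataHub.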

What gets synced

The connector maps Holistics concepts to DataHub entities as follows:

Holistics Concept	AML File	DataHub Entity	Subtype
Model	`.model.aml`	Dataset	View
Dataset	`.dataset.aml`	Dataset	Explore
Dashboard	`.page.aml`	Dashboard	-
VizBlock (chart)	(within page)	Chart	-
Dimension	(field in model)	SchemaField	Tagged `holistics:dimension`
Measure	(field in model)	SchemaField	Tagged `holistics:measure`

For each model, the connector extracts schema information including field names, types, descriptions, and whether fields are dimensions or measures. This metadata appears in DataHub's schema tab, helping users understand the semantic layer without leaving the data catalog.
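The mapping table above can be read as a simple lookup. The following Python sketch encodes it as data; the node-type keys (`model`, `viz_block`, and so on) and the helper names are hypothetical, while the entity names, subtypes, and tags are taken from the table.

```python
from typing import NamedTuple, Optional


class DataHubTarget(NamedTuple):
    entity: str             # DataHub entity type
    subtype: Optional[str]  # DataHub subtype, or None where the table shows "-"


# Keys are assumed labels for Holistics concepts, not confirmed CLI output values.
ENTITY_MAP = {
    "model": DataHubTarget("Dataset", "View"),
    "dataset": DataHubTarget("Dataset", "Explore"),
    "dashboard": DataHubTarget("Dashboard", None),
    "viz_block": DataHubTarget("Chart", None),
}

# Model fields become SchemaField entries tagged by their kind.
FIELD_TAGS = {
    "dimension": "holistics:dimension",
    "measure": "holistics:measure",
}


def map_concept(concept):
    """Look up the DataHub target for a Holistics concept; None if unmapped."""
    return ENTITY_MAP.get(concept)


def tag_for_field(kind):
    """Return the tag applied to a schema field of the given kind."""
    return FIELD_TAGS[kind]
```

For example, `map_concept("dataset")` yields a `Dataset` entity with the `Explore` subtype, and `tag_for_field("measure")` yields `holistics:measure`.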

Lineage

One of the most valuable aspects of this integration is automatic lineage extraction. The connector builds lineage at multiple levels:

Dashboard to Charts - Each chart is linked to its parent dashboard
Charts to Models - Charts reference the specific model fields they visualize
Datasets to Models - Datasets are linked to all the models they include
Models to Source Tables - Table models are connected to their underlying database tables

The canonical lineage graph may also contain additional AML concepts, such as non-viz dashboard blocks or filter-block lineage. The DataHub connector intentionally ignores concepts that do not map to current DataHub entities, rather than requiring the CLI to omit them.

This multi-level lineage enables powerful impact analysis. When someone wants to modify a database table, they can trace through DataHub to see exactly which Holistics models, datasets, and dashboards depend on it.
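As an illustration of this kind of impact analysis, the sketch below walks a small lineage graph breadth-first to collect everything downstream of a source table. The edge list and node identifiers are invented for the example, and the traversal is a generic graph search, not the connector's or DataHub's implementation.

```python
from collections import defaultdict, deque

# Hypothetical (upstream, downstream) pairs covering the four lineage levels.
EDGES = [
    ("table:sales.orders", "model:orders"),
    ("table:sales.users", "model:users"),
    ("model:orders", "dataset:ecommerce"),
    ("model:users", "dataset:ecommerce"),
    ("model:orders", "chart:monthly_revenue"),
    ("chart:monthly_revenue", "dashboard:revenue"),
    ("model:users", "chart:signups"),
    ("chart:signups", "dashboard:growth"),
]


def downstream_of(start, edges):
    """Return every node reachable from `start` by following edges downstream."""
    children = defaultdict(list)
    for upstream, downstream in edges:
        children[upstream].append(downstream)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


affected = downstream_of("table:sales.orders", EDGES)
```

In this sample, changing `table:sales.orders` affects the `orders` model, the `ecommerce` dataset, the `monthly_revenue` chart, and the `revenue` dashboard, but not the `growth` dashboard, which depends only on `table:sales.users`.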

Connection mapping required

To establish lineage from Holistics models to your source database tables, you need to configure connection mapping. This tells the connector how to translate Holistics data source names (like `bigquery_prod`) to DataHub platform identifiers. See the setup guide for details.
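Conceptually, connection mapping is a lookup from a Holistics data source name to a DataHub platform, which is then used to build the dataset URN for the source table. The Python sketch below shows the idea; the dictionary, function name, and sample table name are hypothetical and do not reflect the connector's actual configuration keys, which are covered in the setup guide. The URN layout follows DataHub's standard dataset URN format.

```python
# Hypothetical mapping from Holistics data source name to DataHub platform.
CONNECTION_TO_PLATFORM = {
    "bigquery_prod": "bigquery",
}


def source_table_urn(data_source, table_name, env="PROD"):
    """Build a DataHub dataset URN for a source table.

    Returns None when the data source has no mapping, in which case
    lineage to the source table cannot be established.
    """
    platform = CONNECTION_TO_PLATFORM.get(data_source)
    if platform is None:
        return None
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{table_name},{env})"


urn = source_table_urn("bigquery_prod", "my-project.sales.orders")
```

Without an entry for a given data source, there is no way to know which DataHub platform a table belongs to, which is why the mapping is required for model-to-table lineage.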

Getting started

Ready to set up the integration? Head to the setup guide for step-by-step instructions on installing the connector and configuring your first ingestion.
