Integrate dbt with Holistics

info

This feature is only available in Holistics 4.0 (Holistics Analytics As-Code). Please reach out if you want early access to Holistics 4.0.

dbt-to-holistics

Introduction

dbt is a popular data transformation tool that data teams use to transform data inside the data warehouse before it reaches the BI layer. A common problem, however, is that dbt and BI tools don't talk to each other. This leads to problems like:

  1. Disconnected metadata at the BI layer: The table fields' descriptions defined in dbt are not exposed to business users in the BI interface.
  2. Discontinuous data flow trigger (stale data in BI reports when data refreshes): When underlying data tables get rebuilt by dbt, the BI reports might still be using the cached, stale version.

Holistics provides a dbt integration designed to address these two problems.

Use Cases & Benefits:

  • Metadata sync: When you change metadata in a dbt model (e.g. column descriptions), you can push that metadata to Holistics to display at the BI layer.
  • Exposing dbt metadata to business users: Business users can access the schema metadata that data teams define in dbt docs.
  • Single code repository for analytics logic: You can maintain a single GitHub repository with both the transformation layer (dbt) and the BI layer (Holistics AML).
  • Continuous flow trigger: When dbt runs and the underlying table data is updated, a trigger informs Holistics, which can then refresh the data in reports that use that model.

Notes:

  • Currently, the Holistics dbt integration only supports the "connected metadata" functionality. The "data flow trigger" is not yet supported.
  • This guide assumes you are familiar with dbt and its concepts.

High-level Concept

How this works at a high level:

  • dbt runs and generates a manifest.json file that contains metadata related to the dbt project.
  • Users push dbt's manifest.json file to Holistics using the Holistics CLI.
  • Holistics links the metadata to the Holistics data models that sit on the same underlying data tables generated by dbt.

The dbt metadata that can be presented in Holistics includes table descriptions, column descriptions, and dbt's lineage diagrams.

dbt-high-level-mechanism
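To make the linking idea more concrete, here is a minimal Python sketch of the kind of metadata a manifest.json carries. It relies on the manifest's `nodes` section and its `resource_type`, `schema`, `name`, `alias`, `description`, and `columns` fields. The `collect_descriptions` helper, the `schema.table` matching key, and the sample data are assumptions made for illustration only; this guide does not document how Holistics performs the matching internally.

```python
from typing import Dict


def collect_descriptions(manifest: dict) -> Dict[str, dict]:
    """Map each dbt model's 'schema.table' name to its descriptions."""
    tables = {}
    for node in manifest.get("nodes", {}).values():
        # Keep only models; tests, seeds, and other node types are skipped here.
        if node.get("resource_type") != "model":
            continue
        # dbt builds the relation under the alias when one is set.
        relation = node.get("alias") or node["name"]
        key = f"{node['schema']}.{relation}"
        tables[key] = {
            "description": node.get("description", ""),
            "columns": {
                name: column.get("description", "")
                for name, column in node.get("columns", {}).items()
            },
        }
    return tables


# Hypothetical manifest fragment, trimmed to the fields used above.
sample_manifest = {
    "nodes": {
        "model.my_project.users_summary": {
            "resource_type": "model",
            "schema": "bi",
            "name": "users_summary",
            "description": "One row per user.",
            "columns": {
                "id": {"name": "id", "description": "Primary key."},
                "full_name": {
                    "name": "full_name",
                    "description": "Full name of the user.",
                },
            },
        },
        "test.my_project.not_null_users_summary_id": {
            "resource_type": "test",
            "schema": "dbt_test__audit",
            "name": "not_null_users_summary_id",
            "columns": {},
        },
    }
}

descriptions = collect_descriptions(sample_manifest)
print(descriptions["bi.users_summary"]["columns"]["full_name"])
# prints: Full name of the user.
```

For a real project, you could load the file with `json.load` and pass the resulting dictionary to the same helper.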

Setting up

Step 1: Install & Configure Holistics CLI

Currently, Holistics only supports pushing dbt metadata via the CLI.

Please follow this doc to set up the CLI for Holistics. Make sure you generate an API Access Key and authenticate the CLI with that key.

Step 2: Pushing manifest.json file to Holistics

manifest.json is a JSON file generated by the dbt build process (by default, dbt writes it to your project's target/ directory). The file contains all metadata related to your dbt project.

Run your dbt build/compile to generate the manifest.json file:

$ dbt compile
# or `dbt run`

Once you have the latest manifest file generated, run:

$ holistics dbt upload --file-path manifest.json --data-source=your_ds_name

The command above requires two parameters:

  • --file-path: path to your manifest.json file
  • --data-source: your data source name in Holistics

Each manifest file will be linked to a specific data source in Holistics.
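If you prefer to script the two commands above, the following Python sketch wraps them with `subprocess`. The helper names `build_upload_command` and `compile_and_upload` are hypothetical, and the sketch assumes both `dbt` and `holistics` are on your PATH, that the CLI is already authenticated, and that dbt writes the manifest to its default `target/` directory.

```python
import subprocess
from pathlib import Path
from typing import List


def build_upload_command(manifest_path: str, data_source: str) -> List[str]:
    """Build the upload command shown in this guide as an argument list."""
    return [
        "holistics", "dbt", "upload",
        "--file-path", manifest_path,
        f"--data-source={data_source}",
    ]


def compile_and_upload(project_dir: str, data_source: str) -> None:
    """Regenerate manifest.json, then push it to Holistics."""
    # check=True raises CalledProcessError if either command fails,
    # so a failed compile never uploads an outdated manifest.
    subprocess.run(["dbt", "compile"], cwd=project_dir, check=True)
    # Assumption: dbt's default output location, <project>/target/manifest.json.
    manifest = Path(project_dir) / "target" / "manifest.json"
    subprocess.run(build_upload_command(str(manifest), data_source), check=True)
```

Calling `compile_and_upload(".", "your_ds_name")` from the dbt project root would run both steps in order.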

Once manifest.json has been pushed successfully, you can check its status in the Holistics admin panel.

dbt-admin

Step 3: Create the table models in Holistics UI

At this stage, all of your metadata from dbt has been propagated into Holistics. Table Models that share the same data tables with dbt will automatically display the metadata defined in dbt docs.

If you don't have any models built on dbt tables yet, you can create a table model as normal. Holistics will auto-detect whether that table is linked to a dbt model.

Consider the following users_summary Holistics model that sits on top of the users_summary table (created by dbt):

Model users_summary {
  data_source: ''
  table: 'bi.users_summary'

  dimension id {
    type: 'number'
  }
  dimension username {
    type: 'text'
  }
  dimension full_name {
    type: 'text'
  }
}

When viewed in the Holistics UI, the descriptions of the fields are automatically pulled in from the dbt metadata.

Center panel

Overriding field's description

If you want to set a custom description that overrides the one defined in dbt, set the description attribute of the field.

Model users_summary {
  table: 'bi.users_summary'
  ...

  dimension full_name {
    ...
    description: 'Full name of the user' # This will override the value from dbt
  }
}

dbt-meta-overridden

FAQs

If I update my model's metadata in Holistics, will it propagate back to dbt docs?

No. The direction is only one way (from dbt docs → Holistics).

What is the relationship between the dbt model and Holistics' table model for the same underlying database table?

No direct relationship. Remember that these are two different concepts:

  • dbt model: A SQL query that gets persisted into a database table.
  • Holistics' Table Model: An abstraction on top of an existing database table.

I have updated my dbt metadata. Will it automatically sync with Holistics?

No, this is not supported yet.

If you have updated your dbt metadata, you need to re-push your latest manifest.json to Holistics. Once that is done, the dbt metadata in Holistics is updated automatically.

We recommend setting up a re-sync strategy on your side so that manifest.json is pushed to Holistics automatically, either on a regular schedule or on a trigger.
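One possible trigger-based strategy is to re-push only when the descriptions have actually changed. The Python sketch below is an illustration under stated assumptions, not a Holistics feature: `description_fingerprint` and `needs_push` are hypothetical helpers. It hashes only the descriptions rather than the whole file, because manifest.json also records a generation timestamp under `metadata`, so the raw file differs on every dbt run.

```python
import hashlib
import json
from typing import Optional


def description_fingerprint(manifest: dict) -> str:
    """Hash only the table and column descriptions found in the manifest."""
    described = {
        unique_id: {
            "description": node.get("description", ""),
            "columns": {
                name: column.get("description", "")
                for name, column in node.get("columns", {}).items()
            },
        }
        for unique_id, node in manifest.get("nodes", {}).items()
    }
    # sort_keys makes the serialized form independent of key order.
    payload = json.dumps(described, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_push(manifest: dict, last_fingerprint: Optional[str]) -> bool:
    """Return True when nothing was pushed before or descriptions changed."""
    return last_fingerprint != description_fingerprint(manifest)
```

A scheduled job could store the fingerprint of the last pushed manifest in a small file, call `needs_push` after each dbt run, and run the upload command only when it returns True.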

I am using dbt Cloud, how can I integrate with Holistics?

Unfortunately, dbt Cloud is not supported at the moment. For now, you need to set up the dbt CLI to integrate with Holistics.