dbt Integration

info

This feature is only available on Holistics 4.0 (Analytics As-Code). Please reach out if you want access to Holistics 4.0.

Introduction

dbt is a popular data transformation tool that data teams use to transform data inside the data warehouse before it reaches the BI layer. However, data teams commonly find that dbt and BI tools don't talk to each other, which leads to problems such as:

  1. Disconnected metadata at the BI layer: The field descriptions defined in dbt are not exposed to business users in the BI interface.
  2. Discontinuous data flow trigger (stale data in BI reports after a data refresh): When dbt rebuilds the underlying data tables, BI reports might still use the cached, stale version.

Holistics' "dbt integration" is designed to address these two problems.

How it works

This is a high-level description of the integration's mechanism:

  • dbt is run and generates a manifest.json file that contains metadata related to the dbt project.
  • Holistics will use the information in the manifest.json file to link the metadata to the corresponding Holistics data models with the same underlying data tables generated by dbt.

The dbt metadata that can be presented in Holistics includes table descriptions, column descriptions, and dbt's lineage diagrams.
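To make the mechanism above more concrete, here is a minimal Python sketch of reading table and column descriptions out of a dbt manifest. The sample manifest fragment, the `collect_table_metadata` helper, and the idea of keying tables by `schema.table` are illustrative assumptions; how Holistics actually matches dbt models to its data models is not specified here.

```python
import json

# A tiny, made-up manifest fragment for illustration only.
# A real manifest.json generated by dbt contains many more keys.
SAMPLE_MANIFEST = json.loads("""
{
  "nodes": {
    "model.shop.orders": {
      "resource_type": "model",
      "schema": "analytics",
      "name": "orders",
      "alias": "orders",
      "description": "One row per customer order.",
      "columns": {
        "order_id": {"name": "order_id", "description": "Primary key."},
        "amount": {"name": "amount", "description": "Order total in USD."}
      },
      "depends_on": {"nodes": ["source.shop.raw.orders"]}
    },
    "test.shop.not_null_orders_order_id": {
      "resource_type": "test",
      "schema": "analytics",
      "name": "not_null_orders_order_id"
    }
  }
}
""")


def collect_table_metadata(manifest):
    """Map 'schema.table' to the descriptions dbt recorded for that table.

    Hypothetical helper: matching by schema and table name is an assumption,
    not a description of Holistics' internal logic.
    """
    tables = {}
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other node types
        table_name = node.get("alias") or node["name"]
        key = f'{node["schema"]}.{table_name}'
        tables[key] = {
            "description": node.get("description", ""),
            "columns": {
                col_name: col.get("description", "")
                for col_name, col in node.get("columns", {}).items()
            },
            # Upstream nodes, which is the raw material for a lineage diagram
            "parents": node.get("depends_on", {}).get("nodes", []),
        }
    return tables


metadata = collect_table_metadata(SAMPLE_MANIFEST)
print(metadata["analytics.orders"]["columns"]["order_id"])
```

For a real project, you would load the file dbt writes (typically `target/manifest.json`) with `json.load` instead of using the inline sample.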

Use Cases & Benefits

  • Metadata sync: When you change metadata in a dbt model (e.g. column descriptions), you can push that metadata to Holistics to display at the BI layer.
  • Exposing dbt metadata to business users: Business users can get access to schema metadata that data teams define in dbt docs.
  • Single code repository for analytics logic: You can maintain a single GitHub repository with both the Transformation layer (dbt) and BI layer (Holistics AML).
  • Continuous flow trigger: When dbt runs and updates the underlying table data, a trigger informs Holistics, which can then refresh the data in the reports that use that model.

info

Currently, the Holistics dbt integration only supports the "connected metadata" functionality. The "data flow trigger" is not yet supported.

Getting started

Holistics supports two ways to integrate with dbt. These are:

FAQs

If I update my model's metadata in Holistics, will it propagate back to dbt docs?

No. The sync is one-way only (from dbt docs → Holistics).

What is the relationship between the dbt model and Holistics' table model for the same underlying database table?

There is no direct relationship. They are two different concepts:

  • dbt model: A SQL query that gets persisted into a database table.
  • Holistics' Table Model: An abstraction on top of an existing database table.
