Skip to main content

Consolidate data from different sources

To answer certain data questions for your organization, sometimes you need to pull data from different sources (CSV, Google Analytics, Google Spreadsheet or production data) to do analytics

At Holistics, Import Models are models created to load data from other sources (e.g. CSV, Google Analytics, etc.) to your current Data Warehouse. In this tutorial, we will walk you through simple steps to create Import Model and consolidate your Data

Import Data

To import data into your data warehouse, please follow these three steps:

1. Choose which Data Warehouse you want to import data into

At this step, the pre-requisite is that you have already connected to your Data Warehouse. Then, go to Data Modeling tab and select the Data Warehouse as the destination you will import other data into.

import

2. Create a new Model from Data Import

In your Data Warehouse, just click Create → Select Data Import, then select a source from the list we are supporting.

import data

As shown in the list, you can import data from your SQL database (PostgreSQL, BigQuery...), no-SQL database (MongoDB...), applications (Pipedrive, Google Analytics, Facebook Ads, Google Spreadsheet...) or a File (CSV).

If you cannot find your sources on our list, please contact us at [email protected].

3. Source Setup

Please note that the setup process might be slightly different between SQL, non-SQL databases and applications. To check the difference, please refer to our Integration document.

For example, if you want to load data from Google Spreadsheet, simply paste the Spreadsheet link in, click Validate URL and choose a table/sheet you want to load into your Data Warehouse.

import

You can also preview the structure of the destination table before proceeding.

At this step, you can edit the destination table's schema and name, as well as set Refresh Schedule and Import Mode. For more information about the Advanced Settings, please check our doc.

Please note that by default, all the columns in your Google Spreadsheet will be cast into TEXT/STRING type when loading to your databases. You can change the column type in Sync Configuration.

import

That is just an example of Google Spreadsheet integration, the mechanism of other sources like CSV, MongoDB or Pipedrive will be relatively similar.

For CSV files, just simply upload the file, view table structure and make proper adjustment to the Destination Settings and Sync Configuration.

import

Among them all, Google Analytics seems to have the most different initial setup step. Following Google Analytics standard and its in-depth guide, you need to define the metrics, dimensions to construct Google Analytics queries. After setting up these dimensions and metrics, the rest will be similar to other sources (view data table preview, adjust table destination and sync configuration)

import

Consolidate your sources

After successfully importing your external sources to your Data Warehouse, you can now combine them with other data models (SQL Model or Data Table) to enrich your organization's data.

import

For example, you have an ecommerce dataset that contains 5 models: users, orders, order_items, cities, and countries, but they are not in the same source (users, orders, and order_items are retrieved from Production Database, cities is recorded in a Spreadsheet life daily and countries comes from a CSV file).

Another use case is that you can aggregate orders count to the daily level, and combine it with an Import Model that pulls in GA's daily traffic numbers to see if your marketing attempts have any significant effects.

Now, after loading them all to your Data Warehouse, you can start modeling them, creating relationships properly for the data needs of your organization.


Let us know what you think about this document :)