BigQuery: Working with Date-sharded Tables

Question

I'm connecting Holistics with Firebase through BigQuery. Firebase created one table every days:

  • events_20201101
  • events_20201102
  • events_20201103
  • ...

How do I get this to work with Holistics' Data Modeling? Do I need to create multiple models or can I just create one "master model"?

Answer

Your table is actually a "date-sharded table" in BigQuery. In other words, your event table is automatically broken into multiple individual tables that hold event data of exactly one day.

At the moment, Holistics have not supported creating data model on tables with sharded structures.

What you can do is: create a Transform Model that query directly from your event_YYYYMMDD table, and aggregate the events into your desired level.

For example

#standardsql
select
user_id
, event_created_date
, count(event_1) as event_1_counts
, count(event_2) as event_2_counts
from `events_*`
where _TABLE_SUFFIX between '20200101' and '20200301'

You can read more about querying wildcard tables here.

Please note that in order to use wildcard tables in BigQuery's StandardSQL, you need to enclose the table name in the backticks like this: event_*