Skip to main content

BigQuery: Working with Date-sharded Tables

Question

I'm connecting Holistics with Firebase through BigQuery. Firebase creates one table every day:

  • events_20201101
  • events_20201102
  • events_20201103
  • ...

How do I get this to work with Holistics' Data Modeling? Do I need to create multiple models or can I just create one "master model"?

Answer

Your table is actually a "date-sharded table" in BigQuery. In other words, your event table is automatically broken into multiple individual tables that hold event data of exactly one day.

At the moment, Holistics does not support creating data models on tables with sharded structures.

What you can do is: create a Query Model that query directly from your event_YYYYMMDD table, and aggregate the events into your desired level.

For example

#standardsql
select
user_id
, event_created_date
, count(event_1) as event_1_counts
, count(event_2) as event_2_counts
from `events_*`
where _TABLE_SUFFIX between '20200101' and '20200301'

You can read more about querying wildcard tables here.

Please note that in order to use wildcard tables in BigQuery's StandardSQL, you need to enclose the table name in the backticks like this: event_*


Let us know what you think about this document :)