Skip to main content

Enable Canal Engine

Knowledge Checkpoint

To learn about Holistics Canal, please head to this doc.

Supported databases

DatabaseStatusSupported versionConnection poolingParallel download
PostgreSQL✅ StablePostgreSQL >= 12 (not end-of-life)✅ Yes❌ No
Redshift✅ Stable✅ Yes❌ No
BigQuery✅ Stable✅ Yes❌ No
Snowflake✅ Stable✅ Yes✅ Yes
MySQL✅ StableMySQL 8.0, 8.4, 9.1 (not end-of-life)✅ Yes❌ No
Amazon Athena✅ Stable✅ Yes❌ No
Microsoft SQL Server✅ StableSQL Server 2016 or later (not end-of-life)✅ Yes❌ No
ClickHouse🆕 BetaSee Supported Versions✅ Yes❌ No
MotherDuck🆕 Beta✅ Yes❌ No
Databricks🆕 Beta✅ Yes✅ Yes
Presto🚧 To be supported

Connection pooling means Holistics will try to re-use the same connection for multiple queries. Learn more

Parallel download allows some databases to download query results in multiple parts simultaneously. This typically only makes a difference when your result set has more than 5,000 rows.

How to enable Holistics Canal

Enabled by default

Holistics Canal is enabled by default on stable databases.
Check out Supported Databases to learn more about database support status.

In your Data Source settings, open Advanced Holistics Settings, check Enable Holistics Canal (and then Save it):

performance-enable-holistics-canal

If you find any issues (typically with "Canal" in the error message), please reach out to Holistics Support.
In case it's urgent, please

  1. Make sure to screenshot the error first
  2. Disable Canal on your Data Source
  3. Report the error (with screenshot) to Holistics Support

Performance expectations

Canal typically speeds up Steps 4 and 5 of Holistics Report/Query Execution.

  • It reduces the time between when the query is finished and the time the result is delivered to the end-users. In other words, it reduces Holistics latency in query executions.
  • It does not make the Data Warehouse execute a query faster.
    • To optimize queries, please refer to this doc.

Below is an example comparison taken from our Community Post. Note that the query itself takes < 300ms to run in the database.

performance-canal-demo

Database-specific notes

PostgreSQL and Redshift

  • Holistics Canal might still work with PostgreSQL < 12, but it is not guaranteed because PostgreSQL versions below 12 are end-of-life (ref).
  • See Number precision below.

BigQuery

MySQL

  • Holistics Canal might still work with older versions like MySQL 5.7, but it is not guaranteed because those versions are end-of-life (ref).
    • For example, MySQL 5.7 may typically have TLS/SSL issues.
  • See Number precision below.

SQL Server

  • Holistics Canal might still work with older versions like SQL Server 2014, but it is not guaranteed because those versions are end-of-life (ref)

ClickHouse

  • Holistics Canal connects to ClickHouse using its native protocol, which is accessible at port 9000 (without SSL/TLS) or port 9440 (with SSL/TLS) by default.
  • Holistics Canal does not support using FORMAT clause. Using FORMAT clause in your queries would result in errors.

MotherDuck

  • MotherDuck Data Sources always use Canal.
  • Unsupported data types: UNION, BIT, VARINT, UHUGEINT
  • See Number precision below.

Databricks

  • Holistics uses Databricks Cloud Fetch to enable high-bandwidth data transfer (aka. the Parallel Download feature).

Other databases

If you are using another database that is supported by Holistics but not yet supported by Holistics Canal, please let us know via this form.

Data type handling

Canal aims to preserve data precision while keeping it readable and efficient. Here are some special considerations:

Number precision

  • Floats: Canal supports upto 64-bit floating-point numbers.
  • Decimals: For the "decimal" or "numeric" data types that have a fixed/arbitrary precision (non-floating precision), Canal supports upto decimal(38, 9).
    • For PostgreSQL and Redshift:
      • This means that a decimal number can only have upto 38 digits, including 29 whole digits and 9 fractional digits.
        • Numbers that have more than 29 whole digits will raise an error in Canal.
        • Numbers that have more than 9 fractional digits will be rounded to 9 fractional digits.
    • For BigQuery:
      • BIGDECIMAL and BIGNUMERIC data are cached and post-processed as double (aka. float64), meaning it can still be computed on your Data Warehouse and displayed to end-users, but with floating-point precision instead of fixed precision.
    • For MySQL and ClickHouse:
      • Decimal/Numeric data with precision > 38 is cached and post-processed as double (aka. float64), meaning it can still be computed on your Data Warehouse and displayed to end-users, but with floating-point precision instead of fixed precision.
    • Workarounds for decimals that are larger than decimal(38, 9):
      • Truncate/round to make them smaller: For example, instead of processing in meters, convert your data into kilometers. Canal only operates on the result of your data, so you can still perform big computations first, then convert the result to smaller numbers later before returning the query result.
      • Convert into text -> cannot apply numeric calculations but can still display them.
      • Convert into double (i.e. 64-bit float) -> lose fixed precision but can still display and apply numeric calculations with floating-point precision.
  • For MotherDuck:
    • HUGEINT data is cached and post-processed as text, meaning it can still be computed on your Data Warehouse and displayed to end-users, but Holistics won't be able to apply visual-level calculations to it.

Time precision

Holistics only processes Time and Timestamp/Datetime values up to microsecond precision. Values that have higher precisions will be rounded to microsecond precision.

We believe that microsecond precision is adequately precise and efficient for most analytical use cases. But please let us know if it does not suit your needs.


Open Markdown
Let us know what you think about this document :)