Virtual hypertables

Hypertables are usually used to ingest time-series data. They are a high-performance version of regular Postgres tables focussed on time-based bucketting, chunking, and partitioning.

Hypertables by themselves don't make a lot of sense in dbt as you'd create them outside of dbt and then ingest data into them. With virtual hypertables, we can leverage pre-existing hypertables and use dbt to manage their configuration.

Info

Consult the Timescale docs for more information regarding hypertables.

Existing configurations

As soon as you start to manage a hypertable as a virtual hypertable with dbt-timescaledb, dbt will replace existing configurations on every run. This includes the retention policy, compression, and other settings. If you have existing configurations in place, you have to make sure to adjust the dbt configuration accordingly.

Usage

The hypertable has to pre-exist in your database. If the hypertable is not present, dbt will throw an error. Optionally, you can specify dbt's built-in schema parameter to reference a hypertable in a different schema.

The SQL code in the dbt model does not matter and will be fully ignored. However, dbt will ignore empty models. You could just put -- in the model to make it non-empty.

SQLYAML

models/existing_hypertable.sql

{{
  config(
    materialized='virtual_hypertable'
  )
}}
--

dbt_project.yml

models:
  your_project_name:
    folder_containing_the_hypertables:
      +materialized: virtual_hypertable
# ...

Configuration options

Dimensions are not supported for virtual hypertables as you can only set them during the creation of the hypertable.

You can use virtual hypertables to manage compression, indexes, set a reorder policy, define retention policies, or any of the options below.

integer_now_func

The following 2 options are available for (virtual) hypertables where the time column is not a timestamp:

integer_now_func (string): name of a function to be used to generate the current time as an integer.
integer_now_func_sql (string, optional): SQL code for the function mentioned above. If provided, the function with the name set in integer_now_func will be created. If not provided, an error will be thrown if the function does not exist already.

Use a macro

You could also call a macro for your integer_now_func_sql.

Idempotent

The integer_now_func_sql is idempotent and will replace an existing function if a function with the given name already exists. So while it may cause some overhead during the dbt run, it doesn't matter if you share this config across multiple models.

The name is enough

You don't have to provide the SQL code for the function if you already have a function with the name set in integer_now_func in your database. You could create the function once in a single model or with dbt run-operation and then reuse it in all other models.

Info

Consult the Timescale docs for more information regarding this functionality.

models/my_hypertable.sql

{{
  config(
    materialized='hypertable',
    main_dimension='time_column',
    integer_now_func='my_hypertable_int_to_now',
    integer_now_func_sql='select extract(epoch from now())::bigint'
  )
}}
select 1::bigint as time_column

chunk_time_interval

The chunk_time_interval config option allows you to set the interval at which TimescaleDB will chunk your (virtual) hypertable. This is useful for optimizing query performance and storage efficiency. The default value is 1 week.

Note that the type of the interval depends on the type of your time column and has to match.

```sql+jinja title="models/my_hypertable.sql" {{ config( materialized='hypertable', main_dimension='time_column', chunk_time_interval="interval '1 day'" ) }}