Hypertables

Hypertables are usually used to ingest time-series data. They are a high-performance version of regular Postgres tables focussed on time-based bucketting, chunking, and partitioning.

Hypertables make less sense as dbt models to store transformed data. However, you can still use them as such. A more useful version right now is the empty option, which will create empty hypertables.

Look into virtual hypertables

If you're looking to use dbt to configure your leverage pre-existing hypertables, check out the virtual hypertables guide.

Only run hypertable models once

dbt will always recreate your entire model. This means that all existing data in your hypertables will be lost when you run them again. If you're using hypertables for ingesting time-series data, you probably don't want this.

Info

Consult the Timescale docs for more information regarding hypertables.

Usage

To materialize a model as a hypertable, simply set its materialization in the config to hypertable. Every hypertable also requires you to set the name of the time column.

SQLYAML

models/my_hypertable.sql

{{
  config(
    materialized='hypertable',
    main_dimension='time_column'
  )
}}
select current_timestamp as time_column

dbt_project.yml

models:
  your_project_name:
    folder_containing_the_hypertables:
      +materialized: hypertable
        model_one:
          +main_dimension: time_column  # (1)!
        model_two:
          +main_dimension: time_column_in_model_two
# ...

While you can set the hypertable materialization for multiple models, you'll still have to configure the main_dimension for each model individually.

Configuration options

dbt-specific options

The following options are not taken from the TimescaleDB APIs, but are specific to this adapter.

empty_hypertable: If set to true, the hypertable will be truncated right after creation (as a regular table) and right before converting it into a hypertable. Defaults to false.

TimescaleDB hypertable options

The TimescaleDB option create_default_indexes can be set to true or false. It defaults to true.

Dimensions

Hypertables have one or more dimensions, defined upon creation of the hypertable. The main dimension of a hypertable is provided using the main_dimension configuration option. Additional dimensions can be added to the hypertable using the dimensions configuration option.

In this adapter, dimensions can be provided as a dictionary with the following options:

column_name
type: by_hash or by_range (default is by_range)
partition_interval (only for by_range)
number_partitions (only for by_hash)
partitioning_func

Since most dimensions will probably be by_range dimensions with a column name, you can also provide the name of the column as a shorthand instead of a dictionary.

Empty hypertable required

You can only add dimensions to an empty hypertable.

Info

Consult the Timescale docs for more information regarding adding dimensions or the documentation on dimension builders.

SQLYAML

models/my_hypertable.sql

{{ config(
    materialized = 'hypertable',
    main_dimension = 'time_column',
    dimensions=[
      {"column_name": "id", "type": "by_hash", "number_partitions": 5},
      {"column_name": "col_1", "type": "by_range", "partition_interval": "interval '1 day'"},
      {"column_name": "another_column", "type": "by_range"}
    ]
}}

select
  current_timestamp as time_column,
  1 as id,
  2 as col_1,
  3 as another_column

dbt_project.yml

models:
  your_project_name:
    model_name:
      +materialized: hypertable
      +main_dimension:
        column_name: time_column
        type: by_range
      # the above would be equivalent to +main_dimension: time_column
      +dimensions:
        - column_name: id
          type: by_hash
          number_partitions: 5
        - column_name: another_time_column
          type: by_range
          partition_interval: interval '1 day'
# ...

integer_now_func

The following 2 options are available for (virtual) hypertables where the time column is not a timestamp:

integer_now_func (string): name of a function to be used to generate the current time as an integer.
integer_now_func_sql (string, optional): SQL code for the function mentioned above. If provided, the function with the name set in integer_now_func will be created. If not provided, an error will be thrown if the function does not exist already.

Use a macro

You could also call a macro for your integer_now_func_sql.

Idempotent

The integer_now_func_sql is idempotent and will replace an existing function if a function with the given name already exists. So while it may cause some overhead during the dbt run, it doesn't matter if you share this config across multiple models.

The name is enough

You don't have to provide the SQL code for the function if you already have a function with the name set in integer_now_func in your database. You could create the function once in a single model or with dbt run-operation and then reuse it in all other models.

Info

Consult the Timescale docs for more information regarding this functionality.

models/my_hypertable.sql

{{
  config(
    materialized='hypertable',
    main_dimension='time_column',
    integer_now_func='my_hypertable_int_to_now',
    integer_now_func_sql='select extract(epoch from now())::bigint'
  )
}}
select 1::bigint as time_column

chunk_time_interval

The chunk_time_interval config option allows you to set the interval at which TimescaleDB will chunk your (virtual) hypertable. This is useful for optimizing query performance and storage efficiency. The default value is 1 week.

Note that the type of the interval depends on the type of your time column and has to match.

```sql+jinja title="models/my_hypertable.sql" {{ config( materialized='hypertable', main_dimension='time_column', chunk_time_interval="interval '1 day'" ) }}