Virtual hypertables
Hypertables are usually used to ingest time-series data. They are a high-performance version of regular Postgres tables focussed on time-based bucketting, chunking, and partitioning.
Hypertables by themselves don't make a lot of sense in dbt as you'd create them outside of dbt and then ingest data into them. With virtual hypertables, we can leverage pre-existing hypertables and use dbt to manage their configuration.
Info
Consult the Timescale docs for more information regarding hypertables.
Existing configurations
As soon as you start to manage a hypertable as a virtual hypertable with dbt-timescaledb, dbt will replace existing configurations on every run. This includes the retention policy, compression, and other settings. If you have existing configurations in place, you have to make sure to adjust the dbt configuration accordingly.
Usage
The hypertable has to pre-exist in your database. If the hypertable is not present, dbt will throw an error. Optionally, you can specify dbt's built-in schema
parameter to reference a hypertable in a different schema.
The SQL code in the dbt model does not matter and will be fully ignored. However, dbt will ignore empty models. You could just put --
in the model to make it non-empty.
{{
config(
materialized='virtual_hypertable'
)
}}
--
models:
your_project_name:
folder_containing_the_hypertables:
+materialized: virtual_hypertable
# ...
Configuration options
Dimensions are not supported for virtual hypertables as you can only set them during the creation of the hypertable.
You can use virtual hypertables to manage compression, indexes, set a reorder policy, define retention policies, or any of the options below.
integer_now_func
The following 2 options are available for (virtual) hypertables where the time column is not a timestamp:
integer_now_func
(string): name of a function to be used to generate the current time as an integer.integer_now_func_sql
(string, optional): SQL code for the function mentioned above. If provided, the function with the name set ininteger_now_func
will be created. If not provided, an error will be thrown if the function does not exist already.
Use a macro
You could also call a macro for your integer_now_func_sql
.
Idempotent
The integer_now_func_sql
is idempotent and will replace an existing function if a function with the given name already exists. So while it may cause some overhead during the dbt run, it doesn't matter if you share this config across multiple models.
The name is enough
You don't have to provide the SQL code for the function if you already have a function with the name set in integer_now_func
in your database. You could create the function once in a single model or with dbt run-operation
and then reuse it in all other models.
Info
Consult the Timescale docs for more information regarding this functionality.
{{
config(
materialized='hypertable',
main_dimension='time_column',
integer_now_func='my_hypertable_int_to_now',
integer_now_func_sql='select extract(epoch from now())::bigint'
)
}}
select 1::bigint as time_column
chunk_time_interval
The chunk_time_interval
config option allows you to set the interval at which TimescaleDB will chunk your (virtual) hypertable. This is useful for optimizing query performance and storage efficiency. The default value is 1 week
.
Note that the type of the interval depends on the type of your time column and has to match.
```sql+jinja title="models/my_hypertable.sql" {{ config( materialized='hypertable', main_dimension='time_column', chunk_time_interval="interval '1 day'" ) }}