pre-hook & post-hook
- Models
- Seeds
- Snapshots
In these examples, we use the |
symbol to separate two different formatting options for SQL statements in pre-hooks and post-hooks. The first option (without brackets) accepts a single SQL statement as a string, while the second (with brackets) accepts multiple SQL statements as an array of strings. Replace SQL-STATEMENT
with your SQL.
models:
<resource-path>:
+pre-hook: SQL-statement | [SQL-statement]
+post-hook: SQL-statement | [SQL-statement]
{{ config(
pre_hook="SQL-statement" | ["SQL-statement"],
post_hook="SQL-statement" | ["SQL-statement"],
) }}
select ...
In these examples, we use the |
symbol to separate two different formatting options for SQL statements in pre-hooks and post-hooks. The first option (without brackets) accepts a single SQL statement as a string, while the second (with brackets) accepts multiple SQL statements as an array of strings. Replace SQL-STATEMENT
with your SQL.
seeds:
<resource-path>:
+pre-hook: SQL-statement | [SQL-statement]
+post-hook: SQL-statement | [SQL-statement]
In these examples, we use the |
symbol to separate two different formatting options for SQL statements in pre-hooks and post-hooks. The first option (without brackets) accepts a single SQL statement as a string, while the second (with brackets) accepts multiple SQL statements as an array of strings. Replace SQL-STATEMENT
with your SQL.
snapshots:
<resource-path>:
+pre-hook: SQL-statement | [SQL-statement]
+post-hook: SQL-statement | [SQL-statement]
{% snapshot snapshot_name %}
{{ config(
pre_hook="SQL-statement" | ["SQL-statement"],
post_hook="SQL-statement" | ["SQL-statement"],
) }}
select ...
{% end_snapshot %}
Definition
A SQL statement (or list of SQL statements) to be run before or after a model, seed, or snapshot is built.
Pre- and post-hooks can also call macros that return SQL statements. If your macro depends on values available only at execution time, such as using model configurations or ref()
calls to other resources as inputs, you will need to wrap your macro call in an extra set of curly braces.
Why would I use hooks?
dbt aims to provide all the boilerplate SQL you need (DDL, DML, and DCL) via out-of-the-box functionality, which you can configure quickly and concisely. In some cases, there may be SQL that you want or need to run, specific to functionality in your data platform, which dbt does not (yet) offer as a built-in feature. In those cases, you can write the exact SQL you need, using dbt's compilation context, and pass it into a pre-
or post-
hook to run before or after your model, seed, or snapshot.
Examples
In older versions of dbt, the most common use of post-hook
was to execute grant
statements, to apply database permissions to models right after creating them. We recommend using the grants
resource config instead, in order to automatically apply grants when your dbt model runs.
[Redshift] Unload one model to S3
{{ config(
post_hook = "unload ('select from {{ this }}') to 's3:/bucket_name/{{ this }}"
) }}
select ...
[Apache Spark] Analyze tables after creation
models:
jaffle_shop: # this is the project name
marts:
finance:
+post-hook:
# this can be a list
- "analyze table {{ this }} compute statistics for all columns"
# or call a macro instead
- "{{ analyze_table() }}"
Additional examples
We've compiled some more in-depth examples here.
Usage notes
Hooks are cumulative
If you define hooks in both your dbt_project.yml
and in the config
block of a model, both sets of hooks will be applied to your model.
Execution ordering
If multiple instances of any hooks are defined, dbt will run each hook using the following ordering:
- Hooks from dependent packages will be run before hooks in the active package.
- Hooks defined within the model itself will be run after hooks defined in
dbt_project.yml
. - Hooks within a given context will be run in the order in which they are defined.
Transaction behavior
If you're using an adapter that uses transactions (namely Postgres or Redshift), it's worth noting that by default hooks are executed inside of the same transaction as your model being created.
There may be occasions where you need to run these hooks outside of a transaction, for example:
- You want to run a
VACUUM
in apost-hook
, however, this cannot be executed within a transaction (Redshift docs) - You want to insert a record into an audit tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. at the start of a run and do not want that statement rolled back if the model creation fails.
To achieve this behavior, you can use one of the following syntaxes:
- Important note: Do not use this syntax if you are using a database where dbt does not support transactions. This includes databases like Snowflake, BigQuery, and Spark or Databricks.
- Use before_begin and after_commit
- Use a dictionary
- Use dbt_project.yml
Config block: use the before_begin
and after_commit
helper macros
{{
config(
pre_hook=before_begin("SQL-statement"),
post_hook=after_commit("SQL-statement")
)
}}
select ...
Config block: use a dictionary
{{
config(
pre_hook={
"sql": "SQL-statement",
"transaction": False
},
post_hook={
"sql": "SQL-statement",
"transaction": False
}
)
}}
select ...
dbt_project.yml
: Use a dictionary
models:
+pre-hook:
sql: "SQL-statement"
transaction: false
+post-hook:
sql: "SQL-statement"
transaction: false