Skip to main content

pre-hook & post-hook

In these examples, we use the | symbol to separate two different formatting options for SQL statements in pre-hooks and post-hooks. The first option (without brackets) accepts a single SQL statement as a string, while the second (with brackets) accepts multiple SQL statements as an array of strings. Replace SQL-STATEMENT with your SQL.

dbt_project.yml

models:
<resource-path>:
+pre-hook: SQL-statement | [SQL-statement]
+post-hook: SQL-statement | [SQL-statement]

models/<model_name>.sql

{{ config(
pre_hook="SQL-statement" | ["SQL-statement"],
post_hook="SQL-statement" | ["SQL-statement"],
) }}

select ...

models/properties.yml
models:
- name: [<model_name>]
config:
pre_hook: <sql-statement> | [<sql-statement>]
post_hook: <sql-statement> | [<sql-statement>]

Definition

A SQL statement (or list of SQL statements) to be run before or after a model, seed, or snapshot is built.

Pre- and post-hooks can also call macros that return SQL statements. If your macro depends on values available only at execution time, such as using model configurations or ref() calls to other resources as inputs, you will need to wrap your macro call in an extra set of curly braces.

Why would I use hooks?

dbt aims to provide all the boilerplate SQL you need (DDL, DML, and DCL) via out-of-the-box functionality, which you can configure quickly and concisely. In some cases, there may be SQL that you want or need to run, specific to functionality in your data platform, which dbt does not (yet) offer as a built-in feature. In those cases, you can write the exact SQL you need, using dbt's compilation context, and pass it into a pre- or post- hook to run before or after your model, seed, or snapshot.

Examples

In older versions of dbt, the most common use of post-hook was to execute grant statements, to apply database permissions to models right after creating them. We recommend using the grants resource config instead, in order to automatically apply grants when your dbt model runs.

[Redshift] Unload one model to S3

model.sql
{{ config(
post_hook = "unload ('select from {{ this }}') to 's3:/bucket_name/{{ this }}"
) }}

select ...

See: Redshift docs on UNLOAD

[Apache Spark] Analyze tables after creation

dbt_project.yml

models:
jaffle_shop: # this is the project name
marts:
finance:
+post-hook:
# this can be a list
- "analyze table {{ this }} compute statistics for all columns"
# or call a macro instead
- "{{ analyze_table() }}"

See: Apache Spark docs on ANALYZE TABLE

Additional examples

We've compiled some more in-depth examples here.

Usage notes

Hooks are cumulative

If you define hooks in both your dbt_project.yml and in the config block of a model, both sets of hooks will be applied to your model.

Execution ordering

If multiple instances of any hooks are defined, dbt will run each hook using the following ordering:

  1. Hooks from dependent packages will be run before hooks in the active package.
  2. Hooks defined within the model itself will be run after hooks defined in dbt_project.yml.
  3. Hooks within a given context will be run in the order in which they are defined.

Transaction behavior

If you're using an adapter that uses transactions (namely Postgres or Redshift), it's worth noting that by default hooks are executed inside of the same transaction as your model being created.

There may be occasions where you need to run these hooks outside of a transaction, for example:

  • You want to run a VACUUM in a post-hook, however, this cannot be executed within a transaction (Redshift docs)
  • You want to insert a record into an audit tableIn simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. at the start of a run and do not want that statement rolled back if the model creation fails.

To achieve this behavior, you can use one of the following syntaxes:

  • Important note: Do not use this syntax if you are using a database where dbt does not support transactions. This includes databases like Snowflake, BigQuery, and Spark or Databricks.

Config block: use the before_begin and after_commit helper macros

models/<modelname>.sql
{{
config(
pre_hook=before_begin("SQL-statement"),
post_hook=after_commit("SQL-statement")
)
}}

select ...

0