Frequently Asked Questions
Can other tools perform the same checks as dbt-bouncer?
There are several other tools that perform tasks similar to dbt-bouncer.
- dbt-checkpoint: A collection of pre-commit hooks for dbt projects. Tests are written in Python. Configuration is performed via .pre-commit-config.yaml. Provided the dbt artifacts have already been generated, dbt-checkpoint does not need access to the underlying database. The hooks execute when a new commit is made; as such, dbt-checkpoint is designed to be run only as part of pre-commit.
- dbt-project-evaluator: This is a dbt package from dbt Labs. Tests are written in .sql files using a combination of Jinja and SQL. Configuration is performed via dbt_project.yml and seed files (i.e. csv files). Requires a connection to the underlying database. Designed to be run both in a CI pipeline and during active development.
- dbt-score: This is a Python package installable via pip. A collection of tests that apply only to dbt models. Tests can be executed from the command line. Tests are written in Python. Configuration is performed via a pyproject.toml file. Provided the dbt artifacts have already been generated, dbt-score does not need access to the underlying database. Designed to be run during development.
While the above tools inhabit the same space as dbt-bouncer, they do not provide what we consider to be the optimum experience, which dbt-bouncer offers:
- Designed to run both locally and in a CI pipeline.
- Configurable via a file format, YML, that dbt developers are already familiar with.
- Does not require database access.
- Can run tests against any of dbt's artifacts.
- Allows tests to be written in Python.
As such we consider dbt-bouncer to be the best tool to enforce conventions in a dbt project.
Tip
dbt-bouncer can perform all the tests currently included in dbt-checkpoint, dbt-project-evaluator and dbt-score. If you see an existing test that is not possible with dbt-bouncer, open an issue and we'll add it!
Does dbt-bouncer work with dbt Cloud?
Yes! As dbt-bouncer runs on the artifacts generated by dbt, it can be used with dbt Cloud as long as the artifacts generated by the CI job in dbt Cloud are available.
For GitHub this can be achieved using the pgoslatara/dbt-cloud-download-artifacts-action action:
name: CI pipeline
on:
  pull_request:
    branches:
      - main
jobs:
  download-artifacts:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Download dbt artifacts
        uses: pgoslatara/dbt-cloud-download-artifacts-action@v1
        with:
          commit-sha: ${{ github.event.pull_request.head.sha }}
          dbt-cloud-api-token: ${{ secrets.DBT_CLOUD_API_TOKEN }}
      - name: Run dbt-bouncer
        uses: godatadriven/dbt-bouncer@vX.X
Warning
dbt Cloud now supports a "versionless" option, which allows dbt projects to be run with the latest version of dbt. One effect of choosing this option is that dbt artifacts may receive non-breaking changes (source); these may or may not be compatible with dbt-bouncer. If you encounter a bug as a result of this, please open an issue and we'll investigate.
How to configure dbt-bouncer for use in a CI pipeline?
dbt-bouncer is designed to be used primarily in a CI pipeline such as GitHub Actions or Azure DevOps. To do this, we create a config file such as:
catalog_checks:
  - name: check_column_description_populated
    include: ^models/marts
manifest_checks:
  - name: check_model_directories
    include: ^models
    permitted_sub_directories:
      - intermediate
      - marts
      - staging
      - utilities
run_results_checks:
  - name: check_run_results_max_execution_time
    max_execution_time_seconds: 10
The goal of a CI pipeline is to test the changes in a pull request but also to provide feedback to the developer as quickly as possible without incurring unnecessary costs (time, financial, compute, etc.). To achieve this we can combine several features of dbt and dbt-bouncer:
- By running dbt parse, dbt can generate a manifest.json without a database connection. We can then run our manifest checks via:
- dbt requires models to be materialised before it can generate a catalog.json file. By running dbt run --empty we can materialise every model without processing any data. Once these materialisations are performed we can run our catalog checks via:
- Typically a CI pipeline will run a dbt build command with flags such as --state and/or --defer. After this command has completed we can run our run results checks via:
By using this approach and adapting it to your own constraints and requirements, dbt-bouncer can be used efficiently as part of your CI pipeline.
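The three stages described above could be wired into a GitHub Actions job roughly as follows. This is a sketch under assumptions, not a reference workflow: it assumes dbt and dbt-bouncer are already installed on the runner, that the config file is named dbt-bouncer.yml, that the dbt-bouncer CLI accepts an --only option to restrict a run to one check category, and that catalog.json is produced by dbt docs generate. The step names and the <PATH_TO_PROD_ARTIFACTS> placeholder are hypothetical.

```yaml
# Sketch only: step names, file names and placeholders are illustrative assumptions.
steps:
  # Stage 1: no database connection required.
  - name: Manifest checks
    run: |
      dbt parse
      dbt-bouncer --config-file dbt-bouncer.yml --only manifest_checks

  # Stage 2: materialise every model without processing data, then build the catalog.
  - name: Catalog checks
    run: |
      dbt run --empty
      dbt docs generate
      dbt-bouncer --config-file dbt-bouncer.yml --only catalog_checks

  # Stage 3: the regular CI build, followed by checks on run_results.json.
  - name: Run results checks
    run: |
      dbt build --select state:modified+ --defer --state <PATH_TO_PROD_ARTIFACTS>
      dbt-bouncer --config-file dbt-bouncer.yml --only run_results_checks
```

Ordering the stages from cheapest to most expensive means a convention failure can surface before any significant warehouse compute is spent.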
How to set up dbt-bouncer in a monorepo?
A monorepo may consist of one directory with a dbt project and other directories with unrelated code. You may want to configure dbt-bouncer from the root directory. Sample directory tree:
.
├── dbt-bouncer.yml
├── README.md
├── dbt-project
│   ├── models
│   ├── dbt_project.yml
│   └── profiles.yml
└── package-a
    ├── src
    ├── tests
    └── package.json
To ease configuration you can use exclude or include at the global level (see Config File for more details). For the above example dbt-bouncer.yml could be configured as:
dbt_artifacts_dir: dbt-project/target
include: ^dbt-project
manifest_checks:
  - name: check_exposure_based_on_non_public_models
dbt-bouncer can now be run from the root directory.
How to set up dbt-bouncer in a dbt Mesh?
A dbt Mesh is a collection of dbt projects in an organisation, some of which can read models from other dbt projects. Natively supported by dbt Cloud, a dbt Mesh can also be set up with dbt Core using a plugin such as dbt-loom.
One challenge in a dbt Mesh is that the large number of developers working across multiple dbt projects leads to differing conventions being implemented. There are multiple approaches to using dbt-bouncer in a dbt Mesh; two are outlined below.
Approach 1: Individual dbt-bouncer.yml configuration file
Each dbt project can have its own dbt-bouncer.yml configuration file. This allows each project to adopt and implement its own conventions in addition to any conventions to be shared across all dbt projects. Should a breaking change be required to the config file then each dbt project can be updated independently at a time that makes sense.
This is the recommended approach due to its simplicity and ability to update each dbt project independently.
Approach 2: Centralised dbt-bouncer.yml configuration file shared via git submodule
Warning
With this approach, a change to the centralised dbt-bouncer.yml file may result in CI pipelines in dbt projects failing despite no changes being made to these projects. As such we recommend implementing this approach only after extensive discussion with all dbt project developers so that all dbt projects can be brought into line before dbt-bouncer is enforced in the CI pipeline.
Should it be necessary for a breaking change to be made to the centralised dbt-bouncer.yml configuration file, we recommend setting the severity of the relevant check to warn so that CI pipelines in dbt projects will not fail and maintainers have sufficient time to make the necessary changes.
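For example, a check in the centralised file could be downgraded as shown here. This is a sketch that assumes severity is set per check through a severity key; the check itself is reused from the CI pipeline example purely for illustration.

```yaml
manifest_checks:
  - name: check_model_directories
    include: ^models
    permitted_sub_directories:
      - intermediate
      - marts
      - staging
      - utilities
    severity: warn  # Assumed key name: report failures without failing the CI pipeline
```

Once every dbt project complies, the severity line can be removed so that the check is enforced again.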
Git submodules allow the contents of one repository to be accessible from a different repository. Such a setup for dbt-bouncer can be achieved as follows (this example uses GitHub; similar setups can be achieved with other providers):
- Set up a dedicated repository to store a centralised dbt-bouncer.yml configuration file that will be used by all dbt projects. Let's call this repository dbt-bouncer-config.
- The contents of the dbt-bouncer.yml file in dbt-bouncer-config should contain the following configuration for dbt_artifacts_dir:
- In every repository add a git submodule via:
- Run dbt-bouncer:
Your directory tree should look like this:
.
├── dbt-bouncer-config
│   └── dbt-bouncer.yml
├── dbt_project.yml
├── macros
│   └── ...
├── models
│   └── ...
├── profiles.yml
├── README.md
└── target
    ├── catalog.json
    ├── manifest.json
    └── run_results.json
Note: if you update your central dbt-bouncer.yml file, you will need to run git submodule update --remote in every repository to update the submodule.
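In a GitHub Actions pipeline the submodule also has to be fetched before dbt-bouncer runs. A minimal sketch, assuming the directory tree above, that dbt-bouncer is already installed on the runner and that the dbt artifacts were generated in an earlier step; the step names are hypothetical:

```yaml
steps:
  - name: Checkout (including the dbt-bouncer-config submodule)
    uses: actions/checkout@v4
    with:
      submodules: true  # Also check out git submodules

  - name: Run dbt-bouncer with the centralised config
    run: dbt-bouncer --config-file dbt-bouncer-config/dbt-bouncer.yml
```

If dbt-bouncer-config is a private repository, the checkout step will additionally need credentials that can read it.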
How to set up dbt-bouncer with pre-commit?
You can use the official pre-commit hook in your .pre-commit-config.yaml file:
repos:
  - repo: https://github.com/godatadriven/dbt-bouncer
    rev: v1.19.0 # Check https://github.com/godatadriven/dbt-bouncer/releases for latest version
    hooks:
      - id: dbt-bouncer
        args: ["--config-file", "<PATH_TO_CONFIG_FILE>"] # Optional
Alternatively, you can use a local hook to automatically run dbt-bouncer before your commits are added to the git tree.
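A local hook could be defined along these lines. This is a sketch, assuming dbt-bouncer is installed in the environment that pre-commit runs in; the hook name is illustrative.

```yaml
repos:
  - repo: local
    hooks:
      - id: dbt-bouncer
        name: dbt-bouncer
        entry: dbt-bouncer --config-file <PATH_TO_CONFIG_FILE>
        language: system  # Use the dbt-bouncer executable already on PATH
        pass_filenames: false  # dbt-bouncer reads dbt artifacts, not the staged files
        always_run: true  # Run on every commit, regardless of which files changed
```

Because the hook reads the artifacts in the target directory, those artifacts need to be up to date (for example via dbt parse) for the result to reflect the changes being committed.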