Columns

Note

The checks below require manifest.json to be present.

Checks related to model column definitions, types, and constraints.

Functions:

check_model_columns_have_relationship_tests: Columns matching a regex pattern must have a relationships test, optionally validating the target column and model.
check_model_columns_have_meta_keys: Columns defined for models must have the specified keys in the meta config.
check_model_columns_have_types: Columns defined for models must have a data_type declared.
check_model_has_constraints: Table and incremental models must have the specified constraint types defined.

check_model_columns_have_relationship_tests

Columns matching a regex pattern must have a relationships test, optionally validating the target column and model.

Rationale

Foreign-key columns that are never validated with a relationships test can silently contain orphaned IDs, leading to incorrect join results and data quality issues that are hard to trace. This check ensures that columns following a naming convention (e.g. _fk) are always backed by a referential integrity test.

Parameters:

column_name_pattern (str): Regex pattern to match column names that require a relationships test. Required.
target_column_pattern (str | None): Regex pattern the target column (field) of the relationships test must match. If not provided, any target column is accepted. Default: None.
target_model_pattern (str | None): Regex pattern the target model of the relationships test must match. If not provided, any target model is accepted. Default: None.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal[ephemeral, incremental, table, view] | None): Limit check to models with the specified materialization.
severity (Literal[error, warn] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_columns_have_relationship_tests
      column_name_pattern: "_fk$"

manifest_checks:
    - name: check_model_columns_have_relationship_tests
      column_name_pattern: "_fk$"
      target_column_pattern: "_pk$"
      target_model_pattern: "^dim_|^fact_"
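
The check's source applies these patterns with `re.search`, so a pattern matches anywhere in the name unless it is anchored. The sketch below shows how the two example patterns select names. The column and model names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical column names, for illustration only.
columns = ["customer_fk", "order_id", "fk_region", "product_fk"]

# re.search looks for the pattern anywhere in the string, so the trailing "$"
# is what restricts the match to names that end in "_fk".
needs_test = [c for c in columns if re.search(r"_fk$", c)]

# Hypothetical target model names, checked against the second example's pattern.
targets = ["dim_customer", "fact_orders", "stg_customers"]
allowed_targets = [t for t in targets if re.search(r"^dim_|^fact_", t)]

print(needs_test)
print(allowed_targets)
```

Here `fk_region` is not selected because `_fk` does not appear at the end of the name, and `stg_customers` is rejected because it starts with neither `dim_` nor `fact_`.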

Source code in src/dbt_bouncer/checks/manifest/models/columns.py
@check
def check_model_columns_have_relationship_tests(
    model,
    ctx,
    *,
    column_name_pattern: str,
    target_column_pattern: str | None = None,
    target_model_pattern: str | None = None,
):
    """Columns matching a regex pattern must have a `relationships` test, optionally validating the target column and model.

    !!! info "Rationale"

        Foreign-key columns that are never validated with a `relationships` test can silently contain orphaned IDs, leading to incorrect join results and data quality issues that are hard to trace. This check ensures that columns following a naming convention (e.g. `_fk`) are always backed by a referential integrity test.

    Parameters:
        column_name_pattern (str): Regex pattern to match column names that require a relationships test.
        target_column_pattern (str | None): Regex pattern the target column (`field`) of the relationships test must match. If not provided, any target column is accepted.
        target_model_pattern (str | None): Regex pattern the target model of the relationships test must match. If not provided, any target model is accepted.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_columns_have_relationship_tests
              column_name_pattern: "_fk$"
        ```
        ```yaml
        manifest_checks:
            - name: check_model_columns_have_relationship_tests
              column_name_pattern: "_fk$"
              target_column_pattern: "_pk$"
              target_model_pattern: "^dim_|^fact_"
        ```

    """
    columns = model.columns or {}
    failing_columns: dict[str, str] = {}

    # Find all relationships tests attached to this model
    relationship_tests = []
    for test in ctx.tests:
        test_metadata = getattr(test, "test_metadata", None)
        attached_node = getattr(test, "attached_node", None)
        if (
            test_metadata
            and attached_node == model.unique_id
            and getattr(test_metadata, "name", "") == "relationships"
        ):
            relationship_tests.append(test_metadata)

    for col_name in columns:
        if not re.search(column_name_pattern, col_name):
            continue

        # Find a relationships test for this column
        matching_test = None
        for test_meta in relationship_tests:
            kwargs = getattr(test_meta, "kwargs", {}) or {}
            if isinstance(kwargs, dict):
                test_col = kwargs.get("column_name", "")
            else:
                test_col = getattr(kwargs, "column_name", "")
            if test_col == col_name:
                matching_test = test_meta
                break

        if matching_test is None:
            failing_columns[col_name] = "no relationships test found"
            continue

        kwargs = getattr(matching_test, "kwargs", {}) or {}
        if isinstance(kwargs, dict):
            target_field = kwargs.get("field", "")
            target_to = kwargs.get("to", "")
        else:
            target_field = getattr(kwargs, "field", "")
            target_to = getattr(kwargs, "to", "")

        if target_column_pattern and not re.search(target_column_pattern, target_field):
            failing_columns[col_name] = (
                f'target column "{target_field}" does not match pattern "{target_column_pattern}"'
            )
            continue

        if target_model_pattern:
            # Extract the model name from a single-argument ref('model_name'); any other
            # value (e.g. source('source', 'table')) is compared as the raw string
            ref_match = re.search(r"ref\(['\"](\w+)['\"]\)", target_to)
            target_model_name = ref_match.group(1) if ref_match else target_to
            if not re.search(target_model_pattern, target_model_name):
                failing_columns[col_name] = (
                    f'target model "{target_model_name}" does not match pattern "{target_model_pattern}"'
                )

    if failing_columns:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` has columns missing required `relationships` tests: {failing_columns}"
        )
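
The target-model comparison in the source above depends on a single regular expression. The sketch below shows how that expression behaves on a few `to` values, assuming they arrive as strings shaped like the ones shown. The helper name `extract_target_model` is hypothetical and not part of dbt-bouncer.

```python
import re

# Same regular expression as in the source listing above.
REF_PATTERN = r"ref\(['\"](\w+)['\"]\)"


def extract_target_model(target_to: str) -> str:
    """Return the model name from a single-argument ref(), else the raw string."""
    ref_match = re.search(REF_PATTERN, target_to)
    return ref_match.group(1) if ref_match else target_to


simple_ref = extract_target_model("ref('dim_customer')")
source_call = extract_target_model("source('raw', 'customers')")
two_arg_ref = extract_target_model("ref('my_package', 'dim_customer')")

print(simple_ref)
print(source_call)
print(two_arg_ref)
```

Only the single-argument `ref` is reduced to a bare model name. The `source(...)` call and the two-argument `ref` come back unchanged, so an anchored `target_model_pattern` such as `^dim_|^fact_` would be tested against the whole expression in those cases.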

check_model_columns_have_meta_keys

Columns defined for models must have the specified keys in the meta config.

Rationale

Column-level metadata such as owner or pii flags is essential for data governance, access control, and cataloguing. Without enforcement, metadata is applied inconsistently, making it difficult to identify sensitive columns or assign accountability across a large project.

Parameters:

keys (NestedDict): A list (that may contain sub-lists) of required keys. Required.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal[ephemeral, incremental, table, view] | None): Limit check to models with the specified materialization.
severity (Literal[error, warn] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_columns_have_meta_keys
      keys:
        - owner
        - pii

Source code in src/dbt_bouncer/checks/manifest/models/columns.py
@check
def check_model_columns_have_meta_keys(model, *, keys: NestedDict):
    """Columns defined for models must have the specified keys in the `meta` config.

    !!! info "Rationale"

        Column-level metadata such as `owner` or `pii` flags is essential for data governance, access control, and cataloguing. Without enforcement, metadata is applied inconsistently, making it difficult to identify sensitive columns or assign accountability across a large project.

    Parameters:
        keys (NestedDict): A list (that may contain sub-lists) of required keys.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_columns_have_meta_keys
              keys:
                - owner
                - pii
        ```

    """
    columns = model.columns or {}
    failing_columns: dict[str, list[str]] = {}
    for col_name, col in columns.items():
        missing_keys = find_missing_meta_keys(
            meta_config=col.meta or {}, required_keys=keys.model_dump()
        )
        if missing_keys:
            failing_columns[col_name] = [k.replace(">>", "") for k in missing_keys]
    if failing_columns:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` has columns missing required `meta` keys: {failing_columns}"
        )
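
The listing above delegates the key lookup to `find_missing_meta_keys`, whose implementation is not shown on this page. The sketch below is a hypothetical stand-in that illustrates the general idea of collecting missing keys per column. The function `missing_meta_keys` and the nested-key shape it accepts (a single-key dict mapping a parent key to a list of child keys) are assumptions made for illustration and may differ from dbt-bouncer's actual `NestedDict` handling.

```python
def missing_meta_keys(meta: dict, required: list) -> list[str]:
    """Hypothetical stand-in: return the required keys that are absent from `meta`.

    `required` holds plain strings, or single-key dicts that map a parent key
    to a list of required child keys (an assumed shape, for illustration).
    """
    missing = []
    for item in required:
        if isinstance(item, dict):
            for parent, children in item.items():
                child_meta = meta.get(parent)
                if not isinstance(child_meta, dict):
                    # The parent itself is absent or not a mapping.
                    missing.append(parent)
                    continue
                missing.extend(
                    f"{parent}.{k}" for k in missing_meta_keys(child_meta, children)
                )
        elif item not in meta:
            missing.append(item)
    return missing


# Hypothetical column meta, keyed by column name.
columns = {
    "customer_id": {"owner": "data-team", "pii": False},
    "email": {"owner": "data-team"},
}

failing = {}
for name, meta in columns.items():
    missing = missing_meta_keys(meta, ["owner", "pii"])
    if missing:
        failing[name] = missing

print(failing)
```

As in the check itself, only columns with at least one missing key end up in the failure report. Note that a key whose value is `False` still counts as present.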

check_model_columns_have_types

Columns defined for models must have a data_type declared.

Rationale

Declaring column data types is a prerequisite for enforced dbt contracts and enables downstream consumers to understand the expected format of each field without querying the warehouse. It also prevents type-mismatch errors in tools that consume the schema at build time.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal[ephemeral, incremental, table, view] | None): Limit check to models with the specified materialization.
severity (Literal[error, warn] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_columns_have_types
      include: ^models/marts

Source code in src/dbt_bouncer/checks/manifest/models/columns.py
@check
def check_model_columns_have_types(model):
    """Columns defined for models must have a `data_type` declared.

    !!! info "Rationale"

        Declaring column data types is a prerequisite for enforced dbt contracts and enables downstream consumers to understand the expected format of each field without querying the warehouse. It also prevents type-mismatch errors in tools that consume the schema at build time.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_columns_have_types
              include: ^models/marts
        ```

    """
    columns = model.columns or {}
    untyped_columns = [
        col_name for col_name, col in columns.items() if not col.data_type
    ]
    if untyped_columns:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` has columns without a declared `data_type`: {untyped_columns}"
        )

check_model_has_constraints

Table and incremental models must have the specified constraint types defined.

Rationale

Database constraints such as primary_key and not_null enforce data integrity at the warehouse level, providing a safety net that goes beyond dbt tests. Requiring them on materialised models ensures that quality guarantees survive even when dbt tests are skipped or not run on every refresh.

Parameters:

required_constraint_types (list[Literal[check, custom, foreign_key, not_null, primary_key, unique]]): List of constraint types that must be present on the model. Required.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
severity (Literal[error, warn] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_has_constraints
      required_constraint_types:
        - primary_key
      include: ^models/marts

Source code in src/dbt_bouncer/checks/manifest/models/columns.py
@check
def check_model_has_constraints(model, *, required_constraint_types: list[str]):
    """Table and incremental models must have the specified constraint types defined.

    !!! info "Rationale"

        Database constraints such as `primary_key` and `not_null` enforce data integrity at the warehouse level, providing a safety net that goes beyond dbt tests. Requiring them on materialised models ensures that quality guarantees survive even when dbt tests are skipped or not run on every refresh.

    Parameters:
        required_constraint_types (list[Literal["check", "custom", "foreign_key", "not_null", "primary_key", "unique"]]): List of constraint types that must be present on the model.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_has_constraints
              required_constraint_types:
                - primary_key
              include: ^models/marts
        ```

    """
    materialization = (
        model.config.materialized
        if model.config and hasattr(model.config, "materialized")
        else None
    )
    if materialization not in (Materialization.TABLE, Materialization.INCREMENTAL):
        return
    constraints = model.constraints or []
    actual_types: set[str] = set()
    for c in constraints:
        c_type = getattr(c, "type")  # noqa: B009 - avoids ty shadowing of builtin `type`
        actual_types.add(c_type.value if hasattr(c_type, "value") else str(c_type))
    missing_types = sorted(set(required_constraint_types) - actual_types)
    if missing_types:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` is missing required constraint types: {missing_types}"
        )
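
The core of the listing above is a set difference between the required constraint types and the types declared on the model. The sketch below isolates that step. The `ConstraintType` enum and the `normalize` helper are hypothetical names that mirror the `c_type.value if hasattr(c_type, "value") else str(c_type)` expression in the source.

```python
from enum import Enum


class ConstraintType(Enum):
    """Hypothetical enum standing in for the manifest's constraint type."""

    primary_key = "primary_key"
    not_null = "not_null"


def normalize(c_type) -> str:
    # Enum members expose .value; plain strings fall through to str().
    return c_type.value if hasattr(c_type, "value") else str(c_type)


# Declared constraint types, mixing an enum member and a plain string.
declared = [ConstraintType.primary_key, "not_null"]
actual_types = {normalize(c) for c in declared}

required_constraint_types = ["primary_key", "unique", "check"]
missing_types = sorted(set(required_constraint_types) - actual_types)

print(missing_types)
```

Sorting the difference keeps the failure message stable between runs. As far as the listing shows, only `model.constraints` is inspected, and models that are not materialized as `table` or `incremental` return early without being checked.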