Directories

Note

The below checks require manifest.json to be present.

Checks related to model file locations, names, and directory structure.

Functions:

check_model_directories: Only specified sub-directories are permitted.
check_model_file_name: Models must have a file name that matches the supplied regex.
check_model_property_file_location: Model properties files must follow the guidance provided by dbt (https://docs.getdbt.com/best-practices/how-we-structure/1-guide-overview).
check_model_schema_name: Models must have a schema name that matches the supplied regex.

check_model_directories

Only specified sub-directories are permitted.

Parameters:

include (str, required): Regex pattern matching the directory to check.
permitted_sub_directories (list[str], required): List of permitted sub-directories.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
- name: check_model_directories
  include: models
  permitted_sub_directories:
    - intermediate
    - marts
    - staging
# Restrict sub-directories within `./models/staging`
- name: check_model_directories
  include: ^models/staging
  permitted_sub_directories:
    - crm
    - payments

Source code in src/dbt_bouncer/checks/manifest/models/directories.py
@check
def check_model_directories(
    model, *, include: str, permitted_sub_directories: list[str]
):
    """Only specified sub-directories are permitted.

    Parameters:
        include (str): Regex pattern matching the directory to check.
        permitted_sub_directories (list[str]): List of permitted sub-directories.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
        - name: check_model_directories
          include: models
          permitted_sub_directories:
            - intermediate
            - marts
            - staging
        ```
        ```yaml
        # Restrict sub-directories within `./models/staging`
        - name: check_model_directories
          include: ^models/staging
          permitted_sub_directories:
            - crm
            - payments
        ```

    """
    compiled_include = compile_pattern(include.strip().rstrip("/"))
    clean_path = clean_path_str(model.original_file_path)
    matched_path = compiled_include.match(clean_path)
    if matched_path is None:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` is not located in a path that matches the `include` pattern (`{include}`)."
        )
    path_after_match = clean_path[matched_path.end() + 1 :]
    directory_to_check = Path(path_after_match).parts[0]

    if directory_to_check.replace(".sql", "") == model.name:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` is not located in a valid sub-directory ({permitted_sub_directories})."
        )
    elif directory_to_check not in permitted_sub_directories:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` is located in the `{directory_to_check}` sub-directory, this is not a valid sub-directory ({permitted_sub_directories})."
        )
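
The path handling above can be traced with the standard library alone. The following is a minimal sketch, not the library's implementation: it assumes `compile_pattern` behaves like `re.compile` and that model paths use forward slashes. The names `first_sub_directory` and `is_permitted` are hypothetical and exist only for this illustration.

```python
from __future__ import annotations

import re
from pathlib import PurePosixPath


def first_sub_directory(include: str, model_path: str) -> str | None:
    """Return the first path component after the part matched by `include`.

    Returns None when `include` does not match the start of `model_path`
    or when nothing follows the matched part.
    """
    # Assumption: plain `re.compile` stands in for the library's helper.
    pattern = re.compile(include.strip().rstrip("/"))
    matched = pattern.match(model_path)
    if matched is None:
        return None
    # Skip the "/" that separates the matched directory from the remainder.
    remainder = model_path[matched.end() + 1 :]
    parts = PurePosixPath(remainder).parts
    return parts[0] if parts else None


def is_permitted(include: str, model_path: str, permitted: list[str]) -> bool:
    """True when the model sits in one of the permitted sub-directories."""
    return first_sub_directory(include, model_path) in permitted


layers = ["intermediate", "marts", "staging"]
print(first_sub_directory("models", "models/staging/crm/stg_customers.sql"))
print(first_sub_directory("^models/staging", "models/staging/crm/stg_customers.sql"))
print(is_permitted("models", "models/orphan.sql", layers))
```

A model stored directly in the matched directory yields its own file name (here `orphan.sql`) as the first component, which is why the check compares that component against the model name before testing it against the permitted list.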

check_model_file_name

Models must have a file name that matches the supplied regex.

Parameters:

file_name_pattern (str, required): Regexp the file name must match. Please account for the .sql extension.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_file_name
      description: Marts must include the model version in their file name.
      include: ^models/marts
      file_name_pattern: .*(v[0-9])\.sql$

Source code in src/dbt_bouncer/checks/manifest/models/directories.py
@check
def check_model_file_name(model, *, file_name_pattern: str):
    r"""Models must have a file name that matches the supplied regex.

    Parameters:
        file_name_pattern (str): Regexp the file name must match. Please account for the `.sql` extension.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_file_name
              description: Marts must include the model version in their file name.
              include: ^models/marts
              file_name_pattern: .*(v[0-9])\.sql$
        ```

    """
    compiled = compile_pattern(file_name_pattern.strip())
    file_name = Path(clean_path_str(model.original_file_path)).name
    if compiled.match(file_name) is None:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` is in a file that does not match the supplied regex `{file_name_pattern.strip()}`."
        )
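
Because the check uses `match`, the pattern is applied from the first character of the file name, not searched for anywhere inside it. The sketch below illustrates this under the assumption that `compile_pattern` behaves like `re.compile`; `file_name_matches` is a hypothetical helper, not part of dbt-bouncer.

```python
import re
from pathlib import PurePosixPath


def file_name_matches(file_name_pattern: str, model_path: str) -> bool:
    """Apply the pattern to the file name only, anchored at its start."""
    file_name = PurePosixPath(model_path).name
    # Assumption: plain `re.compile` stands in for the library's helper.
    return re.compile(file_name_pattern.strip()).match(file_name) is not None


versioned = r".*(v[0-9])\.sql$"
print(file_name_matches(versioned, "models/marts/fct_orders_v2.sql"))  # True
print(file_name_matches(versioned, "models/marts/fct_orders.sql"))  # False
# Without a leading `.*` the pattern must match from the first character.
print(file_name_matches(r"v[0-9]\.sql$", "models/marts/fct_orders_v2.sql"))  # False
```

The last call shows why the documented example starts with `.*`: a pattern that only describes the suffix does not match a file name with an arbitrary prefix.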

check_model_property_file_location

Model properties files must follow the guidance provided by dbt (https://docs.getdbt.com/best-practices/how-we-structure/1-guide-overview).

Parameters:

model (ModelNode, required): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_property_file_location

Source code in src/dbt_bouncer/checks/manifest/models/directories.py
@check
def check_model_property_file_location(model):
    """Model properties files must follow the guidance provided by dbt [here](https://docs.getdbt.com/best-practices/how-we-structure/1-guide-overview).

    Parameters:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_property_file_location
        ```

    """
    if not (
        hasattr(model, "patch_path")
        and model.patch_path
        and clean_path_str(model.patch_path or "") is not None
    ):
        fail(f"`{get_clean_model_name(model.unique_id)}` is not documented.")

    original_path = Path(clean_path_str(model.original_file_path))
    relevant_parts = original_path.parts[1:-1]

    mapped_parts = []
    for part in relevant_parts:
        if part == "staging":
            mapped_parts.append("stg")
        elif part == "intermediate":
            mapped_parts.append("int")
        elif part == "marts":
            continue
        else:
            mapped_parts.append(part)

    expected_substr = "_".join(mapped_parts)
    properties_yml_name = Path(clean_path_str(model.patch_path or "")).name

    if not properties_yml_name.startswith("_"):
        fail(
            f"The properties file for `{get_clean_model_name(model.unique_id)}` (`{properties_yml_name}`) does not start with an underscore."
        )
    if expected_substr not in properties_yml_name:
        fail(
            f"The properties file for `{get_clean_model_name(model.unique_id)}` (`{properties_yml_name}`) does not contain the expected substring (`{expected_substr}`)."
        )
    if not properties_yml_name.endswith("__models.yml"):
        fail(
            f"The properties file for `{get_clean_model_name(model.unique_id)}` (`{properties_yml_name}`) does not end with `__models.yml`."
        )
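
The naming rules enforced above can be summarised in a few lines. This is a minimal sketch with hypothetical names (`ABBREVIATIONS`, `expected_substring`, `naming_problems`); unlike the source, which fails on the first broken rule, it collects every broken rule so the rules are easier to compare side by side. It assumes forward-slash paths and a bare properties file name.

```python
from pathlib import PurePosixPath

# Directory names that the check abbreviates; "marts" is dropped entirely.
ABBREVIATIONS = {"staging": "stg", "intermediate": "int"}


def expected_substring(model_path: str) -> str:
    """Build the substring expected in the properties file name."""
    # Ignore the top-level directory (e.g. "models") and the file name itself.
    directories = PurePosixPath(model_path).parts[1:-1]
    mapped = [ABBREVIATIONS.get(d, d) for d in directories if d != "marts"]
    return "_".join(mapped)


def naming_problems(model_path: str, properties_file_name: str) -> list[str]:
    """List every naming rule the properties file name breaks."""
    expected = expected_substring(model_path)
    problems = []
    if not properties_file_name.startswith("_"):
        problems.append("does not start with an underscore")
    if expected not in properties_file_name:
        problems.append(f"does not contain `{expected}`")
    if not properties_file_name.endswith("__models.yml"):
        problems.append("does not end with `__models.yml`")
    return problems


print(expected_substring("models/staging/crm/stg_crm__customers.sql"))  # stg_crm
print(expected_substring("models/marts/finance/fct_orders.sql"))  # finance
print(naming_problems("models/staging/crm/stg_crm__customers.sql", "_stg_crm__models.yml"))  # []
```

Under these rules a model in `models/staging/crm` is expected to be documented in a file such as `_stg_crm__models.yml`, while a model in `models/marts/finance` maps to a name containing `finance`.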

check_model_schema_name

Models must have a schema name that matches the supplied regex.

Note that most setups will use schema names in development that are prefixed, for example:

* dbt_jdoe_stg_payments
* mary_stg_payments

Please account for this if you wish to run dbt-bouncer against locally generated manifests.

Parameters:

schema_name_pattern (str, required): Regexp the schema name must match.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.
exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_model_schema_name
      include: ^models/intermediate
      schema_name_pattern: .*intermediate # Accounting for schemas like `dbt_jdoe_intermediate`.
    - name: check_model_schema_name
      include: ^models/staging
      schema_name_pattern: .*stg_.*

Source code in src/dbt_bouncer/checks/manifest/models/directories.py
@check
def check_model_schema_name(model, *, schema_name_pattern: str):
    """Models must have a schema name that matches the supplied regex.

    Note that most setups will use schema names in development that are prefixed, for example:
        * dbt_jdoe_stg_payments
        * mary_stg_payments

    Please account for this if you wish to run `dbt-bouncer` against locally generated manifests.

    Parameters:
        schema_name_pattern (str): Regexp the schema name must match.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        materialization (Literal["ephemeral", "incremental", "table", "view"] | None): Limit check to models with the specified materialization.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_model_schema_name
              include: ^models/intermediate
              schema_name_pattern: .*intermediate # Accounting for schemas like `dbt_jdoe_intermediate`.
            - name: check_model_schema_name
              include: ^models/staging
              schema_name_pattern: .*stg_.*
        ```

    """
    compiled = compile_pattern(schema_name_pattern.strip())
    if compiled.match(str(model.schema_)) is None:
        fail(
            f"`{model.schema_}` does not match the supplied regex `{schema_name_pattern.strip()}`."
        )
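
The effect of developer prefixes on a schema pattern can be checked quickly before adding it to the config. The sketch below assumes `compile_pattern` behaves like `re.compile`; `schema_matches` is a hypothetical helper used only for illustration.

```python
import re


def schema_matches(schema_name_pattern: str, schema: str) -> bool:
    """Apply the pattern from the start of the schema name."""
    # Assumption: plain `re.compile` stands in for the library's helper.
    return re.compile(schema_name_pattern.strip()).match(schema) is not None


# A leading `.*` tolerates development prefixes such as `dbt_jdoe_`.
print(schema_matches(r".*stg_.*", "dbt_jdoe_stg_payments"))  # True
print(schema_matches(r"stg_.*", "dbt_jdoe_stg_payments"))  # False
# `match` does not require the pattern to reach the end of the string.
print(schema_matches(r".*intermediate", "dbt_jdoe_intermediate_old"))  # True
print(schema_matches(r".*intermediate$", "dbt_jdoe_intermediate_old"))  # False
```

If a schema name must end with a given suffix, adding `$` to the pattern makes that explicit, as the last two calls show.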