Manifest Checks: Lineage

Note: The checks below require manifest.json to be present.

Functions:

check_lineage_permitted_upstream_models: Upstream models must have a path that matches the provided upstream_path_pattern.

check_lineage_seed_cannot_be_used: Seeds cannot be referenced in models with a path that matches the specified include config.

check_lineage_source_cannot_be_used: Sources cannot be referenced in models with a path that matches the specified include config.

check_lineage_permitted_upstream_models

Upstream models must have a path that matches the provided upstream_path_pattern.

Parameters:

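upstream_path_pattern (str): Regexp pattern to match the upstream model(s) path. Required.

package_name (str | None): Package whose models are treated as upstream models. Default: None, in which case the project name from the manifest metadata is used.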

Receives at execution time:

manifest_obj (ManifestObject): The manifest object.

model (ModelNode): The ModelNode object to check.

models (list[ModelNode]): List of ModelNode objects parsed from manifest.json.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.

exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.

include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.

severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.
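
The interaction of include and exclude can be sketched in a few lines. This is a minimal illustration, not dbt-bouncer's implementation: it assumes both patterns are applied with `re.match` semantics (anchored at the start of the path), and the helper `is_checked` and the model paths are hypothetical.

```python
from __future__ import annotations

import re


def is_checked(path: str, include: str | None = None, exclude: str | None = None) -> bool:
    """Return True if a model at `path` would be checked under this sketch's rules."""
    # Assumption: patterns are matched from the start of the path, like re.match.
    if include is not None and re.match(include, path) is None:
        return False  # include is set and the path does not match it
    if exclude is not None and re.match(exclude, path) is not None:
        return False  # the path is explicitly excluded
    return True


# Hypothetical model paths, for illustration only.
paths = [
    "models/staging/stg_orders.sql",
    "models/marts/fct_orders.sql",
    "models/marts/legacy/old_orders.sql",
]

checked = [
    p
    for p in paths
    if is_checked(p, include="^models/marts", exclude="^models/marts/legacy")
]
print(checked)  # ['models/marts/fct_orders.sql']
```

With neither option set, every model path is checked in this sketch.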

Example(s):

manifest_checks:
    - name: check_lineage_permitted_upstream_models
      include: ^models/staging
      upstream_path_pattern: $^
    - name: check_lineage_permitted_upstream_models
      include: ^models/intermediate
      upstream_path_pattern: ^models/staging|^models/intermediate
    - name: check_lineage_permitted_upstream_models
      include: ^models/marts
      upstream_path_pattern: ^models/staging|^models/intermediate
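
The patterns in this example can be sanity-checked with Python's standard `re` module. The sketch below assumes the check applies the pattern with `match` semantics, as the source code on this page suggests; the helper `permitted` and the upstream paths are hypothetical. Note that `$^` cannot match any non-empty path, so the first rule effectively forbids staging models from depending on any other model in the project.

```python
import re

# Patterns copied from the example config.
rules = {
    "staging": "$^",
    "intermediate": "^models/staging|^models/intermediate",
}

# Hypothetical upstream model paths, for illustration only.
upstream_paths = [
    "models/staging/stg_orders.sql",
    "models/intermediate/int_orders.sql",
    "models/marts/fct_orders.sql",
]


def permitted(pattern: str, path: str) -> bool:
    """True if `path` satisfies `pattern` when matched from the start of the string."""
    return re.compile(pattern.strip()).match(path) is not None


results = {
    layer: [permitted(pattern, path) for path in upstream_paths]
    for layer, pattern in rules.items()
}
print(results["staging"])       # [False, False, False]
print(results["intermediate"])  # [True, True, False]
```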

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py
@check
def check_lineage_permitted_upstream_models(
    model, ctx, *, package_name: str | None = None, upstream_path_pattern: str
):
    """Upstream models must have a path that matches the provided `upstream_path_pattern`.

    Parameters:
        upstream_path_pattern (str): Regexp pattern to match the upstream model(s) path.

    Receives:
        manifest_obj (ManifestObject): The manifest object.
        model (ModelNode): The ModelNode object to check.
        models (list[ModelNode]): List of ModelNode objects parsed from `manifest.json`.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_permitted_upstream_models
              include: ^models/staging
              upstream_path_pattern: $^
            - name: check_lineage_permitted_upstream_models
              include: ^models/intermediate
              upstream_path_pattern: ^models/staging|^models/intermediate
            - name: check_lineage_permitted_upstream_models
              include: ^models/marts
              upstream_path_pattern: ^models/staging|^models/intermediate
        ```

    """
    compiled_upstream_path_pattern = compile_pattern(upstream_path_pattern.strip())
    manifest_obj = ctx.manifest_obj
    upstream_models = [
        x
        for x in getattr(model.depends_on, "nodes", []) or []
        if x.split(".")[0] == "model"
        and x.split(".")[1]
        == (package_name or manifest_obj.manifest.metadata.project_name)
    ]
    models_by_id = (
        ctx.models_by_unique_id
        if ctx.models_by_unique_id
        else {m.unique_id: m for m in ctx.models}
    )
    not_permitted_upstream_models = [
        upstream_model
        for upstream_model in upstream_models
        if upstream_model in models_by_id
        and compiled_upstream_path_pattern.match(
            clean_path_str(models_by_id[upstream_model].original_file_path)
        )
        is None
    ]
    if not_permitted_upstream_models:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` references upstream models that are not permitted: {[m.split('.')[-1] for m in not_permitted_upstream_models]}."
        )
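
The upstream filtering step relies on dbt unique IDs of the form `<resource_type>.<package>.<name>`. A minimal standalone sketch of that step follows, assuming plain string IDs; the function name `upstream_model_ids` and the sample IDs are hypothetical and not part of dbt-bouncer.

```python
def upstream_model_ids(depends_on_nodes, package_name):
    """Keep only IDs of the form `model.<package_name>.<name>`."""
    selected = []
    for unique_id in depends_on_nodes or []:
        parts = unique_id.split(".")
        # Seeds, sources and models from other packages are skipped.
        if len(parts) >= 3 and parts[0] == "model" and parts[1] == package_name:
            selected.append(unique_id)
    return selected


# Hypothetical dependencies of a single model.
nodes = [
    "model.my_project.stg_orders",
    "model.other_package.stg_customers",
    "seed.my_project.country_codes",
    "source.my_project.raw.orders",
]
print(upstream_model_ids(nodes, "my_project"))  # ['model.my_project.stg_orders']
```

One consequence worth noting: under this filter, models from other packages are never compared against upstream_path_pattern unless package_name is set to that package.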

check_lineage_seed_cannot_be_used

Seeds cannot be referenced in models with a path that matches the specified include config.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.

exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.

include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.

severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_lineage_seed_cannot_be_used
      include: ^models/intermediate|^models/marts
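
To see which models this configuration would flag, the rule can be simulated on a small, made-up project. This is only a sketch: the `models` mapping is hypothetical, and it assumes include is applied with `re.match` semantics.

```python
import re

include = "^models/intermediate|^models/marts"  # from the example config

# Hypothetical models: original file path -> unique IDs they depend on.
models = {
    "models/staging/stg_countries.sql": ["seed.my_project.country_codes"],
    "models/intermediate/int_orders.sql": ["model.my_project.stg_orders"],
    "models/marts/dim_countries.sql": ["seed.my_project.country_codes"],
}

failing = [
    path
    for path, deps in models.items()
    if re.match(include, path)  # assumption: include is matched from the path start
    and any(dep.split(".")[0] == "seed" for dep in deps)
]
print(failing)  # ['models/marts/dim_countries.sql']
```

The staging model also references a seed, but it falls outside include and is therefore not flagged.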

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py
@check
def check_lineage_seed_cannot_be_used(model):
    """Seed cannot be referenced in models with a path that matches the specified `include` config.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_seed_cannot_be_used
              include: ^models/intermediate|^models/marts
        ```

    """
    if [
        x
        for x in getattr(model.depends_on, "nodes", []) or []
        if x.split(".")[0] == "seed"
    ]:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` references a seed even though this is not permitted."
        )

check_lineage_source_cannot_be_used

Sources cannot be referenced in models with a path that matches the specified include config.

Receives at execution time:

model (ModelNode): The ModelNode object to check.

Other Parameters (passed via config file):

description (str | None): Description of what the check does and why it is implemented.

exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.

include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.

severity (Literal["error", "warn"] | None): Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_lineage_source_cannot_be_used
      include: ^models/intermediate|^models/marts

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py
@check
def check_lineage_source_cannot_be_used(model):
    """Sources cannot be referenced in models with a path that matches the specified `include` config.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_source_cannot_be_used
              include: ^models/intermediate|^models/marts
        ```

    """
    if [
        x
        for x in getattr(model.depends_on, "nodes", []) or []
        if x.split(".")[0] == "source"
    ]:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` references a source even though this is not permitted."
        )
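
All three checks on this page branch on the first dot-separated segment of each unique ID in `depends_on.nodes`. The sketch below groups dependencies by that segment, which can help when debugging why a model fails the seed or source check. The function `group_by_resource_type` and the sample IDs are hypothetical, not part of dbt-bouncer.

```python
from collections import defaultdict


def group_by_resource_type(depends_on_nodes):
    """Group unique IDs by their leading segment, e.g. 'model', 'seed' or 'source'."""
    grouped = defaultdict(list)
    for unique_id in depends_on_nodes or []:
        grouped[unique_id.split(".")[0]].append(unique_id)
    return dict(grouped)


# Hypothetical dependencies of a single model.
deps = [
    "source.my_project.raw.orders",
    "model.my_project.stg_orders",
    "source.my_project.raw.customers",
]
grouped = group_by_resource_type(deps)
print(grouped.get("source", []))
# ['source.my_project.raw.orders', 'source.my_project.raw.customers']
print(grouped.get("seed", []))  # []
```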