Manifest Checks: Lineage

Note

The checks below require `manifest.json` to be present.

Functions:

Name	Description
`check_lineage_permitted_upstream_models`	Upstream models must have a path that matches the provided `upstream_path_pattern`.
`check_lineage_seed_cannot_be_used`	Seeds cannot be referenced in models with a path that matches the specified `include` config.
`check_lineage_source_cannot_be_used`	Sources cannot be referenced in models with a path that matches the specified `include` config.

`check_lineage_permitted_upstream_models`

Upstream models must have a path that matches the provided `upstream_path_pattern`.

Parameters:

Name	Type	Description	Default
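`package_name`	`str \| None`	Package whose models are treated as upstream models for this check. Falls back to the project name in `manifest.json` when not set.	`None`
`upstream_path_pattern`	`str`	Regexp pattern to match the upstream model(s) path.	required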

Receives at execution time:

Name	Type	Description
`manifest_obj`	`ManifestObject`	The manifest object.
`model`	`ModelNode`	The ModelNode object to check.
`models`	`list[ModelNode]`	List of ModelNode objects parsed from `manifest.json`.

Other Parameters (passed via config file):

Name	Type	Description
`description`	`str \| None`	Description of what the check does and why it is implemented.
`exclude`	`str \| None`	Regex pattern to match the model path. Model paths that match the pattern will not be checked.
`include`	`str \| None`	Regex pattern to match the model path. Only model paths that match the pattern will be checked.
`severity`	`Literal["error", "warn"] \| None`	Severity level of the check. Default: `error`.

Example(s):

manifest_checks:
    - name: check_lineage_permitted_upstream_models
      include: ^models/staging
      upstream_path_pattern: $^
    - name: check_lineage_permitted_upstream_models
      include: ^models/intermediate
      upstream_path_pattern: ^models/staging|^models/intermediate
    - name: check_lineage_permitted_upstream_models
      include: ^models/marts
      upstream_path_pattern: ^models/staging|^models/intermediate
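
To see what the patterns in this example permit, the sketch below applies them with Python's `re` module. It assumes `compile_pattern` behaves like `re.compile` and that paths are matched from the start of the string (the check's source calls `.match`). The helper `is_permitted` and the file paths are hypothetical and are not part of dbt-bouncer.

```python
import re


def is_permitted(upstream_path: str, upstream_path_pattern: str) -> bool:
    """Return True if the upstream model path matches the pattern from its start."""
    # Assumption: mirrors the check's use of `.match` on a stripped, compiled pattern.
    return re.compile(upstream_path_pattern.strip()).match(upstream_path) is not None


# `$^` matches only the empty string, so no real upstream path is permitted.
print(is_permitted("models/staging/stg_orders.sql", "$^"))  # False

# Hypothetical paths checked against the pattern used for the other two layers.
layered = "^models/staging|^models/intermediate"
print(is_permitted("models/staging/stg_orders.sql", layered))  # True
print(is_permitted("models/marts/fct_orders.sql", layered))  # False
```

Under these assumptions, the first entry stops staging models from depending on any other model in the project, while the other two entries allow only staging and intermediate models as upstream dependencies.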

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py

@check
def check_lineage_permitted_upstream_models(
    model, ctx, *, package_name: str | None = None, upstream_path_pattern: str
):
    """Upstream models must have a path that matches the provided `upstream_path_pattern`.

    Parameters:
        upstream_path_pattern (str): Regexp pattern to match the upstream model(s) path.

    Receives:
        manifest_obj (ManifestObject): The manifest object.
        model (ModelNode): The ModelNode object to check.
        models (list[ModelNode]): List of ModelNode objects parsed from `manifest.json`.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_permitted_upstream_models
              include: ^models/staging
              upstream_path_pattern: $^
            - name: check_lineage_permitted_upstream_models
              include: ^models/intermediate
              upstream_path_pattern: ^models/staging|^models/intermediate
            - name: check_lineage_permitted_upstream_models
              include: ^models/marts
              upstream_path_pattern: ^models/staging|^models/intermediate
        ```

    """
    compiled_upstream_path_pattern = compile_pattern(upstream_path_pattern.strip())
    manifest_obj = ctx.manifest_obj
    upstream_models = [
        x
        for x in getattr(model.depends_on, "nodes", []) or []
        if x.split(".")[0] == "model"
        and x.split(".")[1]
        == (package_name or manifest_obj.manifest.metadata.project_name)
    ]
    models_by_id = (
        ctx.models_by_unique_id
        if ctx.models_by_unique_id
        else {m.unique_id: m for m in ctx.models}
    )
    not_permitted_upstream_models = [
        upstream_model
        for upstream_model in upstream_models
        if upstream_model in models_by_id
        and compiled_upstream_path_pattern.match(
            clean_path_str(models_by_id[upstream_model].original_file_path)
        )
        is None
    ]
    if not_permitted_upstream_models:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` references upstream models that are not permitted: {[m.split('.')[-1] for m in not_permitted_upstream_models]}."
        )
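
The filtering step in the source above relies on the layout of a dbt `unique_id`, `<resource_type>.<package>.<name>`. Here is a minimal sketch of that step on plain strings, assuming that layout. The helper `select_upstream_models` and the sample IDs are hypothetical and are not part of dbt-bouncer.

```python
def select_upstream_models(depends_on_nodes: list[str], package_name: str) -> list[str]:
    """Keep only the unique IDs of models that belong to `package_name`."""
    selected = []
    for unique_id in depends_on_nodes:
        # The first two dot-separated parts are the resource type and the package.
        resource_type, package = unique_id.split(".")[:2]
        if resource_type == "model" and package == package_name:
            selected.append(unique_id)
    return selected


# Hypothetical dependency list mixing resource types and packages.
nodes = [
    "model.my_project.stg_orders",
    "seed.my_project.country_codes",
    "model.other_package.stg_events",
]
print(select_upstream_models(nodes, "my_project"))  # ['model.my_project.stg_orders']
```

Non-model nodes and models from other packages are dropped before the path pattern is applied, so they cannot cause this check to fail.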

`check_lineage_seed_cannot_be_used`

Seeds cannot be referenced in models with a path that matches the specified `include` config.

Receives at execution time:

Name	Type	Description
`model`	`ModelNode`	The ModelNode object to check.

Other Parameters (passed via config file):

Name	Type	Description
`description`	`str \| None`	Description of what the check does and why it is implemented.
`exclude`	`str \| None`	Regex pattern to match the model path. Model paths that match the pattern will not be checked.
`include`	`str \| None`	Regex pattern to match the model path. Only model paths that match the pattern will be checked.
`severity`	`Literal["error", "warn"] \| None`	Severity level of the check. Default: `error`.

Example(s):

manifest_checks:
    - name: check_lineage_seed_cannot_be_used
      include: ^models/intermediate|^models/marts

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py

@check
def check_lineage_seed_cannot_be_used(model):
    """Seed cannot be referenced in models with a path that matches the specified `include` config.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_seed_cannot_be_used
              include: ^models/intermediate|^models/marts
        ```

    """
    if [
        x
        for x in getattr(model.depends_on, "nodes", []) or []
        if x.split(".")[0] == "seed"
    ]:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` references a seed even though this is not permitted."
        )

`check_lineage_source_cannot_be_used`

Sources cannot be referenced in models with a path that matches the specified `include` config.

Receives at execution time:

Name	Type	Description
`model`	`ModelNode`	The ModelNode object to check.

Other Parameters (passed via config file):

Name	Type	Description
`description`	`str \| None`	Description of what the check does and why it is implemented.
`exclude`	`str \| None`	Regex pattern to match the model path. Model paths that match the pattern will not be checked.
`include`	`str \| None`	Regex pattern to match the model path. Only model paths that match the pattern will be checked.
`severity`	`Literal["error", "warn"] \| None`	Severity level of the check. Default: `error`.

Example(s):

manifest_checks:
    - name: check_lineage_source_cannot_be_used
      include: ^models/intermediate|^models/marts

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py

@check
def check_lineage_source_cannot_be_used(model):
    """Sources cannot be referenced in models with a path that matches the specified `include` config.

    Receives:
        model (ModelNode): The ModelNode object to check.

    Other Parameters:
        description (str | None): Description of what the check does and why it is implemented.
        exclude (str | None): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (str | None): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Literal["error", "warn"] | None): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_source_cannot_be_used
              include: ^models/intermediate|^models/marts
        ```

    """
    if [
        x
        for x in getattr(model.depends_on, "nodes", []) or []
        if x.split(".")[0] == "source"
    ]:
        fail(
            f"`{get_clean_model_name(model.unique_id)}` references a source even though this is not permitted."
        )
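
This check and `check_lineage_seed_cannot_be_used` share one idea: inspect the resource-type prefix of each upstream `unique_id`. Here is a hedged sketch of that idea on plain strings. The helper `references_resource_type` and the sample IDs are hypothetical and are not part of dbt-bouncer.

```python
def references_resource_type(depends_on_nodes: list[str], resource_type: str) -> bool:
    """Return True if any upstream unique ID starts with the given resource type."""
    return any(uid.split(".")[0] == resource_type for uid in depends_on_nodes)


# Hypothetical dependencies of a single model.
nodes = ["source.my_project.raw.orders", "model.my_project.stg_customers"]
print(references_resource_type(nodes, "source"))  # True
print(references_resource_type(nodes, "seed"))  # False
```

In the checks themselves, a match of this kind leads to a call to `fail(...)` for the model being checked.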
