Skip to content

Manifest Checks: Lineage#

Note

The below checks require manifest.json to be present.

Classes:

Name Description
CheckLineagePermittedUpstreamModels

Upstream models must have a path that matches the provided upstream_path_pattern.

CheckLineageSeedCannotBeUsed

Seed cannot be referenced in models with a path that matches the specified include config.

CheckLineageSourceCannotBeUsed

Sources cannot be referenced in models with a path that matches the specified include config.

CheckLineagePermittedUpstreamModels #

Upstream models must have a path that matches the provided upstream_path_pattern.

Parameters:

Name Type Description Default
upstream_path_pattern str

Regexp pattern to match the upstream model(s) path.

required

Receives at execution time:

Name Type Description
manifest_obj DbtBouncerManifest

The manifest object.

model DbtBouncerModelBase

The DbtBouncerModelBase object to check.

models List[DbtBouncerModelBase]

List of DbtBouncerModelBase objects parsed from manifest.json.

Other Parameters (passed via config file):

Name Type Description
exclude Optional[str]

Regex pattern to match the model path. Model paths that match the pattern will not be checked.

include Optional[str]

Regex pattern to match the model path. Only model paths that match the pattern will be checked.

severity Optional[Literal['error', 'warn']]

Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_lineage_permitted_upstream_models
      include: ^models/staging
      upstream_path_pattern: $^
    - name: check_lineage_permitted_upstream_models
      include: ^models/intermediate
      upstream_path_pattern: ^models/staging|^models/intermediate
    - name: check_lineage_permitted_upstream_models
      include: ^models/marts
      upstream_path_pattern: ^models/staging|^models/intermediate

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py
class CheckLineagePermittedUpstreamModels(BaseCheck):
    """Upstream models must have a path that matches the provided `upstream_path_pattern`.

    Parameters:
        upstream_path_pattern (str): Regexp pattern to match the upstream model(s) path.

    Receives:
        manifest_obj (DbtBouncerManifest): The manifest object.
        model (DbtBouncerModelBase): The DbtBouncerModelBase object to check.
        models (List[DbtBouncerModelBase]): List of DbtBouncerModelBase objects parsed from `manifest.json`.

    Other Parameters:
        exclude (Optional[str]): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (Optional[str]): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Optional[Literal["error", "warn"]]): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_permitted_upstream_models
              include: ^models/staging
              upstream_path_pattern: $^
            - name: check_lineage_permitted_upstream_models
              include: ^models/intermediate
              upstream_path_pattern: ^models/staging|^models/intermediate
            - name: check_lineage_permitted_upstream_models
              include: ^models/marts
              upstream_path_pattern: ^models/staging|^models/intermediate
        ```

    """

    manifest_obj: "DbtBouncerManifest" = Field(default=None)
    model: "DbtBouncerModelBase" = Field(default=None)
    models: List["DbtBouncerModelBase"] = Field(default=[])
    name: Literal["check_lineage_permitted_upstream_models"]
    upstream_path_pattern: str

    def execute(self) -> None:
        """Execute the check."""
        upstream_models = [
            x
            for x in self.model.depends_on.nodes
            if x.split(".")[0] == "model"
            and x.split(".")[1] == self.manifest_obj.manifest.metadata.project_name
        ]
        not_permitted_upstream_models = [
            upstream_model
            for upstream_model in upstream_models
            if re.compile(self.upstream_path_pattern.strip()).match(
                clean_path_str(
                    next(
                        m for m in self.models if m.unique_id == upstream_model
                    ).original_file_path
                ),
            )
            is None
        ]
        assert not not_permitted_upstream_models, (
            f"`{self.model.name}` references upstream models that are not permitted: {[m.split('.')[-1] for m in not_permitted_upstream_models]}."
        )

CheckLineageSeedCannotBeUsed #

Seed cannot be referenced in models with a path that matches the specified include config.

Receives at execution time:

Name Type Description
model DbtBouncerModelBase

The DbtBouncerModelBase object to check.

Other Parameters (passed via config file):

Name Type Description
exclude Optional[str]

Regex pattern to match the model path. Model paths that match the pattern will not be checked.

include Optional[str]

Regex pattern to match the model path. Only model paths that match the pattern will be checked.

severity Optional[Literal['error', 'warn']]

Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_lineage_seed_cannot_be_used
      include: ^models/intermediate|^models/marts

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py
class CheckLineageSeedCannotBeUsed(BaseCheck):
    """Seed cannot be referenced in models with a path that matches the specified `include` config.

    Receives:
        model (DbtBouncerModelBase): The DbtBouncerModelBase object to check.

    Other Parameters:
        exclude (Optional[str]): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (Optional[str]): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Optional[Literal["error", "warn"]]): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_seed_cannot_be_used
              include: ^models/intermediate|^models/marts
        ```

    """

    model: "DbtBouncerModelBase" = Field(default=None)
    name: Literal["check_lineage_seed_cannot_be_used"]

    def execute(self) -> None:
        """Execute the check."""
        assert not [
            x for x in self.model.depends_on.nodes if x.split(".")[0] == "seed"
        ], f"`{self.model.name}` references a seed even though this is not permitted."

CheckLineageSourceCannotBeUsed #

Sources cannot be referenced in models with a path that matches the specified include config.

Receives at execution time:

Name Type Description
model DbtBouncerModelBase

The DbtBouncerModelBase object to check.

Other Parameters (passed via config file):

Name Type Description
exclude Optional[str]

Regex pattern to match the model path. Model paths that match the pattern will not be checked.

include Optional[str]

Regex pattern to match the model path. Only model paths that match the pattern will be checked.

severity Optional[Literal['error', 'warn']]

Severity level of the check. Default: error.

Example(s):

manifest_checks:
    - name: check_lineage_source_cannot_be_used
      include: ^models/intermediate|^models/marts

Source code in src/dbt_bouncer/checks/manifest/check_lineage.py
class CheckLineageSourceCannotBeUsed(BaseCheck):
    """Sources cannot be referenced in models with a path that matches the specified `include` config.

    Receives:
        model (DbtBouncerModelBase): The DbtBouncerModelBase object to check.

    Other Parameters:
        exclude (Optional[str]): Regex pattern to match the model path. Model paths that match the pattern will not be checked.
        include (Optional[str]): Regex pattern to match the model path. Only model paths that match the pattern will be checked.
        severity (Optional[Literal["error", "warn"]]): Severity level of the check. Default: `error`.

    Example(s):
        ```yaml
        manifest_checks:
            - name: check_lineage_source_cannot_be_used
              include: ^models/intermediate|^models/marts
        ```

    """

    model: "DbtBouncerModelBase" = Field(default=None)
    name: Literal["check_lineage_source_cannot_be_used"]

    def execute(self) -> None:
        """Execute the check."""
        assert not [
            x for x in self.model.depends_on.nodes if x.split(".")[0] == "source"
        ], f"`{self.model.name}` references a source even though this is not permitted."