Pants #general

crooked-country-1937

08/08/2022, 5:41 AM

Hi all. I am evaluating whether Pants can be used to manage our data pipelines codebase. It's mostly an Airflow / Python codebase. I have this test, which is giving me issues:

import pytest
from airflow.models import DagBag

def test_no_import_errors():
    dag_bag = DagBag(
        dag_folder="src/airflow-dags/tunein/dags",
        include_examples=False,
        read_dags_from_db=False
    )
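    # The assertion message is shown on failure, so it describes the failure.
    assert dag_bag.size() > 0, "No DAGs were loaded"
    assert len(dag_bag.import_errors) == 0, f"Import failures: {dag_bag.import_errors}"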

Generally, unit testing Airflow code requires creating a DagBag(), which basically scans the dags folder to load the DAGs. Any ideas on how to get tests like this working with Pants?

refined-addition-53644

08/08/2022, 6:37 AM

It will be helpful if you post the error you're getting.

abundant-leather-17386

08/08/2022, 7:17 AM

I think you need to explicitly add the .py files where you define your DAGs to the dependencies field. That’s how I got it to work.

abundant-leather-17386

08/08/2022, 7:20 AM

In our case I added an overrides field to python_tests:

overrides={
    "path/to/load_dagbag_test.py": {
        "dependencies": ["airflow-dags:lib"]
    }
},

where "airflow-dags:lib" refers to the python_sources target containing the DAGs code.
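
For context, here is a rough sketch of how the two targets mentioned above might fit together. The directory layout, the target name "tests", and the test file name are assumptions for illustration only; the one detail taken from the thread is the "airflow-dags:lib" address. The two targets would normally live in separate BUILD files, and are shown together here only for brevity.

```python
# BUILD file sketch (hypothetical layout; adjust names and paths to your repo).

# In airflow-dags/BUILD: the target that owns the DAG definition files.
python_sources(
    name="lib",
)

# In the BUILD file next to the test: declare the dependency by hand,
# because the test never imports the DAG modules directly.
python_tests(
    name="tests",  # assumed target name
    overrides={
        # Key is the test file this override applies to (assumed file name).
        "load_dagbag_test.py": {
            "dependencies": ["airflow-dags:lib"],
        },
    },
)
```

Using overrides keeps the extra dependency scoped to the one test that scans the dags folder, rather than attaching it to every test in the target.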

happy-kitchen-89482

08/08/2022, 12:22 PM

As @abundant-leather-17386 correctly mentions, this is probably due to a missing dependency. Normally, if a.py imports b.py, then Pants can infer that dependency from the import statements at build time. But in this case the scanning happens at test runtime, so there is no import statement to infer from. So you tell Pants about the dependency manually.
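
To make the point about import statements concrete, here is a small illustration using Python's standard ast module. It is not Pants's actual inference code, only a simplified sketch of the idea: a tool that reads import statements sees pytest and airflow.models in the test, but nothing that points at the DAG files, because the folder path is just a string passed to DagBag. The helper name imported_modules is made up for this example.

```python
import ast
from typing import Set

# A trimmed copy of the test from the question, held as a string so that
# it can be parsed without Airflow being installed.
TEST_SOURCE = '''
import pytest
from airflow.models import DagBag

def test_no_import_errors():
    dag_bag = DagBag(dag_folder="src/airflow-dags/tunein/dags")
    assert len(dag_bag.import_errors) == 0
'''


def imported_modules(source: str) -> Set[str]:
    """Collect the module names that appear in import statements."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


found = imported_modules(TEST_SOURCE)
print(sorted(found))
```

The DAG modules never show up in the result, which is why the dependency has to be declared by hand.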
