#general

crooked-country-1937

08/08/2022, 5:41 AM
Hi all. I am evaluating whether Pants can be used to manage our data pipelines codebase. It's mostly an Airflow / Python codebase. I have this test which is giving me issues:
import pytest
from airflow.models import DagBag

def test_no_import_errors():
    dag_bag = DagBag(
        dag_folder="src/airflow-dags/tunein/dags",
        include_examples=False,
        read_dags_from_db=False
    )
    assert dag_bag.size() > 0, "Loaded dags"
    assert len(dag_bag.import_errors) == 0, "No Import Failures"
Generally, unit testing Airflow code requires creating `DagBag()`, which basically scans the `dags` folder to load the DAG. Any ideas on how to get tests like this working with Pants?

refined-addition-53644

08/08/2022, 6:37 AM
It would be helpful if you posted the error you're getting.

abundant-leather-17386

08/08/2022, 7:17 AM
You need to add the `.py` files where you define your DAGs explicitly to the `dependencies` field, I think. That’s how I got it to work.
In our case I added an `overrides` field to `python_tests`:
overrides={
    "path/to/load_dagbag_test.py": {
        "dependencies": ["airflow-dags:lib"]
    }
},
where `"airflow-dags:lib"` refers to the `python_sources` containing the DAGs code.

happy-kitchen-89482

08/08/2022, 12:22 PM
As @abundant-leather-17386 correctly mentions, this is probably due to a missing dependency. Normally, if a.py imports b.py then Pants can infer that dependency from the `import` statements at build time. But in this case the scanning happens at test runtime, so there is no `import` statement to infer from. So you tell Pants about the dependency manually.
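To make that concrete, here is a rough sketch (not how Pants itself is implemented, just a simplified illustration using Python's standard `ast` module) of what a static scan of the test from the question can see. The helper name `imported_modules` is made up for this example, and the test source is held in a string so the snippet runs without Airflow installed.

```python
import ast

# The test from the question, kept as a string so it can be parsed
# without Airflow being installed.
TEST_SOURCE = '''
import pytest
from airflow.models import DagBag

def test_no_import_errors():
    dag_bag = DagBag(
        dag_folder="src/airflow-dags/tunein/dags",
        include_examples=False,
        read_dags_from_db=False
    )
'''


def imported_modules(source):
    """Collect module names found in import statements.

    Hypothetical helper: a simplified stand-in for the kind of
    information a build tool can read statically from a file.
    """
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


found = imported_modules(TEST_SOURCE)
print(sorted(found))
```

The scan only turns up `pytest` and `airflow.models`. The DAG files appear only as a path string passed to `dag_folder`, so nothing in the import statements points at them, which is why the dependency has to be declared by hand.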