Hello I have a question about excluding transitive dependenc Pants #general

Hello! I have a question about excluding transitiv...

stale-truck-36011

08/23/2024, 7:41 AM

Hello! I have a question about excluding transitive dependencies in targets using

!!//path/to:target

. I'm not sure if the behavior I'm seeing is a bug, or if I'm misunderstanding something. I'm running pants

2.21.0

There's an example

BUILD

and run steps in the thread I basically have a structure like in this graph. We use transitive dependencies to figure out which unit tests to run depending on changes in our code base. The test depends on a utility function, that in turn depends on two sets of other dependencies

and

are files of the same type, but consider

as test/mock data, and

as real data. I would like to exclude

, and transitively exclude

B.1

and

B.2

from the dependencies of

test

, so that changes to these do not trigger the test. It's still necessary that

utils

depend on these. Here's the issue: Let's say all files are in

src/ex

test

is a

python_test

and

are

target

, the rest are

python_source

I add

!!:B

to the dependencies of

test

I run

pants dependencies --transitive src/ex:test

and see that

is indeed not a dependency anymore. However,

B.1

and

B.2

are listed. But if I run

pants paths --from='src/ex:test' --to='src/ex:B.1'

it returns an empty list, showing that there's no path from

test

B.1

. Removing the

!!

line makes this command return the expected path via

utils

and

. it returns an empty list. Removing the

stale-truck-36011

08/23/2024, 7:46 AM

Here's an example BUILD file:

Copy code

target(
    name="A",
    dependencies=[
        ":A.1",
    ],
)
target(
    name="B",
    dependencies=[
        ":B.1",
        ":B.2",
    ],
)

python_source(
    name="A.1",
    source="A.1.py",
)
python_source(
    name="B.1",
    source="B.1.py",
)
python_source(
    name="B.2",
    source="B.2.py",
)

python_source(
    name="utils",
    source="utils.py",
    dependencies=[
        ":A",
        ":B",
    ],
)

python_test(
    name="tests",
    source="tests.py",
    dependencies=[
        ":utils",
        "!!:B",
    ],
)

Create empty files with

Copy code

mkdir -p src/ex
cd src/ex
touch tests.py utils.py A.1.py B.1.py B.2.py
cd -

Then you can run

Copy code

pants dependencies --format=json --transitive src/ex:tests

stale-truck-36011

08/23/2024, 9:03 AM

To add what my goal is - because these commands seem to output different results when it comes to transitive dependencies. This is the command we use to run our tests:

Copy code

pants --changed-since=HEAD~1 --changed-dependees=transitive test

I want this command to not trigger

tests

to run when

B.1.py

B.2.py

are changed. Only when

A.1.py

is changed.

stale-truck-36011

08/23/2024, 9:15 AM

Actually, explicitly setting

!!:B.1

and

!!:B.2

tests

dependencies

doesn't exclude them at all. If I do that, edit

B.1.py

, and then run

Copy code

pants --changed-since=HEAD~1  --changed-dependees=transitive list

I see both

tests

and

B.1

- indicating that the tests would run when changing one of the excluded dependencies. I'm leaning towards this being a bug. Nevertheless it's very confusing that

pants paths

pants --changed-dependees=transitive test/list

, and

pants dependencies --transitive

all produce different results from each other regarding transitive dependencies so I will probably look for another approach.

stale-truck-36011

08/26/2024, 7:52 AM

Bump, I would still like some support on this 🙂

square-psychiatrist-19087

08/26/2024, 6:07 PM

your example looks overly complicated, you can show the same issue with only

tests

util

and

right? it will be much easier to understand

stale-truck-36011

08/26/2024, 6:45 PM

Sure, the same would hold if only

tests

utils

and

existed, with this graph:

Copy code

tests --> utils ----> A
             |------> B

Adding`!!:utils` in the dependencies list for

tests

would still have

and

show up as a transitive dependence. The first example is designed after my already existing dependency graph.

square-psychiatrist-19087

08/26/2024, 9:20 PM

Does

!:utils

work for you?

stale-truck-36011

08/27/2024, 11:30 AM

Thanks for taking the time to respond 🙂 Setting it up again with the smaller example. Here's the BUILD file now, this is graphed in the screenshot. The files are located in

src/backend/ex

, and

src

and

src/backend

are configured as pants roots. There are empty

__init__.py

files in the source dirs.

test.py

imports a function from

utils.py

utils.py

imports functions from

A.py

and

B.py

A.py

imports a function from

C.py

No excludes in the dependencies for

tests

reports

Copy code

➜ ./pants dependencies --transitive --format=json src/backend/ex:tests
Checking Pants installation...
{
    "src/backend/ex:tests": [
        "src/backend/ex:A",
        "src/backend/ex:B",
        "src/backend/ex:C",
        "src/backend/ex:utils"
    ]
}

If I do

!:utils

!!:utils

Copy code

{
  "src/backend/ex:tests": []
}

If I do

!:A

I get

Copy code

{
  "src/backend/ex:tests": [
    "src/backend/ex:A",
    "src/backend/ex:B",
    "src/backend/ex:C",
    "src/backend/ex:utils"
  ]
}

If I do

!!:A

I get

Copy code

{
  "src/backend/ex:tests": [
    "src/backend/ex:B",
    "src/backend/ex:C",
    "src/backend/ex:utils"
  ]
}

In the last case, I also expect

:C

to be excluded from the dependencies, not only

:A

stale-truck-36011

08/27/2024, 11:31 AM

Excluding

:utils

completely is not preferable, I want to depend on utils but only part of its dependencies. But maybe this is straight up a poor design choice.

square-psychiatrist-19087

08/27/2024, 11:38 AM

ok, it makes sense, but I think it's trickier to get right, imagine B also depends on C, then should you exclude C or not?

stale-truck-36011

08/27/2024, 11:42 AM

If I exclude

:A

, and

:B

depended on

:C

as well, then

:B

and

:C

should be included in the dependencies yeah. To be a bit more concrete about why we have this setup:

and

are configuration files (written in python) for specific use cases, and

utils

contains helper functions for loading these from disk.

can be seen as real data, and

as mock data, and we don't want to depend on the real data in our unit tests.

square-psychiatrist-19087

08/27/2024, 11:43 AM

Yeah, what you describe is a much more complicated logic that involves the graph analysis. Pants implementation is much simpler, it's just a set difference:

square-psychiatrist-19087

08/27/2024, 11:43 AM

https://github.com/pantsbuild/pants/blob/fb1c6a381520d511c9c12b0598561fb7fe8afb4d/src/python/pants/engine/internals/graph.py#L879-L884

square-psychiatrist-19087

08/27/2024, 11:43 AM

Copy code

transitive_excludes = await Get(Targets, Addresses(transitive_exclude_addresses))

    return TransitiveTargets(
        tuple(dependency_mapping.roots_as_targets),
        FrozenOrderedSet(dependency_mapping.visited.difference(transitive_excludes)),
    )

square-psychiatrist-19087

08/27/2024, 11:44 AM

first we fetch all dependencies, then fetch excluded targets and then just calculate the set difference

square-psychiatrist-19087

08/27/2024, 11:44 AM

you can read the whole rule, it's not very long https://github.com/pantsbuild/pants/blob/fb1c6a381520d511c9c12b0598561fb7fe8afb4d/src/python/pants/engine/internals/graph.py#L819-L884

stale-truck-36011

08/27/2024, 11:47 AM

Ah, gotcha, this all makes sense. So it's intended behavior based on the implementation. I think maybe what I'm asking for would complicate quite a bit of the dependency inference ^^ The transitive excludes function don't seem to be very well documented, so it was a bit of trial and error. I only saw it mentioned in a callout box, and in github issues. Thanks for taking the time! I think we'll look into alternative approaches 🙂

square-psychiatrist-19087

08/27/2024, 11:48 AM

you could probably implement a new feature like

~:B

to do the real subgraph excludes, I guess somewhere around here https://github.com/pantsbuild/pants/blob/fb1c6a381520d511c9c12b0598561fb7fe8afb4d/src/python/pants/engine/internals/graph.py#L762-L816

square-psychiatrist-19087

08/27/2024, 11:50 AM

but yeah, it's probably easier to choose some different workaround, e.g. remove utils dependencies on :A and :B and instead add these dependencies to all users of utils like tests or sources

stale-truck-36011

08/27/2024, 11:51 AM

That's exactly the choice I went with in the meantime 🙂 I think the dependency structure we have is poorly formulated, so this might just provide the opportunity for a needed refactoring.

square-psychiatrist-19087

08/27/2024, 11:51 AM

e.g you can split utils into

Copy code

def load_resource_a()
def load_resource_b()
def process_resource(r: A | B)

and move

load_resource_a

to A and

load_resource_b

to B

square-psychiatrist-19087

08/27/2024, 11:52 AM

this way users can choose which resource to load themselves, this will make the api more pants friendly

stale-truck-36011

08/27/2024, 12:05 PM

Good suggestion, I'll keep this in mind 🙂 Thanks!

2 Views

Open in Slack

Previous Next