Curious on opinions about organizing dependencies like files Pants #general


high-yak-85899

01/27/2022, 9:31 PM

Curious on opinions about organizing dependencies like files for testdata. Let's say I have a bunch of

*.py

and

*_test.py

modules under

src/app

that I'm working with.

tailor

will conveniently make me a BUILD file with

python_sources

and

python_tests

declared that expands everything for me. Let's say I have one test in there,

some_test.py

that needs files stored in

src/app/testdata

. My suspicion is that if I were to add something like


files(name="testdata", sources=["testdata/**"])

to the BUILD file and change

python_tests()

to

python_tests(dependencies=[":testdata"])

, any change to what is in

testdata

would invalidate a bunch of caching and cause all the tests to get rerun even though I only needed it in

some_test.py

high-yak-85899

01/27/2022, 9:31 PM

I can do


python_tests(overrides={'some_test.py': {'dependencies': [':testdata']}})

to isolate it, but feels a little clunky after doing it a couple times.

high-yak-85899

01/27/2022, 9:32 PM

I don't think I can write my own

python_test

target for just that file as it will cause a BUILD error for duplicate targets. Maybe I'm misremembering what I have already tried.

hundreds-father-404

01/27/2022, 9:35 PM

> any change to what is in testdata would invalidate a bunch of caching and cause all the tests to get rerun even though I only needed it in some_test.py.

That is correct.

> python_tests(overrides={'some_test.py': {'dependencies': [':testdata']}})

This is indeed the most precise way to do things. If you need it for multiple files, you'd use

('some_test.py', 'another_test.py'):

as the key.
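For illustration, the multi-file form in a BUILD file would presumably read `python_tests(overrides={("some_test.py", "another_test.py"): {"dependencies": [":testdata"]}})`. The sketch below is plain Python, not Pants code: `expand_overrides` is a hypothetical helper that only models how a tuple key applies the same fields to each listed file while leaving every other test untouched.

```python
from typing import Dict, Iterable, Tuple, Union

OverrideKey = Union[str, Tuple[str, ...]]
Fields = Dict[str, list]


def expand_overrides(
    sources: Iterable[str],
    overrides: Dict[OverrideKey, Fields],
) -> Dict[str, Fields]:
    """Return the per-file fields after applying overrides (hypothetical model)."""
    per_file: Dict[str, Fields] = {src: {} for src in sources}
    for key, fields in overrides.items():
        # A bare string names one file; a tuple names several files.
        names = (key,) if isinstance(key, str) else key
        for name in names:
            if name not in per_file:
                raise KeyError(f"override for unknown source: {name}")
            per_file[name].update(fields)
    return per_file


result = expand_overrides(
    ["some_test.py", "another_test.py", "other_test.py"],
    {("some_test.py", "another_test.py"): {"dependencies": [":testdata"]}},
)
print(result)
```

Only the two named files pick up `:testdata`; `other_test.py` ends up with no extra fields, which is the point of keying the override by file.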

> I don't think I can write my own python_test target for just that file as it will cause a BUILD error for duplicate targets.

You could, but you're right that you would have two targets. This shouldn't cause Pants to crash, but it will cause Pants to run the same tests twice, because it thinks you have two separate sets of metadata for the same file. The way to avoid that is to update the

sources

of the

python_tests

with

'!some_test.py'

. But that's clunky to do and imo an antipattern, hence

overrides
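To make the exclusion approach concrete: the BUILD file would presumably pair something like `python_tests(sources=["*_test.py", "!some_test.py"])` with a separate `python_test` target for `some_test.py`. The sketch below is plain Python, not Pants code. `resolve_sources` is a hypothetical helper that models `!`-prefixed ignore patterns with `fnmatch`, and the file names are made up for the example.

```python
from fnmatch import fnmatch
from typing import List


def resolve_sources(patterns: List[str], files: List[str]) -> List[str]:
    """Model include globs plus '!'-prefixed ignore globs (simplified)."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    return sorted(
        f
        for f in files
        if any(fnmatch(f, p) for p in includes)
        and not any(fnmatch(f, p) for p in excludes)
    )


files = ["app.py", "some_test.py", "other_test.py"]

# Files owned by the generator target after the exclusion.
generated = resolve_sources(["*_test.py", "!some_test.py"], files)
# File owned by the hand-written python_test target.
explicit = ["some_test.py"]

print(generated)  # ['other_test.py']
```

With the exclusion in place, no file is owned by both targets, so nothing would run twice. The cost is that the two declarations have to be kept in sync by hand.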

hundreds-father-404

01/27/2022, 9:36 PM

It's not "wrong" per se to apply the metadata to everything generated by the

python_tests

target...but it makes your caching less useful. We encourage liberally using

overrides

, but it's not a requirement
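A rough way to see why the broad dependency hurts caching: if a test's cache key covers the test file plus everything it depends on, editing a data file changes the key of every test that declares it. The sketch below is a simplified model in plain Python, not Pants's actual cache key computation. The helper names and file contents are made up for illustration.

```python
import hashlib
from typing import Dict, List


def cache_key(test_file: str, deps: List[str], contents: Dict[str, bytes]) -> str:
    """Hash a test file together with every file it depends on."""
    digest = hashlib.sha256()
    for path in sorted([test_file, *deps]):
        digest.update(path.encode())
        digest.update(contents[path])
    return digest.hexdigest()


def all_keys(deps_by_test: Dict[str, List[str]], contents: Dict[str, bytes]) -> Dict[str, str]:
    return {test: cache_key(test, deps, contents) for test, deps in deps_by_test.items()}


def changed_tests(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Tests whose key changed, i.e. whose cached result can no longer be reused."""
    return sorted(test for test in before if before[test] != after[test])


contents = {
    "some_test.py": b"...",
    "other_test.py": b"...",
    "testdata/data.json": b"{}",
}
# Same tree after editing one data file.
edited = {**contents, "testdata/data.json": b'{"k": 1}'}

# Broad: dependencies=[":testdata"] on the whole python_tests target.
broad = {"some_test.py": ["testdata/data.json"], "other_test.py": ["testdata/data.json"]}
# Precise: only some_test.py gets the dependency via overrides.
precise = {"some_test.py": ["testdata/data.json"], "other_test.py": []}

rerun_broad = changed_tests(all_keys(broad, contents), all_keys(broad, edited))
rerun_precise = changed_tests(all_keys(precise, contents), all_keys(precise, edited))
print(rerun_broad)    # ['other_test.py', 'some_test.py']
print(rerun_precise)  # ['some_test.py']
```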

high-yak-85899

01/27/2022, 9:37 PM

Got it. I could see how trying to do more fancy introspection with

python_tests

based on

python_test

targets already written out would be difficult and cause a lot of annoying behavior. This seems like a good middle ground.

high-yak-85899

01/27/2022, 9:38 PM

Yeah, we're in desperate need of the build caching. Our build system has been getting hammered having to rerun heavy

pip install

tasks followed by ~1000 unit tests every time.

high-yak-85899

01/27/2022, 9:39 PM

Thanks for confirming my suspicions/ideas!

hundreds-father-404

01/27/2022, 9:39 PM

Oh!! https://github.com/pantsbuild/pants/pull/14049 will make you happy 🙂 We will hopefully very soon infer the

files

. And that inference will happen at a precise level, so that

some_test.py

will infer it depends specifically on

data.json

but not

logo.png

for example. That inference won't be perfect, but hopefully could detect this case
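One plausible way to do file-level inference like this is to look for string literals in the test source that match known file paths. The sketch below is an assumption for illustration only, not a description of what the linked PR implements. `infer_file_deps` is a hypothetical helper and the `testdata/` paths are made up.

```python
import ast
from typing import Set


def infer_file_deps(test_source: str, known_files: Set[str]) -> Set[str]:
    """Return the known files whose path appears as a string literal in the source."""
    found = set()
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value in known_files:
                found.add(node.value)
    return found


source = '''
import json

def test_load():
    with open("testdata/data.json") as f:
        assert json.load(f) is not None
'''
known = {"testdata/data.json", "testdata/logo.png"}
inferred = infer_file_deps(source, known)
print(inferred)  # {'testdata/data.json'}
```

A scheme like this misses paths assembled at runtime, such as `os.path.join(...)` or f-strings, which is one reason the inference "won't be perfect".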

👀 2

hundreds-father-404

01/27/2022, 9:42 PM

Btw, this thread captures very well the underlying motivation and philosophy for Pants v2! After 10 years with Pants v1, we realized there was a very strong tension between precise metadata vs. ergonomics & maintainability. With Pants v1 and (still) Bazel, you have to declare all dependencies explicitly. Also there was no notion of generated targets.

python_tests

is the entire atomic thing, you can't break it down into more precise subparts. So, to get full precision and accuracy, you would need an explicit

python_tests

target per file, each with its own dependencies hardcoded. Our mission was & still is to get you file-level precision without the boilerplate hell.

❤️ 1
