# development
s
@witty-crayon-22786 @happy-kitchen-89482 (and anyone else interested) here is the proposal for batching `pytest` calls you asked for on Friday: https://docs.google.com/document/d/1U0Q43bRod_EeVP4eQpcN36NMlxZ4CpRzTl2gMQ5HHvg/edit?usp=sharing
🤘 2
🙏 2
w
thanks! will take a look before EOD
looks good: added a few bits of feedback. when you're happy with it, let me know and i can sketch out a suggested implementation somewhere
s
thanks! responded with some questions - also added a short blurb at the bottom about interactions w/ sharding
h
@sparse-lifeguard-95737 alternative idea: a new target type called `python_test_batch` with a `sources` field. That avoids all the awkwardness with merging `extra_env_vars`, summing `timeout`. I think that's pretty easy to implement also
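(For illustration, a rough sketch of what such a target might look like in a BUILD file; `python_test_batch` and these exact fields are hypothetical, not an existing Pants target type.)
```
# BUILD -- hypothetical sketch of the proposed target type
python_test_batch(
    name="tests",
    sources=["test_*.py", "subdir/**/test_*.py"],
    # one value for the whole batch, instead of merging/summing per-file settings:
    extra_env_vars=["DATABASE_URL"],
    timeout=600,
)
```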
s
@hundreds-father-404 the batch would need to span more than 1 dir's worth of tests
h
regardless of syntax, is it important that you can still only run a single file within the batch? Or it's fine that the batch is run as one atomic unit?
s
running a single file within the batch is still important
h
> the batch would need to span more than 1 dir's worth of tests
That's fine, `sources` works with subdirs too, including `**/*_test.py`
s
yeah but it's not what `./pants tailor` will set up for you, AIUI - and not the convention we typically tell people to adopt 🙂
unless I have a misconception in my head
h
> yeah but it's not what ./pants tailor will set up for you, AIUI - and not the convention we typically tell people to adopt
Yeah, we would still be defaulting to 1 target per file because we can't infer the intent of what you want your batches to be. But you can manually create a `python_test_batch` target instead, which will cause tailor to not try to add `python_tests` for those "owned" sources
s
it's also important that `--changed-since` only chooses the actually-changed files within the batch for execution (vs. returning the entire batch when any one has changed)
you can - the migration is more painful tho. compare the flow of:
```
1. `./pants tailor` creates a bunch of `python_test` targets
2. You have to manually delete all of those and replace them with a top-level `python_test_batch`
```
vs.
```
1. `./pants tailor` creates a bunch of `python_test` targets
2. You manually add a `__defaults__` at a top-level `BUILD` file, but leave the existing targets alone
```
šŸ‘ 1
h
> it's also important that --changed-since only chooses the actually-changed files within the batch for execution (vs. returning the entire batch when any one has changed)
Yeah, that's the biggest downside of `python_test_batch` imo. We used to have code for `SpecsWithOrigin`, which meant we preserved the specific files users specified to us. That code got deleted when we added file-level targets, `python_test` vs `python_tests`. We would likely need to add back something like that, which is a bummer
s
another issue (unless I'm missing something) of `python_test_batch` is dealing with hierarchy. if you have `python_test_batch(sources=["**/test_*.py"])` at the ~root of your repo, but then you end up wanting to exclude specific subtrees / specific files, you have to both edit the root `sources` list to exclude files and add new targets. vs. `__defaults__` which (I think/hope) will Just Work with hierarchical settings
i.e. a root `__defaults__` to set the default batch ID, then a `__defaults__` in subdirs that need different batches
… I guess I should make sure that's actually how `__defaults__` works
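(A sketch of that layout, assuming a `batch_id` field exists: `__defaults__` is real Pants BUILD syntax, but the field and the ID values here are made up.)
```
# BUILD at the repo root: default batch for all python_tests below here
__defaults__({python_tests: dict(batch_id="default")})

# src/integration/BUILD: the nearer __defaults__ wins for this subtree
__defaults__({python_tests: dict(batch_id="integration")})
```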
h
> at the ~root of your repo, but then you end up wanting to exclude specific subtrees / specific files, you have to both edit the root sources list to exclude files and add new targets.
Agreed. This right here is precisely why we added `overrides` 🙂 it used to be the case that you either changed metadata for every generated target, or had to create a new target and modify the existing one to exclude those old sources
yep, `__defaults__` overrides things from ancestor dirs. and then targets can override from there
> it's also important that --changed-since only chooses the actually-changed files within the batch for execution (vs. returning the entire batch when any one has changed)
> We used to have code for SpecsWithOrigin,
ohhhh wait I don't think we need that. We have `SpecsPaths`, which preserves the literal file paths the user specified. I think we could leverage that to filter out which files in the batch target should be used. I can demo this technique for Go, which currently suffers from the same issue. https://github.com/pantsbuild/pants/issues/16945
> Presumably, tailor would still use python_tests - although it would help onboarding if it used python_test_batch instead.
This is an interesting thought experiment. If our focus is on getting things working asap for new Pants users, it would be desirable to use batching from the start with `tailor`. And then over time possibly optimize to not-batched
w
i still view batching as a tradeoff, so i don't really think it should be a default behavior. cache keys become larger (and there isn't really a safe path to fixing that). so i don't think that we'd want to do it by default.
as to an explicit "batch" target: i think that that would make it less clear that all files with a particular `batch_id` might still be further subdivided into batches: labeling files might make it clearer that this is about which files can be batched, rather than about which will definitely be batched
h
Yeah, I think you opt in to batching, and I think a batch target is overkill right now
s
Read through the latest comments on the doc - sounds to me like `batch_id` + `__defaults__` is the approach. I am eager to get started 🙂 I'll poke around in the code to get some ideas on the impl - @witty-crayon-22786 if you already have some thoughts, can you send them my way?
h
Yeah, I think batch_id is the move. Wiring up that field should be pretty easy with the Target API, and is a good first step. One prefactor you could probably do is rewrite `pytest_runner.py` to have new types and a new rule that generifies running Pytest. Whereas right now the logic lives inside the `TestFieldSet` rule, instead create this new rule that takes args like the file names to specify. It will roughly be a copy-and-paste of the body of the current rule. Then, the `TestFieldSet` rule will call it
🤔 1
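(A rough sketch of that prefactor; `RunPytestRequest` and `run_pytest` are invented names for illustration, not existing Pants code.)
```
# pytest_runner.py -- prefactor sketch (names hypothetical)
from dataclasses import dataclass

from pants.core.goals.test import TestResult
from pants.engine.rules import Get, rule


@dataclass(frozen=True)
class RunPytestRequest:
    # Exactly one field set today; a whole batch later.
    field_sets: tuple[PythonTestFieldSet, ...]


@rule
async def run_pytest(request: RunPytestRequest) -> TestResult:
    # Roughly a copy-and-paste of the body of the current TestFieldSet rule:
    # build the pytest PEX, assemble one Process covering all requested files,
    # run it, and convert the output into a TestResult.
    ...


@rule
async def run_python_test(field_set: PythonTestFieldSet) -> TestResult:
    # The existing TestFieldSet rule becomes a thin wrapper over the helper.
    return await Get(TestResult, RunPytestRequest((field_set,)))
```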
Ohhhh @witty-crayon-22786 idea to get the actual batching working. It's tricky that `test.py` is one-field-set to one rule invocation. Ideally we don't need to change `core/goals/test.py` here. What we can do is leverage that the engine deduplicates identical Gets and only runs one of them. If we have 5 individual FieldSets all making the same Process request, then the engine will only run it once and share amongst everything. More precisely, they will be making the same `Get` call for your helper rule in the prefactor
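(A minimal sketch of that deduplication idea; `files_in_same_batch` is a hypothetical helper, and how a single field set would learn about the rest of its batch is exactly the open question raised later in the thread.)
```
# Sketch only: relies on the engine memoizing Gets with equal inputs.
@rule
async def run_python_test(field_set: PythonTestFieldSet) -> TestResult:
    # Every field set in the same batch builds an *identical* request...
    request = RunPytestRequest(field_sets=files_in_same_batch(field_set))
    # ...so these Gets are deduplicated: the batched process runs once and the
    # result is shared across all the field sets that asked for it.
    return await Get(TestResult, RunPytestRequest, request)
```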
I still don't know how to handle `test`'s sharding. I think batching by batch_id should work with the above strategy though 🙂
h
How does this interact with lockfiles? I assume batches also cannot span multiple lockfiles?
s
agreed (I've been saying "resolve" internally instead of lockfile - are those effectively the same thing?)
there is 1 open question in the doc about whether `batch_id` will be a `python_test`-specific thing or a generic field for all `*_test` targets (with some mechanism to exclude it for things like `shunit2_test` where it doesn't make sense)
maybe we should answer that question before I move forward as it'll probably drastically affect the impl…
h
Yep, resolve==lockfile
s
even if `batch_id` is generic, I think we'll need a per-target batching strategy - for example `xdist_concurrency` will need to be the same for all tests in a pytest batch, but that field doesn't exist for other test types
h
It's easy to make it generic later, I think, so maybe start with just python_test?
I'm not sure if batching makes sense in all languages
šŸ‘ 2
It can be a mixin I guess
so it's easy to apply to other target types in the future
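(A sketch of what the reusable field might look like; `BatchIdField` is a made-up name, but `StringField` with an alias/help is the normal Target API shape. Defined once, it could be added to `PythonTestTarget`'s core fields now and reused by other `*_test` target types later.)
```
from pants.engine.target import StringField


class BatchIdField(StringField):
    alias = "batch_id"
    help = "Tests sharing a batch_id may be run together in a single pytest process."
```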
h
I recommend starting only with Python
s
python-specific was also my thought but Stu raised the idea of making it generic so I'll let him weigh in 🙂
h
I suspect trying to implement this change via `core/goals/test.py` will be a lot harder to pull off, vs the strategy I outlined above. But maybe necessary depending on what we want to do with test sharding. Although I suspect it will be possible to only change `core/goals/test.py` to consider batch IDs when sharding, without needing to change anything else in the file. And then all the rest of the changes are in `pytest_runner.py`
> How does this interact with lockfiles? I assume batches also cannot span multiple lockfiles?
@sparse-lifeguard-95737 eager error if this is violated? I think that probably makes more sense than silently ignoring the batch and un-batching them
s
yeah I am wondering how generic the "plumbing" should be, even if we only end up adding the `batch_id` field to Python for now. high-level options that occur to me:
1. change all the existing types in the existing `test` flow to use batches. subsystems that don't use batching will just use single-element collections everywhere
2. add a parallel type tree + goals for batched test requests / responses. subsystems register rules for either the single-file or batched types depending on their capabilities. the top-level `core/goals/test.py` sends requests for both single-element and batched field sets
3. isolate the changes to the `python` subsystem, somehow reconstruct batches from the single-element requests currently sent by `core/goals/test.py`
I am not sure if 3 is feasible (were you saying you have an idea for how it could work @hundreds-father-404?). 1 is nice in theory but potentially a pain for subsystem developers if it turns out most things aren't batchable. 2 is complex but (hopefully?) makes it easy for subsystems to only implement what they need
h
yeah, I'm pretty confident #3 is possible thanks to the engine's design of deduplicating identical requests: https://pantsbuild.slack.com/archives/C0D7TNJHL/p1663863514744679?thread_ts=1663700934.944129&cid=C0D7TNJHL
w
> @witty-crayon-22786 if you already have some thoughts, can you send them my way?
will do by EOD: thanks!
h
Dan and I DMed more about approach #3, and he correctly pointed out it won't work because an individual TestFieldSet doesn't know about any other TestFieldSets in play, so it wouldn't know which other files to run with. Given that, I suspect that a mix of #1 and #2 is going to be cleanest. We have a lot of infrastructure around FieldSet to do things like discover what to run on. Given that you're still using the `python_test` target for batched runs, those targets are going to match against the current `PythonTestFieldSet` we have, but we need to run those with the batched runner and not the individual runner. I think the rough implementation would look like:
1. Keep our current PythonTestFieldSet and the `TargetRootsToFieldSets` line in `test.py`
2. But then filter the results into two sets: the individual runner vs the batch runner
3. Call the two distinct APIs, `Get(TestResult, TestFieldSet)` and `Get(TestResult, TestBatch)` (or return `TestBatchResult` maybe)
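(A sketch of steps 2-3 as they might look inside `core/goals/test.py`; `TestBatch`, `is_batched`, and `group_into_batches` are invented names for illustration.)
```
# Inside the existing `test` goal rule, after TargetRootsToFieldSets resolves:
individual, batched = [], []
for field_set in targets_to_field_sets.field_sets:
    (batched if is_batched(field_set) else individual).append(field_set)

results = await MultiGet(
    *(Get(TestResult, TestFieldSet, fs) for fs in individual),
    *(Get(TestResult, TestBatch, batch) for batch in group_into_batches(batched)),
)
```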
w
hm, possibly. i don't like the idea of two distinct test APIs… would really be preferable to have a single API which supports multiple FieldSets which have already been declaratively declared compatible because they have the same x/y/z fields
not to make things overly complex, but: this "partitioning by fields" is related to what @bitter-ability-32190 has been doing in `lint` / `fmt`: i'd love for it to be declarative.
👀 1
s
it's all connected 🕸️
b
@sparse-lifeguard-95737 LMK if you want the TL;DR of my changes
s
yes please!
b
I'll make a new 🧵
šŸ‘ 2
w
meanwhile: i'm going to braindump the computed Fields idea somewhere, because i think it has legs.
👖 1
declarative field/metadata partitioning: https://github.com/pantsbuild/pants/issues/16967
> will do by EOD: thanks!
correction: Eric will do this on whichever timeline you agree to! thanks.