sparse-lifeguard-95737
10/04/2022, 4:10 PMpython_test
in the same batch must have identical:
• batch_id
• resolve
• extra_env_vars
• xdist_concurrency
• environmentbatch_id
but different values in one of the other fields. the options are:
• error out with a message telling the user to set different batch IDs where <conflicting field>
is different
• create sub-batches under-the-hood to get groupings with identical values for those fields__defaults__
setting a batch_id
near the root of your code to say “batch everything possible”, and then you set a custom extra_env_vars
on one of your tests, it feels like noise/toil to also have to set a separate batch_id
on that target (or directory of targets)bitter-ability-32190
10/04/2022, 4:34 PMhundreds-father-404
10/04/2022, 4:44 PMand could have issues if it's important to the user that the tests must all run togetherRelated, I remain skeptical of Pants splitting up partitions into batches
bitter-ability-32190
10/04/2022, 4:45 PMRelated, I remain skeptical of Pants splitting up partitions into batchesOof. Can you elaborate 🙂
hundreds-father-404
10/04/2022, 4:48 PMsparse-lifeguard-95737
10/04/2022, 4:54 PMIf people want smaller partitions, then they should use more batch IDsIMO this will be a bad UX - I would find this incredibly frustrating to maintain as a user in our repo I think it would be much better if there was an enforced
batch_size
that could be tweaked as a config toggle. I agree it might be tough for the generic test
logic to implement that reliably, so I’m currently adding it to the pytest
subsystem.batch_id
in a __defaults__
at the root of the repo, and only need to specify other IDs for “weird” individual cases / subfolders lower in the repohappy-kitchen-89482
10/04/2022, 5:33 PMtest behavior can change depending on what other tests are run together.The whole endeavor of batching tests implicitly assumes this is not the case for your tests…
bitter-ability-32190
10/04/2022, 5:34 PMhundreds-father-404
10/04/2022, 5:35 PMTry running ./pants test :: to see if any tests fail. Sometimes, your tests will fail with Pants even if they pass with your normal setup because tests are more isolated than when running Pytest/unittest directly:
• Test files are isolated from each other. If your tests depended on running in a certain order, they may now fail. This requires rewriting your tests to remove the shared global state.Right now, we force you to do one process per file. This new feature allows you to instead back out of that, and even run all tests in a single process, like when you use Pytest directly w/o Pants
happy-kitchen-89482
10/04/2022, 5:36 PMSometimes, your tests will fail with Pants even if they pass with your normal setupOr vice versa! We are not necessarily stricter, just different.
witty-crayon-22786
10/04/2022, 5:51 PMWe have no strong reason to assume that our current “one process per file” strategy is more correct, particularly when that’s not how tests are typically run outside of Pantsit absolutely is in the context of caching… or, at least: you need to be consistent about whether you’re running a small set or a large set in the presence of caching. it sounds like @sparse-lifeguard-95737’s use case may not require batching within the partitions (yet?), so it feels like a logical step not to implement it in a first version
happy-kitchen-89482
10/04/2022, 5:54 PMsparse-lifeguard-95737
10/04/2022, 5:57 PMbatch_id
but conflict in some other field, do we error out? Or auto-resolve the conflict by putting them in different batches under-the-hood despite the batch_id
?
2. If N
targets have the same batch_id
, do we always run all N
in the same batch? or do we allow for a global size-limit toggle so we end up with M
sub-batches of `N`/`M` ?
1 was my original question in the thread. I’ve been assuming that the answer to 2 is “yes” but it seems there are differing opinions 🙂hundreds-father-404
10/04/2022, 5:58 PMsparse-lifeguard-95737
10/04/2022, 6:02 PMbatch_id
that better conveys “these are safe to run together if possible” vs. “these are definitely going to run together”hundreds-father-404
10/04/2022, 6:05 PMbatch_can_be_subpartioned: bool
field?bitter-ability-32190
10/04/2022, 6:06 PMwitty-crayon-22786
10/04/2022, 6:09 PMRelatedly we want the opposite field for fmt for Terraform. E.g. “don’t you dare split this up”
But that’s not a users field, but a Partition onethat isn’t really a field though: that’s semantics that the rule implementation needs to apply
bitter-ability-32190
10/04/2022, 6:12 PMsparse-lifeguard-95737
10/04/2022, 6:17 PMsomething likeI was thinking an alternate name forfield?batch_can_be_subpartioned: bool
batch_id
instead of a second field. otherwise you have to enforce that everything with the same batch_id
has the same batch_can_be_subpartitioned
value…witty-crayon-22786
10/04/2022, 6:20 PMbitter-ability-32190
10/04/2022, 6:22 PMbatch_id
the best term?
Given `fmt`/`lint` have batch_size
, if test
follows the field would technically be called batch_can_be_batched
😬sparse-lifeguard-95737
10/04/2022, 6:24 PMIf set, tests in files covered by this target _may_ be tested in the same `pytest` process as other files with the same `<field-name>` value.
batch_id
is not the best at conveying this and am open to alternate suggestions 🙂hundreds-father-404
10/04/2022, 6:25 PMgroup_id
? lolbitter-ability-32190
10/04/2022, 6:26 PMhundreds-father-404
10/04/2022, 6:27 PMclump_id
happy-kitchen-89482
10/04/2022, 6:37 PMsparse-lifeguard-95737
10/04/2022, 6:38 PM_id
piece of it than the batch_
piece - giving something an “ID” feels like you are definitively marking the final groupingwitty-crayon-22786
10/04/2022, 6:39 PMing
might suggest that you might not get the batch verbatim: batching_id
batching_key
sparse-lifeguard-95737
10/04/2022, 6:39 PMbatching_tag
witty-crayon-22786
10/04/2022, 6:39 PMsparse-lifeguard-95737
10/04/2022, 6:39 PMbitter-ability-32190
10/04/2022, 6:40 PMgroup_by
. SO group_tag
?group_key
?group_can_be_subpartitioned
or group_can_be_batched
. Honestly bummer the didn't name batch_size
something else 🙂sparse-lifeguard-95737
10/04/2022, 6:45 PMbatch_size
toggle that users can play with to make their own trade-offs (similar to how lint
and fmt
currently work)batch_size=<big number>
. but that way Pants has some built-in guardrails against batches so huge they clog up / break some piece of the systembitter-ability-32190
10/04/2022, 6:54 PMsparse-lifeguard-95737
10/04/2022, 7:24 PMcompatibility_tag
? With help text something like “assign python_test
targets the same compatibility_tag
to show they are safe to be tested in the same pytest
process. Pants will attempt to run compatible tests together as long as insert description of caveats _here_”bitter-ability-32190
10/04/2022, 7:26 PMhappy-kitchen-89482
10/04/2022, 7:46 PMcompatibility
as the conceptmutuality
- which is made up, but therefore has the advantage that we can make it mean whatever we want