I have a few unit tests in my Python repo that use...
# general
n
I have a few unit tests in my Python repo that use a lot of RAM (they run some big ML models). If I run `pants test`, they sometimes get run in parallel, causing an OOM, and then random processes on my machine get killed. Is there a way to do any of the following (in order of preference)?
• Tell `pants` that if RAM usage is > X it should stop launching new tests until other tests have finished.
• Mark individual tests as needing to be run alone without any other tests running in parallel.
• Limit the test parallelism. `PANTS_TEST_BATCH_SIZE` seems like maybe it'd do that, but I see more tests running than I've specified as the batch size.
l
• Mark individual tests as needing to be run alone without any other tests running in parallel.
In my experience w/ both pants & bazel this seems like the safest bet: tagging tests as `heavy`/`high-memory` etc. & then excluding them in your regular pants test invocation, i.e. `pants --tag=-high-memory test ::`, and then serially executing the high-mem tests is reasonable
n
Thanks for the suggestion!
then serially executing the high-mem
Does that mean I have to remember which tests were marked high-mem and then manually invoke `pants test` for each one, or is there a way to say "run all tests with this tag without any parallelism"?
l
I'm not sure if there's a pants-native way to do that, but my naive approach would be:
#!/usr/bin/env bash

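# List every python_test target tagged high-memory, one address per line.
TARGETS=$(pants --filter-target-type=python_test --tag=high-memory list ::)

# Run the tagged tests one target at a time so they never overlap.
for TARGET in $TARGETS; do
    pants test "$TARGET"
done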
Not totally ideal, but simple enough for starters if it's baked into your CI checks
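If this ends up in CI, one thing to watch with the loop above is that a failing target in the middle doesn't fail the script, since the exit code is just whatever the last `pants test` returned. A rough variation that keeps going but remembers failures might look like this (`PANTS_BIN` and `run_high_memory_serially` are names made up for this sketch, not anything pants provides):

```shell
#!/usr/bin/env bash
# Sketch only: PANTS_BIN is a hypothetical override so the function can be dry-run.
PANTS_BIN="${PANTS_BIN:-pants}"

run_high_memory_serially() {
    local status=0 target
    # Read target addresses on fd 3 so the test runs cannot swallow the list via stdin.
    while IFS= read -r -u 3 target; do
        # Skip blank lines in the listing.
        [ -n "$target" ] || continue
        # Run one target alone; record a failure but keep going.
        "$PANTS_BIN" test "$target" || status=1
    done 3< <("$PANTS_BIN" --filter-target-type=python_test --tag=high-memory list ::)
    return "$status"
}
```

Calling `run_high_memory_serially` as the last line of the CI script makes the step fail if any tagged target failed.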
n
yeah, that works pretty well. Thanks!! I'll give that a shot.
b
https://www.pantsbuild.org/stable/reference/global-options#process_execution_local_parallelism might be handy too: `pants --process-execution-local-parallelism=1 --tag=high-memory test ::` or similar
(the downside of this approach is that it'll run any required set-up, like codegen or getting requirements, serially too, unlike the `for` loop approach)
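Putting the tag exclusion and the parallelism option together, a two-phase run could be sketched roughly like this (again, `PANTS_BIN` and `run_all_tests` are hypothetical names for illustration; the flags are the ones already mentioned in this thread):

```shell
#!/usr/bin/env bash
# Sketch only: PANTS_BIN is a hypothetical override so the function can be dry-run.
PANTS_BIN="${PANTS_BIN:-pants}"

run_all_tests() {
    local status=0
    # Phase 1: everything not tagged high-memory, with the default parallelism.
    "$PANTS_BIN" --tag=-high-memory test :: || status=1
    # Phase 2: only the high-memory tests, one local process at a time.
    "$PANTS_BIN" --process-execution-local-parallelism=1 --tag=high-memory test :: || status=1
    return "$status"
}
```

Both phases run even if the first one fails, and the function returns non-zero if either did.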
e
for convenience, add
[cli.alias]
--no-parallel = "--process-execution-local-parallelism=1"
to your `pants.toml`, and then run your tests with `pants --no-parallel --tag=high-memory test ::`
n
Nice! Both of those suggestions are helpful. I didn't know about the `[cli.alias]` section; that's cool!