# general
r
Is it possible to control the degree of parallelism that
test
runs with (ideally to 1, since I have problems with tests running in parallel that I will solve in the medium-term, but not in the short-term)
f
--process-execution-local-parallelism
might be useful
r
Ah ok, I've found that
./pants test --debug ::
also works, but doesn't produce the same summary output that
./pants test ::
does. I'll give this a go
Yep,
./pants --process-execution-local-parallelism=1 test
gives me the nice summary output
Ok, I've added a documentation suggestion for this, as I think it's handy to know about when you are trying to adopt Pants in a new codebase where the tests (e.g. integration tests) may not be safe to run in parallel.
maybe the ability to add metadata to a test file specifying whether it can be run in parallel with other test files would be handy?
👍 1
h
Interesting idea. Or some sort of parallelism label, where tests with the same label cannot run concurrently with each other (so a label could represent a database or some shared resource)
💯 1
Rather than a boolean, which may be too restrictive
r
so I should be able to do this with tags and multiple executions of
pants test
with appropriate combinations of the
--tag
filter and the
--process-execution-local-parallelism=1
parameter?
I've tried this, in so far as attempting to exclude the tests that are causing me problems when run in parallel for the moment...
in the directory with the troublesome tests I have a build file like this...
python_tests(
  name="tests",
  overrides={
    "test_a.py": {"tags": ["run_serially"]),
    "test_b.py": {"tags": ["run_serially"]),

    # And the above can be expressed as?...
    ("test_a.py", "test_b.py"): {
      "tags": ["run_serially"]
    },
  }
)
and then run
./pants --tag='-run_serially' test ::
but I still see
test_a.py
and
test_b.py
running and therefore blocking completion due to their race condition
hmm, for fun I changed to
--tag='+run_serially'
and am being told that no files or targets were specified, so maybe I'm not applying the tag correctly
ok, not sure what happened there... maybe I mistyped the tag for the
--tag='run_serially'
example. I'm finding that I can run
./pants --tag='sometag' test ::
to include only tests WITH the tag, but
./pants --tag='-sometag' test ::
to EXCLUDE tests with the tag is not working, and they are still being included.
ah, I've tripped over this I think https://github.com/pantsbuild/pants/issues/11123
maybe? I'm invoking with
::
which is an address spec? not a file address? 🤷
Looks like
+tag
and
-tag
work for
list
but only
+tag
works for
test
h
Sorry for the issues. Tag-based selection happens outside of any specific goals so I would have expected it to work the same for
list
and for
test
. Weird.
Let me see if I can reproduce
There are right-parens instead of right-braces in the BUILD file snippet above, but I assume that's not the issue ?
OK, I can reproduce this, so that's good
🙌 1
I think you've found a proper bug!
Am digging further
r
Correct, the right-parens vs right-braces issue was just a typing error in slack
h
@witty-crayon-22786 thoughts on this?
r
The workaround for this in the short term… having two
python_tests
targets in the relevant directory and including/excluding via
sources
directly?
w
commented on the ticket: you can adjust your selection code slightly
@rapid-exabyte-76685: also, i accepted your docs update: thanks! i added a bit more, because there is another facility for this: https://www.pantsbuild.org/v2.11/docs/troubleshooting#controlling-test-parallelism
✅ 1
@rapid-exabyte-76685: does the
execution_slot_var
setting make sense for your usecase? we haven’t done a great job of explaining it
r
If I understand correctly... if there are 4 cores in the machine, and
pants.toml
has...
[pytest]
execution_slot_var = "MY_SLOT_VAR"
... then the test code could do ...
os.environ['MY_SLOT_VAR']
and it would get a value between 0-3 (or 1-4?) depending on which core/slot it is running in?
w
correct.
then you can have initialization code in your tests assume that it “owns” the slot on localhost, and safely
drop $database
(or the non-SQL equivalent) and recreate
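A minimal sketch of what that initialization code might look like on the test side, assuming the `execution_slot_var` option is set to `MY_SLOT_VAR` as in the example above. The helper name `database_name_for_slot`, the `database_` prefix, and the fallback to slot `0` outside Pants are illustrative assumptions, not part of Pants:

```python
import os

# Assumption: Pants is configured so that each test process receives its
# slot number in the MY_SLOT_VAR environment variable.
SLOT_VAR = "MY_SLOT_VAR"


def database_name_for_slot(environ=os.environ, prefix="database"):
    """Return a per-slot database name such as database_2.

    The slot value is used as an opaque suffix. If the variable is unset
    (for example when the test runs outside Pants), fall back to slot "0".
    """
    slot = environ.get(SLOT_VAR, "0")
    return f"{prefix}_{slot}"
```

A test's setup code could then drop and recreate only the database returned by `database_name_for_slot()`, on the assumption that no other concurrently running test process shares its slot.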
r
In my CI, can I ask Pants what range of values it will return? e.g. if I wanted to run some setup process to create
database_0
through
database_3
- where the setup is running outside of pants
w
you can by querying the computed value of
--process-execution-local-parallelism
, yea, which you can do with something like
./pants help-all | jq …
. but if it’s possible to run the setup inside tests, that might be more self-contained
✅ 1
r
You would need to combine this with filtering tests by tag or target I think, to say: run all of these tests that use this resource at the same time, but they get their own copy of the resource
w
i might be misunderstanding, but i don’t think so…? if you ran across all of your tests, tests which didn’t need “the resource” would run fine. they would just tie up a slot (such that that resource slot wasn’t used while they were running)
but i suppose that it depends on the breakdown between “tests that need the resource” and “tests that don’t need the resource”, and the cost of having the resource be idle
r
Yep, now that I've thought about this closer, you are correct
which you can do with something like
./pants help-all | jq …
I think this is what I want?
./pants help-all | jq -r  '.scope_to_help_info ."" .advanced[] | select (.env_var == "PANTS_PROCESS_EXECUTION_LOCAL_PARALLELISM") .value_history .ranked_values[] | select (.rank == "HARDCODED") .value'
Another one for the recipe book? https://github.com/pantsbuild/pants/issues/14969 (link to message above added as a comment in this issue)
That I'm plucking something out called
HARDCODED
gives me pause but it looks to return the correct core count across two different machines... although on my M1 Pro MacBook, which has 8 performance cores and 2 efficiency cores, it returns 10.
My thinking here being... do you want to run build processes on efficiency cores or restrict them, if possible, to performance cores only?
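For a CI setup script written in Python rather than shell, the same lookup as the jq query above could be sketched roughly as follows. It relies only on the JSON fields that query touches (`scope_to_help_info`, `advanced`, `env_var`, `value_history`, `ranked_values`, `rank`, `value`). The function name `local_parallelism` and the `SAMPLE` data are hypothetical, and the sample is heavily trimmed compared with real `./pants help-all` output:

```python
ENV_VAR = "PANTS_PROCESS_EXECUTION_LOCAL_PARALLELISM"


def local_parallelism(help_all, rank="HARDCODED"):
    """Find the option by its env var, then return the value at the given rank.

    Returns None if the option or the rank is not present.
    """
    for option in help_all["scope_to_help_info"][""]["advanced"]:
        if option.get("env_var") != ENV_VAR:
            continue
        for ranked in option["value_history"]["ranked_values"]:
            if ranked.get("rank") == rank:
                return ranked["value"]
    return None


# Hypothetical sample containing only the keys the lookup reads.
SAMPLE = {
    "scope_to_help_info": {
        "": {
            "advanced": [
                {
                    "env_var": ENV_VAR,
                    "value_history": {
                        "ranked_values": [
                            {"rank": "NONE", "value": None},
                            {"rank": "HARDCODED", "value": 10},
                        ]
                    },
                }
            ]
        }
    }
}

print(local_parallelism(SAMPLE))
```

In CI, the dictionary would come from parsing the output of `./pants help-all` with `json.loads`. Note that, like the jq query, this reads only the `HARDCODED` rank, so it would not reflect a value overridden in `pants.toml`, the environment, or on the command line.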
w
that value is computed via https://github.com/pantsbuild/pants/blob/0b5a5c514a790450965028f457956f7dba09b0c7/src/python/pants/util/osutil.py#L16-L29 : if there is a better default, we’d be happy to hear about it. but i expect that the efficiency cores are the first to be used, rather than the last…?
e
We use
python_tests
overrides
to apply tags and use the following in CI to run just the selected tests. The same could work with
xargs -0
. The critical part that I don't see mentioned is the
--granularity=file
arg on filter, without which you will end up including the entire expansion of
python_tests
in that directory.
readarray -d '' targets < <(./pants \
        --tag=-skip_ci \
        --changed-dependees=transitive \
        --changed-since="$PANTS_CHANGED_SINCE_REF" \
        filter \
        --sep="\0" \
        --granularity=file)
./pants test "${targets[@]}"
One gotcha with --granularity=file is that Docker images will no longer be matched, so be careful if you use the same code path for calculating targets
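For anyone driving CI from Python instead of bash, a rough equivalent of the snippet above might look like this. The flags are copied from the shell version. The function names and the guard that skips `./pants test` when nothing matched are assumptions added for illustration:

```python
import subprocess


def split_nul_separated(output):
    """Split NUL-delimited text into a list, dropping empty entries."""
    return [item for item in output.split("\0") if item]


def changed_test_targets(since_ref, pants="./pants"):
    """Run the same `filter` invocation as the shell snippet and return its targets."""
    result = subprocess.run(
        [
            pants,
            "--tag=-skip_ci",
            "--changed-dependees=transitive",
            f"--changed-since={since_ref}",
            "filter",
            "--sep=\\0",  # passes a literal backslash-zero, as the shell version does
            "--granularity=file",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return split_nul_separated(result.stdout)


def run_changed_tests(since_ref, pants="./pants"):
    """Test only the filtered targets; do nothing if the filter matched none."""
    targets = changed_test_targets(since_ref, pants)
    if targets:
        subprocess.run([pants, "test", *targets], check=True)
    return targets
```

Passing the targets as a list avoids shell quoting issues in the same way that the NUL separator and `readarray -d ''` do in the bash version.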