# general
g
Is there a way to limit the number of tests that kick off in parallel but doesn't limit other things like parallelization of building pex files? I used --process-execution-local-parallelism thinking it would limit just tests, but it's limiting everything
w
Take a search through slack history for stuff like "serialize" "concurrent" and pytest - I know variants of this question have been covered for years, but I don't recall the specifics as each question had its own spin https://pantsbuild.slack.com/archives/C046T6T9U/p1658910657563539
g
Yeah I'm already using that trick.
My issue is that the resources required for the specific Spark test fixture are quite large. We have 32-core machines, but want to prevent that particular fixture from being spun up more than 8x concurrently on a single host.
I can achieve that by setting --process-execution-local-parallelism=8, but then Pants as a whole never uses more than 8 cores. (The processes Pants spins up can each use more, but Pants itself gets limited.)
w
There might be a setting for testing specifically, or pytest even more specifically using some of those pytest-xdist, or execution slots, or whatever, but does it work well enough if you split into two steps?
pants mygoal :: && pants test --process-execution...
g
Well, the thing in particular that I don't want to throttle is the creation of all of the pexes for the test execution runners.
Can I create those without running tests, and then run the tests?
w
Are your tests run on pexes?
g
Yes, it's just using the out-of-the-box experience for pants test, which seems to build a dedicated pex for each test run.
w
I guess what I was asking is: are you running tests on the output of your pex_binary?
g
No, I'm letting pants handle it end-to-end.
I just call pants test ::
I'm just using native functionality with python_sources/python_tests targets
w
Yep, okay, cool, that's what I thought - I just got confused there for a sec
👍 1
g
That is exactly what I'm using, but the issue is that there's no limit on how many slots Pants hands out, other than limiting everything via --process-execution-local-parallelism.
So Pants spits out 32 slots (I don't know the literal number, just using it as an example). Now I could technically say: if you're slot 1-8, go forward, but if you're greater than 8, wait. But that feels wrong.
I'd even be willing to exit non-zero from the fixture (when the slot is above 8) and then just depend on the Pants cache for test results, but shoot, that starts to feel hacky as crap and not super supportable.
w
Someone else might be able to speak better about the process granularity you're going for. Because you want to use all 32 cores for the setup of the test goal, but then jump that down to 8 for the running of the tests themselves? To me, that feels a bit custom or run-timey
g
exactly what I want.
Well, to be clear, I think I was being overly prescriptive 🤣 -- I don't actually care about the details; my goal is to limit the number of concurrent Spark fixtures to 8. How I achieve that, I don't particularly care. I'm an open book.
Isn't there a way in pants to say, you're node 1/3, 2/3, 3/3 so to divide tests amongst three hosts?
w
Sharding?
g
yes
w
Also worth checking out the CI, as there is some craziness there too https://github.com/pantsbuild/pants/blob/main/.github/workflows/test.yaml
g
coo.
w
For example, you can run three shards with --shard=0/3, --shard=1/3, --shard=2/3.
💯 1
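Sharding splits the test set deterministically across invocations. A minimal sketch of what stable-hash partitioning looks like (illustrative only; Pants' real assignment scheme may differ):

```python
import hashlib

def shard_of(test_address: str, num_shards: int) -> int:
    """Map a test address to a shard via a stable hash.
    Illustrative sketch -- not Pants' actual partitioning scheme."""
    digest = hashlib.sha256(test_address.encode()).hexdigest()
    return int(digest, 16) % num_shards

# `pants test --shard=k/3 ::` keeps only the tests assigned to shard k (0-indexed)
assignments = {t: shard_of(t, 3) for t in ["tests/test_a.py", "tests/test_b.py"]}
```

Because the hash depends only on the test's identity, every host computes the same assignment without coordinating.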
I just read the GitHub PR; apparently that's not a typo.
w
Seems to match - 0-index
How are you using execution_slot_var right now? What's the problem you're running into? More precisely, are you running any of your own process logic?
g
I'm just using it so that I can spin up spark on a different port number so they don't conflict.
👍 1
w
I'm also guessing you've checked out whether pytest has any native fixture functionality that can help?
g
I haven't looked deeply into that yet.
This is what I'm doing
```python
import os
import re

# Base port for Spark UI
base_port = 4040
execution_slot = int(os.environ.get("PANTS_EXECUTION_SLOT", 0))

# Check if the USER environment variable matches the expected pattern 'agent-N'
user_name = os.environ.get("USER", "")
match = re.match(r"agent-(\d+)", user_name)
if match:
    agent_number = int(match.group(1))
    # Calculate port offset using both agent number and execution slot if pattern matches
    port_offset = ((agent_number - 1) * 100) + execution_slot
else:
    # Use a simpler scheme if no match, to still avoid conflicts with multiple execution slots
    port_offset = execution_slot

spark_ui_port = base_port + port_offset
```
w
I'm sure there's some sort of concurrency/batch math you could play with in pytest and pytest-xdist to get something like this working. Personally, this feels like the kind of thing I might try to do in code -- kinda like you've described above.
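The in-code cap described above could be prototyped inside the fixture itself. A minimal sketch using a pool of POSIX advisory file locks (the names, path, and cap here are assumptions for illustration, not Pants APIs):

```python
# Sketch: cap concurrent Spark fixtures across all test processes on one host
# with a pool of POSIX advisory file locks. MAX_SPARK and LOCK_DIR are
# assumptions, not Pants settings.
import fcntl
import os
import time
from contextlib import contextmanager

MAX_SPARK = 8                              # assumed per-host cap
LOCK_DIR = "/tmp/spark-fixture-locks"      # assumed shared, writable path

@contextmanager
def spark_slot(max_slots=MAX_SPARK, lock_dir=LOCK_DIR):
    """Block until one of `max_slots` lock files can be acquired, then yield
    the slot index (usable as a port offset, like PANTS_EXECUTION_SLOT)."""
    os.makedirs(lock_dir, exist_ok=True)
    while True:
        for slot in range(max_slots):
            fd = os.open(os.path.join(lock_dir, f"slot-{slot}.lock"),
                         os.O_CREAT | os.O_RDWR)
            try:
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                os.close(fd)   # slot taken by another process; try the next one
                continue
            try:
                yield slot     # caller spins up Spark bound to this slot's ports
                return
            finally:
                fcntl.flock(fd, fcntl.LOCK_UN)
                os.close(fd)
        time.sleep(0.5)        # all slots busy; wait and retry
```

Each test process grabs the first free slot or waits, so at most `max_slots` Spark fixtures run per host regardless of how many execution slots Pants hands out.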
g
I think you're right.
w
g
yes, playing with it all 🙂
I'm using small batches and grouping things with batch_compatibility_tag.
It's better for sure.
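The batching setup mentioned above might look like this in a BUILD file; batch_compatibility_tag is a real python_tests field, though the target name and tag value here are illustrative:

```python
# BUILD (sketch): tests sharing a batch_compatibility_tag may be run in the
# same pytest process, so one Spark fixture startup is amortized across the batch.
python_tests(
    name="spark_tests",
    batch_compatibility_tag="spark-fixture",  # illustrative tag value
)
```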
w
I've never used Spark -- so I'm speaking out of turn -- but is there a substantial overall time benefit for unit tests when adding more instances, given the cost of startup?
g
hell no, no benefit at all. That's why I'm batching.
s
Could you run two invocations of pants test? One for the Spark tests with --process-execution-local-parallelism and one for everything else?
g
Potentially with tags, but oof that would be a lot of work.
maybe I could use pants dependents for the fixture