# general
b
Hello! • Wondering if I could get some general advice about untangling a pants monorepo with a single large resolve • This is my first time using pants and interacting with the community, so please forgive any ignorance

Problem
• I've started work on a 3-year-old pants python v2.17.0 monorepo with only one resolve, called `main`
• The `main` resolve is very bloated
• When adding a new dependency or modifying source code that is a common dependency across many `python_sources`, we are seeing `pants test` run for 60min+ in github ci
◦ We've configured `pants test` to run with `--changed-dependents=transitive` in ci to test the full blast radius, erring on the side of caution

Strategy
• We could probably throw more money at ci, but we want to stop taking on tech debt and improve the design first
• Splitting the `main` resolve into many smaller resolves with better-written tests seems like an obvious strategy
• In practice that amount of work is daunting enough (3 yrs of debt) that it is not practical to prioritize above other work for our small team, so we need to do this incrementally
• If we use `parametrize` to incrementally share code through `dependencies` across resolves at the `python_source` level in BUILD files (see the sketch at the end of this message):
◦ The expensive unit tests are still triggered whenever we touch common BUILD files or need to modify common `python_sources` that are deep enough in the stack
◦ Unfortunately this happens often enough, because we have a large monorepo, that this approach is not practical until we finish streamlining things
• Our latest idea is to incrementally publish smaller resolves to an internal pypi as standalone `python_requirement` packages
◦ Once published to the internal repo, we can link packaged `python_distribution`s upstream as `python_requirements` whenever we need to share code
◦ This strategy should buy us the time to incrementally refactor our tangled-up code base by:
▪︎ Trading a small one-time cost of new pypi infra
▪︎ For slightly more complexity in decoupling and then managing our `python_source` deps outside of pants' automatic dep inference
◦ Once we take the time pressure off:
▪︎ We can make a more careful reassessment of which deps should be inferred from the file system and which should not, and adjust accordingly

Question to you
• Has anyone ever had to do something similar?
• Does our latest strategy make sense?
• Better yet, are we missing something simple, maybe in a newer version of pants, that could help us solve the problem in a more elegant way?
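For reference, the `parametrize` approach mentioned above would look roughly like this in a BUILD file (a minimal sketch; the resolve names are hypothetical):

```python
# BUILD: a sketch of sharing first-party code across two resolves.
# parametrize generates one copy of this target per resolve, so consumers
# in either resolve can depend on it (resolve names are made up here).
python_sources(
    name="common",
    resolve=parametrize("main", "libs"),
)
```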
h
Hi! Before I dive into each bullet point, one quick question: Are you concerned about CPU costs of CI, or elapsed time? (60m+). If the latter, have you tried sharding the tests so you can run them on multiple CI runners concurrently?
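The shape I have in mind is one job per shard in your CI matrix, something like this (a sketch; the env var names are placeholders your CI would supply):

```shell
# One of SHARD_COUNT CI jobs; each runner tests a disjoint slice of the
# same changed-dependents target set (SHARD_INDEX/SHARD_COUNT are made up).
pants \
  --changed-since=origin/main \
  --changed-dependents=transitive \
  test --shard="${SHARD_INDEX}/${SHARD_COUNT}"
```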
👋 1
b
@happy-kitchen-89482 Hello! Elapsed time, because waiting for the large number of tests to run is starting to affect productivity. We use some limited concurrency with xdist but haven't tried sharding; I will try this to see if it helps. A breakdown of one of our worst-case builds is outlined below. The process scales linearly for each `python_test` suite defined in the `main` resolve (150):
• Build requirements.pex from main.lock for every `python_test` suite - 13 min
• Build pytest_runner.pex for every `python_test` suite - 5 min
• Run pytest for every `python_test` suite - 40 min
◦ All 150 individual test suites run in under 30 s
• Other - 2 to 5 min
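(For completeness, the xdist concurrency is just the built-in option; a sketch, not our exact config:)

```toml
# pants.toml: run pytest-xdist inside each test process; Pants picks the
# worker count per run unless overridden on individual targets.
[pytest]
xdist_enabled = true
```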
h
Hmm, each shard will still do the requirements.pex building, unless you have a remote cache set up? Then you can build that in one preparatory job and share it across the sharded test runners
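Roughly this shape in pants.toml, assuming a REAPI-compatible cache service (the address below is a placeholder):

```toml
# pants.toml: a sketch of remote caching; the store address is hypothetical.
[GLOBAL]
remote_cache_read = true
remote_cache_write = true
remote_store_address = "grpcs://build-cache.internal.example:443"
```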
b
Thanks! No remote cache set up atm, but it's also something we've considered. It felt like upgrading pants and configuring a remote cache in ci would generally improve performance, but that it was complementary to also refactoring and streamlining the pants dag into multiple resolves.
h
Well, the thing is if lib B depends on lib A then they need compatible 3rd party requirements, which typically means being in the same resolve. So if all your code is interdependent, and you split resolves, you’d have to manually ensure that compatibility between resolves. Which is what all the lockfile subsetting done by Pants is for to begin with.
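For concreteness, split resolves are declared roughly like this (a sketch; the names and lockfile paths are hypothetical), and note that nothing here enforces compatibility across them:

```toml
# pants.toml: two resolves, each with its own lockfile. Keeping shared
# 3rd-party pins compatible between them becomes a manual job.
[python]
enable_resolves = true

[python.resolves]
main = "3rdparty/python/main.lock"
libs = "3rdparty/python/libs.lock"
```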
👍 1
The problem, I guess, is that just installing from the already-resolved lockfile takes like 13 minutes?
👍 1
b
The most tempting part of using dists and pip requirements is that we can cut the unit tests part, which is the most expensive part of ci and keeps growing with our code. Manually ensuring compatibility adds complexity but shouldn't be a challenge; we are already seeing similar pain in the main resolve. Splitting up and pinning deps through packages has other benefits too, helping us to refactor incrementally - lean on pip.
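Roughly what we have in mind per shared lib (a sketch; all names and the version are hypothetical):

```python
# BUILD: package a first-party lib as a wheel for our internal pypi;
# downstream projects would then pin it as a python_requirement.
python_distribution(
    name="lib-a-dist",
    dependencies=[":lib-a"],
    provides=python_artifact(
        name="acme-lib-a",
        version="0.1.0",
    ),
)
```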
h
Philosophically though, if B depends on A then whether you want to retest B when A changes shouldn’t really depend on whether B consumes A as first-party source or via a requirement?
b
That’s true - and maybe I’m being ignorant - but how could you define a python_source dependency in the same resolve between A and B and not have B tests run when A changes, while also passing `--changed-dependents=transitive`? We like the flag as an additional safeguard to give very high confidence in our builds. Maybe we are abusing it?
h
You couldn’t. My point is that I think you wouldn’t want to? You are using the flag correctly, and the issue seems to be that it’s too slow? But it does seem like the correct thing to do. So let’s see about speeding it up…
👍 1
b
Makes sense and thank you for the advice! I am experimenting with sharding and pytest xdist / batching. Will update with findings.
h
With sharding and batching on the Pants side you may not need pytest xdist?
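By batching I mean the Pants-side grouping of compatible tests into one pytest process; a sketch (the tag value is arbitrary):

```python
# BUILD: python_tests that share a batch_compatibility_tag may be grouped
# into a single pytest process (up to [test].batch_size files per batch),
# amortizing the per-suite pex startup cost.
python_tests(
    name="tests",
    batch_compatibility_tag="default",
)
```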
b
Update
• sharding alone did the job for us
• it linearly scaled the work across a pool, saving 80% of build time for both a single- and multi-resolve monorepo
• it also ended up reducing our billed compute by 15%
• thank you so much @happy-kitchen-89482!
h
Nice! Glad that worked out.
🍻 1