# general
l
Hi Pants Team! I've spent some time toiling over our codebase's tests, trying to migrate to using Pants as the test runner. I'm running into some issues that I'm not sure how to debug, and it's preventing us from fully adopting Pants for executing tests. We have a Python monorepo with:
• 6K+ Python files
• 8K+ automated tests
  ◦ on CI running in xdist, they take ~6 min to run across 20 shards
  ◦ on CI running in Pants, with pex files for every test, it takes ~40+ min across 20 shards
    ▪︎ using
pants --pants-config-files=pants.ci.toml test :: --test-shard=${TEST_SHARD}
When I try to run tests on just the changed files, Pants seems to take 5-10 minutes building the dependencies for all the tests, using
pants --pants-config-files=pants.ci.toml --changed-since=origin/master --changed-dependees=transitive test --test-shard=${TEST_SHARD}
When running in CI, we have
pantsd
and
watch_filesystem
set to false per guidance in this thread, so I know we're taking some performance hit, but it seems like a really significant hit compared to using just xdist. I'm not sure what we're doing wrong, but if you have any guidance on how to determine what's causing the long run times, that would be appreciated. Thanks! (cc @cold-soccer-63228)
w
hey! sorry for the trouble.
👋 1
what persistent caches are you using? see https://www.pantsbuild.org/docs/using-pants-in-ci for examples
while caches matter for skipping test runs and thirdparty artifact builds, they’re also important in larger repos to avoid paying the cost of re-extracting dependencies for each CI run
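For reference, the cache wiring that page describes boils down to pointing Pants' persistent stores at a location your CI cache can persist between runs. A minimal sketch, assuming GitLab CI and Pants 2.13-era option names (the paths are illustrative, not from this thread):

```toml
# pants.ci.toml -- write Pants' persistent stores under the repo so the
# CI runner's cache mechanism can save/restore them between pipelines.
[GLOBAL]
pantsd = false
watch_filesystem = false
local_store_dir = ".cache/pants/lmdb_store"     # process results / file store
named_caches_dir = ".cache/pants/named_caches"  # pip / pex resolver caches
```

Your CI config would then persist the `.cache/pants/` directory between runs (e.g. a `cache: paths:` entry in `.gitlab-ci.yml`).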
h
Also are you using lockfiles? If so, generated from Pex or Poetry?
w
two more questions: 1. which pants version? 2. did you use
tailor
, or did you create BUILD files and targets manually?
l
Hmm, we haven't set up caching actually 🤦, doing that now. • lockfiles - Yep, we're using them from Pex. • version - 2.13.0a0 • We're using
tailor
and then hand editing any BUILD files that are missing some deps (like fixture files)
👍 1
w
sounds good. yea, caching can be a tricky subject, but it's fairly important when using Pants. you can compare/contrast the difference between a few runs of
./pants --no-pantsd dependencies ::
locally and a run of
./pants --no-pantsd --no-local-cache dependencies ::
… the latter will have to run a few thousand tiny processes.
l
Got it. Let me play around with the caching tonight / in the AM and report back with any findings. 🤞
👍 1
w
one more question: do you use any pytest fixtures which are shared between files?
l
We used to... I removed them all recently to try and reduce those as dependencies.
👍 1
Hey there, Stu and Eric - I got the caching working in our GitLab CI pipeline, which helped reduce the dependency mapping time from about 10 minutes to 1.5-2 minutes depending on the run, which is great. E.g. logs:
$ ./pants --pants-config-files=pants.ci.toml --changed-since=origin/master --changed-dependees=transitive test --test-shard=${TEST_SHARD}
19:56:01.10 [INFO] Long running tasks:
  78.38s	Map all targets to their dependees
$ ./pants --pants-config-files=pants.ci.toml --changed-since=origin/master --changed-dependees=transitive test --test-shard=${TEST_SHARD}
20:11:53.26 [INFO] Long running tasks:
  73.10s	Map all targets to their dependees
20:12:23.48 [INFO] Long running tasks:
  103.34s	Map all targets to their dependees
20:12:53.81 [INFO] Long running tasks:
  133.68s	Map all targets to their dependees
👍 1
w
Nice. A few performance fixes went into
a1
, so you should see a small bump there.
l
Cool, will have to upgrade to that next.
w
After upgrading, if you have time to record a
py-spy
profile of one of your
--changed
runs, that would be helpful.
❤️ 1
Will send a link shortly
l
I do have a couple things I hope you can provide some guidance on still:
• I'm seeing really long test times, on the order of minutes for a single test file; not sure if it's a resource issue? We're provisioning a lot of CPU and memory for these tests right now to try and combat it, but it doesn't seem to help...
• Because we're sharding our tests, are there any "gotchas" with caching in CI?
w
I’m seeing really long test times, on the order of minutes for a test file, not sure if it’s a resource issue?
are you able to reproduce this in isolation on a single test? can repeatedly re-run a test with
test --force
to see how long it takes in a steady state
• Because we’re sharding our tests, are there any “gotchas” with caching in CI?
no: only that dependency computation will be repeated per-shard, which is why it is important to cache (the process executions)… but also for us (the Pants project) to optimize the heck out of the portion that cannot be cached. we know of one issue there that is on the 2.14.x docket, but which probably can’t be backported to 2.13.x
🙏 2
l
can repeatedly re-run a test with
test --force
to see how long it takes in a steady state
Okay, so after a bunch of test runs, the baseline for one of my test files is consistently 60-70s to run all the tests in it. Whereas, in CI, it's much longer for the same file (300-400s the last couple runs), so leads me to believe it's some resource contention issue with how many tests we're running in parallel.
👍 1
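If contention between concurrent test processes is the cause, one knob worth knowing about (a hedged suggestion, not something from this thread; the value shown is illustrative) is Pants' cap on concurrent local processes:

```toml
# pants.ci.toml -- by default Pants runs as many local processes as there
# are cores; capping it can help when test processes contend for memory/IO.
[GLOBAL]
process_execution_local_parallelism = 8  # illustrative; match your CI runner's CPU count
```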
w
A few performance fixes went into
a1
, so you should see a small bump there.
i lied: these had not actually been released yet. but they’ll go into
2.13.0rc0
tonight.
c
Maybe a silly question, but how does parallelization work in Pants? Is parallelization done at the test level (e.g. test functions) or at the file level (e.g. a single Python file containing tests)? If I run tests via Pants on a single huge file with a bunch of tests, will parallelization happen?
I don't think that issue would be that hard to fix, but unless you have long-tail/long-running processes, I'm not sure how much benefit you'd really get. As long as there are tests to run in parallel, the fact that each of them is not sub-file parallelized won't matter much.
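To make the file-level point concrete, here's a toy model (plain Python, not Pants internals; all names are illustrative) of why one huge test file serializes while the same tests split across files can run in parallel:

```python
# Toy illustration of file-level parallelism: each test *file* is one unit
# of work, so the tests inside a single file run serially within it.
from concurrent.futures import ThreadPoolExecutor


def run_test_file(path, num_tests, secs_per_test=1):
    """Pretend to run every test in one file serially; return its total cost."""
    return num_tests * secs_per_test


def total_wall_time(files, workers=4):
    """Model wall time when files run in parallel across `workers` slots."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        costs = list(pool.map(lambda f: run_test_file(*f), files))
    # With enough workers, wall time is bounded by the slowest single file,
    # not by the total number of tests.
    return max(costs)


# 40 tests in one file: ~40s of modeled wall time even with 4 workers.
one_big_file = [("tests/test_everything.py", 40)]
# The same 40 tests split across 4 files: ~10s of modeled wall time.
four_files = [(f"tests/test_part{i}.py", 10) for i in range(4)]
```

So splitting a huge test file into several smaller ones is the practical way to regain parallelism under a per-file runner.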