Tangentially related is there a good reason format process p Pants #development

bitter-ability-32190

12/31/2021, 2:00 AM

Tangentially related, is there a good reason format-process-per-file isn't the default? Since formatting doesn't care about dependencies, it should almost always be the faster choice on a multi-core machine.

hundreds-father-404

12/31/2021, 2:01 AM

Yes, it was super slow when I benchmarked in 2020. Although that was before excellent performance improvements John made to get rid of overhead with PEX. So it'd be worth re-benchmarking! https://docs.google.com/document/d/1Tdof6jx9aVaOGeIQeI9Gn-8x7nd6LfjnJ0QrbLFBbgc/edit#heading=h.oyst624c2sph

hundreds-father-404

12/31/2021, 2:02 AM

Another issue is that the UX of --per-file-caching is obnoxious on cold runs, soooo much verbose output

bitter-ability-32190

12/31/2021, 2:03 AM

That's actually why my default level is warning 🤣

bitter-ability-32190

12/31/2021, 2:04 AM

But that messes with other systems (like stats log and show target labels)

hundreds-father-404

12/31/2021, 2:04 AM

Ah, how about using --log-levels-by-target for that? (Thanks again for your fix!!)

hundreds-father-404

12/31/2021, 2:04 AM

I guess we need to finish releasing that fix, oops. We keep slipping in more fixes

bitter-ability-32190

12/31/2021, 2:04 AM

I guess I have it inverted. Warning by default and whitelist info

bitter-ability-32190

12/31/2021, 2:05 AM

But yeah I'm using log levels per target for the whitelisting

happy-kitchen-89482

01/01/2022, 5:41 AM

Another area where that obnoxious UI, plus performance, could be improved by "process grouping": dep inference

👆 1

bitter-ability-32190

01/01/2022, 1:34 PM

I think the prodash UI cleans things up nicely, FWIW

bitter-ability-32190

01/01/2022, 1:37 PM

And yeah I wonder if there's another possible partitioning scheme of (pick a magic constant) 1.5-2-ish processes-per-core total (running on the unlinted files). The multiplier ensures one process doesn't become a tentpole. And over a sufficient number of runs, it acts much like per-file process, as the cache fills up with chunks of files

bitter-ability-32190

01/01/2022, 1:38 PM

And doesn't require synthetic results
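
A rough sketch of the partitioning idea above, for illustration only. It assumes the caller already knows which files still need formatting and just wants them split into about 1.5x as many batches as there are cores. The names partition_files, cores and multiplier are hypothetical and are not part of Pants.

```python
import math
import os
from typing import List, Optional, Sequence


def partition_files(
    files: Sequence[str],
    cores: Optional[int] = None,
    multiplier: float = 1.5,
) -> List[List[str]]:
    """Split files into at most ceil(cores * multiplier) batches.

    Hypothetical helper: one formatter process would run per batch.
    """
    if not files:
        return []
    if cores is None:
        cores = os.cpu_count() or 1
    # Target more batches than cores so one slow batch is less likely
    # to become the tentpole, but never more batches than files.
    target = min(len(files), max(1, math.ceil(cores * multiplier)))
    # Sort so the same input always yields the same batches.
    ordered = sorted(files)
    batch_size = math.ceil(len(ordered) / target)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


if __name__ == "__main__":
    pending = [f"f{i}.py" for i in range(10)]
    for batch in partition_files(pending, cores=4, multiplier=1.5):
        print(batch)
```

With 10 files on 4 cores and a multiplier of 1.5 the target is 6 batches; rounding the batch size up to 2 files gives 5 batches in practice. Whether batches like these would cache well depends on how the batch contents shift between runs, which this sketch does not model.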

bitter-ability-32190

01/05/2022, 3:00 PM

So coming back to this, and on topic... In the cold case, doing fmt-process-per-file is always faster. In the hot case, doing fmt-process-per-file is actually slower, with the time increasing as the # of files goes up (which makes sense)

bitter-ability-32190

01/05/2022, 3:57 PM

Screenshot from 2022-01-05 09-56-47.png

bitter-ability-32190

01/05/2022, 4:04 PM

As a baseline, running yapf and isort directly on the repo (with parallelization) takes roughly 22 seconds

bitter-ability-32190

01/05/2022, 4:29 PM

So I guess the takeaway here is if you're using --changed-since it might not matter that much (until batching or similar is implemented). Probably go with all-in-one. Otherwise, if you're running on the world (perhaps in CI) then process-per-file is a good idea
