# development
b
Tangentially related, is there a good reason format-process-per-file isn't the default? Since formatting doesn't care about dependencies, it should almost always be the faster choice on a multi-core machine
h
Yes, it was super slow when I benchmarked in 2020. Although that was before excellent performance improvements John made to get rid of overhead with PEX. So it'd be worth re-benchmarking! https://docs.google.com/document/d/1Tdof6jx9aVaOGeIQeI9Gn-8x7nd6LfjnJ0QrbLFBbgc/edit#heading=h.oyst624c2sph
Another issue is that the UX of --per-file-caching is obnoxious on cold runs, soooo much verbose output
b
That's actually why my default level is warning 🤣
But that messes with other systems (like stats log and show target labels)
h
Ah, how about using --log-levels-per-target for that? (Thanks again for your fix!!)
I guess we need to finish releasing that fix, oops. We keep slipping in more fixes
b
I guess I have it inverted. Warning by default and whitelist info
But yeah I'm using log levels per target for the whitelisting
h
Another area where that obnoxious UI, plus performance, could be improved by "process grouping": dep inference
b
I think the prodash UI cleans things up nicely, FWIW
And yeah, I wonder if there's another possible partitioning scheme: (pick a magic constant) 1.5-2-ish processes per core in total, run over the unlinted files. The multiplier ensures one process doesn't become a tentpole. And over a sufficient number of runs, it acts much like process-per-file, as the cache fills up with chunks of files
And doesn't require synthetic results
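To make the partitioning idea above concrete, here is a rough sketch in Python. It is not how Pants does (or would do) batching; `partition_files`, `processes_per_core`, and the sort-then-slice strategy are all made up for illustration. It just splits the not-yet-formatted files into about 1.5x-cores contiguous chunks, so no single process ends up holding most of the work.

```python
import math
import os
from typing import List, Optional, Sequence


def partition_files(
    files: Sequence[str],
    cores: Optional[int] = None,
    processes_per_core: float = 1.5,  # the "magic constant"; 1.5-2 was the suggestion
) -> List[List[str]]:
    """Split files into roughly cores * processes_per_core batches.

    Hypothetical helper: each returned batch would be handed to one
    formatter process. Batch sizes differ by at most one file.
    """
    if not files:
        return []
    cores = cores or os.cpu_count() or 1
    # Never create more batches than there are files.
    num_batches = min(len(files), max(1, math.ceil(cores * processes_per_core)))
    # Sort so the same input set always yields the same batches.
    ordered = sorted(files)
    base, extra = divmod(len(ordered), num_batches)
    batches: List[List[str]] = []
    start = 0
    for i in range(num_batches):
        # The first `extra` batches take one additional file.
        size = base + (1 if i < extra else 0)
        batches.append(ordered[start:start + size])
        start += size
    return batches


if __name__ == "__main__":
    example = [f"src/module_{i}.py" for i in range(10)]
    for batch in partition_files(example, cores=4):
        print(batch)
```

Sorting keeps the batches deterministic for a given set of files, which is what would let a cache see the same chunk twice. Whether chunks actually repeat often enough across runs to behave like process-per-file is the open question from the message above, and this sketch doesn't answer it.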
So coming back to this, and on topic... In the cold case, doing fmt-process-per-file is always faster. In the hot case, doing fmt-process-per-file is actually slower, and the run time grows as the # of files goes up (which makes sense)
Screenshot from 2022-01-05 09-56-47.png
As a baseline, running yapf and isort directly on the repo (with parallelization) takes roughly 22 seconds
So I guess the takeaway here is: if you're using --changed-since, it might not matter that much (until batching or similar is implemented), so probably go with all-in-one. Otherwise, if you're running on the world (perhaps in CI), then process-per-file is a good idea