# general
w
Hi, on a high level, is there any difference between running
./pants fmt2
vs running
black
directly on CLI for adhoc files? If yes, does the pants invocation pick up additional config?
h
Some thoughts I sent a user a few days ago:
Some benefits imo of running with Pants:
* Fewer commands for your coworkers to remember; they only need to do ./pants fmt :: and ./pants lint ::. They don’t need to remember the differences between, say, isort --check-only vs. black --check.
* Can have Pants only run on changed files via ./pants --changed-*. V2 will also cache the results of unchanged files and avoid re-running over those.
* Can use remote execution and remote caching, so you won’t need to rerun on files if your coworker ran on the same ones without any changes
* For Flake8, Pants will resolve the interpreter version specific to that target. Meaning, if you have some Python 2 only targets and some Python 3 only targets, Flake8 will use the correct Python version for each target. (Flake8's output depends upon which interpreter was used)
And a downside at the moment:
* Right now, the V2 linters have worse performance than we’d like. We’re actively working to address this
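The "cache the results of unchanged files" point can be pictured with a small sketch. This is not how Pants implements it; it is a minimal illustration assuming results are keyed by a hash of the tool name plus the file contents. The names `content_key`, `ResultCache`, and `fake_linter` are hypothetical and exist only for this example.

```python
import hashlib
from typing import Callable, Dict


def content_key(tool: str, source: str) -> str:
    """Hypothetical cache key: a digest of the tool name and the file contents."""
    return hashlib.sha256(f"{tool}\0{source}".encode("utf-8")).hexdigest()


class ResultCache:
    """Re-runs a tool only when the (tool, contents) pair has not been seen before."""

    def __init__(self, run_tool: Callable[[str, str], str]) -> None:
        self._run_tool = run_tool
        self._results: Dict[str, str] = {}
        self.misses = 0  # number of times the tool actually had to run

    def lint(self, tool: str, source: str) -> str:
        key = content_key(tool, source)
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._run_tool(tool, source)
        return self._results[key]


def fake_linter(tool: str, source: str) -> str:
    """Stand-in for invoking a real linter; assumed for illustration only."""
    return f"{tool}: checked {len(source.splitlines())} line(s)"


cache = ResultCache(fake_linter)
cache.lint("black", "x = 1\n")   # first run: the tool executes
cache.lint("black", "x = 1\n")   # unchanged contents: served from the cache
cache.lint("black", "x = 2\n")   # contents changed: the tool executes again
```

Because the key depends only on the tool and the contents, a coworker's run over identical files would produce the same key, which is the property a shared remote cache relies on.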
a
actively working where?
h
This is the proposal and https://github.com/pantsbuild/pants/pull/9185 is the next PR up for review. (I’d love a review if you have a moment today or tomorrow!)
With the short term proposal, Pants likely still won’t have as good performance as straight Black. With the medium term proposal, it probably won’t be as good as Black in particular because Black has its own cache, but overall we expect the V2 linters to be faster than running directly because 5/6 of the tools don’t have any caching mechanisms.
a
why aren't we using the black cache?
this is relevant for pex too
w
That’s very helpful! Thank you. I was mostly playing with it while trying to get over some merge conflicts per instructions on https://github.com/pantsbuild/pants/commit/29cf9fc8b25a540f32e4a9e0aeeaaa6ea4dcc48f 🙂
❤️ 1
h
why aren’t we using the black cache?
Possibly, we will want to use the Black cache one day. That said, I think Pierre found that it contains system-specific information, such as which OS you ran on, which means it would not be safe for remote execution. For now, we’re focused on improving performance for all 6 Python linters rather than on the 1 tool that already has a cache.
The other tool where having its own cache matters a lot is MyPy. MyPy is an order of magnitude faster when its cache is warm.
a
i wasn't thinking we'd use the black cache with remote execution
h
i wasn’t thinking we’d use the black cache with remote execution
Possibly, we won’t in the future. But that would involve a whole new layer of infrastructure for the engine to say “I want to be able to run this with RBE, but don’t capture the cache output files iff it’s RBE”. Certainly possible, but not where Toolchain plans to invest its time for now. For now, the focus is https://docs.google.com/document/d/1Tdof6jx9aVaOGeIQeI9Gn-8x7nd6LfjnJ0QrbLFBbgc/edit#heading=h.oyst624c2sph
a
Certainly possible, but not where Toolchain plans to invest its time for now.
if black and mypy are significantly faster with it, then it's not clear what focusing on 100% remote execution gets us. twitter had to shutter a large project last year because of the desire to make everything work 100% with remote execution and refusing to listen to the facts on the ground. just a warning.
👍 1
h
Yeah, I think the win is caching, not necessarily remoting. But in no cases should we say "performance will only be acceptable if you have a buildfarm"...
👍 1
a
i think the black and mypy caches could be incorporated as well as the normal process execution caching to improve performance over remoting
h
i think the black and mypy caches could be incorporated as well as the normal process execution caching to improve performance over remoting
Very likely, yes. We’re only saying that it’s not Toolchain’s current focus to leverage their pre-existing caches but it is very plausibly a good future optimization. And +1 to Benjy’s point about the importance of performance for local execution
a
what performance goal are you aiming to achieve?
f
Hi, on a high level, is there any difference between running
./pants fmt2
vs running
black
directly on CLI for adhoc files?
One thought here, in addition to all that’s been said: I don’t think there should be a stigma for running black manually: if you happen to have the tool installed and you’re comfy running
./black my-directory
, that’s totally OK too. It will do the exact same thing as
./pants fmt2
since we don’t pick up extra configs. In particular, I think that setting up the black plugin from your favourite text editor and enabling format on save, without any hackery to make your editor aware of pants, is a great use case for “raw black usage” 🙂
👍 2
h
what performance goal are you aiming to achieve?
Not any particular number. Right now, running
./pants fmt2 ::
on a cold cache is 5-30x slower than the tools directly. We want to get that number closer to 1-2x slower (for now). With the medium term proposal in that doc, we’re okay with slightly slower on a completely cold cache, but want it to be faster to use Pants than the direct tools when the cache is even only a little warm.
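To make the "faster when the cache is a little warm" goal concrete, here is a rough back-of-the-envelope model. It is only a sketch under loud assumptions: cache hits cost nothing, every uncached file pays the full cold-cache slowdown, and the direct tool has no cache of its own (which, as noted above, is not true for Black or MyPy). The function names are hypothetical.

```python
def relative_time(cold_slowdown: float, hit_rate: float) -> float:
    """Run time relative to the direct tool (1.0 means the same speed).

    cold_slowdown: how many times slower an uncached run is than the direct tool.
    hit_rate: fraction of files served from the cache, between 0.0 and 1.0.
    Assumes cache hits are free, which is an idealization.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return cold_slowdown * (1.0 - hit_rate)


def break_even_hit_rate(cold_slowdown: float) -> float:
    """Smallest hit rate at which the cached run matches the direct tool."""
    if cold_slowdown <= 1.0:
        return 0.0  # already at least as fast with a completely cold cache
    return 1.0 - 1.0 / cold_slowdown


# Compare the slowdown figures mentioned in the thread.
for slowdown in (2.0, 5.0, 30.0):
    print(f"{slowdown:>4.0f}x cold slowdown -> break-even hit rate "
          f"{break_even_hit_rate(slowdown):.2f}")
```

Under this simplified model, a 2x cold slowdown breaks even once half the files are cached, while a 30x slowdown needs nearly everything cached, which is one way to read why bringing the cold-cache number down matters.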