Would it make sense to extend `per_file_caching` ...
# plugins
b
Would it make sense to extend the `per_file_caching` option to each linter itself? I'm thinking some linters are faster than others (`pylint` traverses imports and does crazy inference, whereas `black` does very little work). The tipping point here would be something like: `pylint` is faster to read from the cache than to re-run, but for `black` the difference might be in the noise
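Something like this, purely as a sketch (these per-tool flags don't exist today, just illustrating the shape of the idea):
Copy code
# Today: one global switch that applies to every linter
./pants --lint-per-file-caching lint ::

# Hypothetical per-tool overrides (NOT real flags, just the proposal):
./pants --pylint-per-file-caching --no-black-per-file-caching lint ::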
h
Definitely reasonable!! It would override the global default. It will require changes to the Plugin API, but that may be worth it. Would you have a chance to benchmark this idea to see if it's actually worth doing, and then file a feature request if so? Maybe use something like hyperfine to have Pants run Black in isolation, followed by Pylint in isolation. See https://www.pantsbuild.org/v2.8/docs/contributions-debugging for how we benchmark Pants
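Roughly something like this (just a sketch; swap in the skip flags for whichever linters your repo actually enables):
Copy code
# Pylint in isolation: batched vs. per-file caching
hyperfine --warmup 1 \
  './pants --black-skip --isort-skip lint ::' \
  './pants --black-skip --isort-skip --lint-per-file-caching lint ::'

# Black in isolation, same comparison
hyperfine --warmup 1 \
  './pants --pylint-skip --isort-skip lint ::' \
  './pants --pylint-skip --isort-skip --lint-per-file-caching lint ::'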
🙌 1
b
Sure thing!
I'm almost afraid to run `--lint-per-file-caching` on `pylint` in our repo 😂
I'll report back... next year
h
Haha, I hear you there. I recommend running over a subset with `./pants lint dir::` or `dir:`
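(In case it's not obvious: `dir:` only hits targets defined directly in that directory, while `dir::` recurses into everything beneath it. The path here is just a placeholder:)
Copy code
./pants lint src/mypkg:    # targets in src/mypkg only
./pants lint src/mypkg::   # src/mypkg and every subdirectory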
b
OK, so I'll need to check my understanding on this one. Obviously `isort` (we don't use `black`) gets less performant, but `pylint` did as well, by a long shot... `pylint` went from ~15s one-process to ~72s per-file. Does Pants still spin up the subprocess if the result should be cached?
h
> Does Pants still spin up the subprocess if the result should be cached?
This is where you will want to test both cold cache and warm cache, using the techniques from the Pants debugging page I sent. Cold cache is always going to be slower with per-file caching (one process per file). But the question is how much worse, and whether the warm cache is enough faster to make it worth it.
To simulate the warm cache, you'd want to do something like change 1 file out of 50, and still use pantsd and caching. Whereas for the cold cache, you should disable both.
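Concretely, something along these lines (the path is a placeholder, and double-check that `--no-local-cache` exists in your Pants version):
Copy code
# Cold cache: no daemon, no local cache, so every process really runs
# (assumes --no-local-cache is available in this Pants version)
hyperfine './pants --no-pantsd --no-local-cache --lint-per-file-caching lint dir::'

# Warm cache: keep pantsd and caching on; dirty one placeholder file between timed runs
hyperfine --warmup 1 --prepare 'touch dir/some_module.py' \
  './pants --lint-per-file-caching lint dir::'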
b
So to start, I am only testing with a warm cache, and saw the above results 🤔
No files changed, so ideally the warmest possible cache
I guess with the cache, I'd expect that after an initial run with either option, the run would be blazingly fast with no edits 🤔
h
It should be! It's not caching the result? If so, that's a bug
b
OK, so one thing I noticed was that stdout was being blasted by `11:51:08.36 [INFO] Completed: Lint using Pylint - Pylint succeeded.`, so I'm going to try again with `-lwarn`. I also noticed I was running with `./pants_from_sources`. Perhaps that does something different than just `./pants`
No, the timing was right:
Copy code
joshuacannon@CEPHANDRIUS:~/work/techlabs$ time ./pants -lwarn --isort-skip --yapf-skip --lint-per-file-caching lint ::
...
real    1m14.204s
user    0m0.398s
sys     0m0.058s
h
`./pants_from_sources` doesn't use pantsd by default
😅 1
b
Something must be funky with my setup 🤔
Although, I'd hope that even without the daemon, the disk cache would make this fast, but perhaps the 70-ish seconds is mostly reading the cache?
h
It definitely should not be. To confirm: `./pants --no-pantsd lint path/to/f.py` is not caching from disk when you run it three times in a row, regardless of `--per-file-caching`? I can't reproduce that, so I'm trying to figure out what's up
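i.e. roughly:
Copy code
# With a working disk cache, runs 2 and 3 should be dramatically
# faster than run 1, even without pantsd.
for i in 1 2 3; do time ./pants --no-pantsd lint path/to/f.py; done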
b
It is:
Copy code
12:26:59.55 [INFO] Counters:
  local_cache_read_errors: 0
  local_cache_requests: 36
  local_cache_requests_cached: 27
  local_cache_requests_uncached: 9
  local_cache_total_time_saved_ms: 145684
  local_cache_write_errors: 0
  local_execution_requests: 9
  local_process_total_time_run_ms: 952
I guess if there were a good way to see where the time was being spent by `pants`, this'd be a lot easier to reason about 😕
(NOTE: That info is from a run on a subset of our monorepo)
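(One thing I do notice in those counters: `local_process_total_time_run_ms` is under a second, so almost none of the wall clock is the linters actually running. Something like this at least separates process time from everything else, assuming `--stats-log` is what's emitting those counters in this version:)
Copy code
# Print counters at the end of the run: process time vs. cache hits
./pants --stats-log --no-pantsd lint dir::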
(I'm about to jump off for the day, we can pick this up tomorrow)
👋 1