#development

bitter-ability-32190

10/21/2021, 5:30 PM
FYI @witty-crayon-22786 @chilly-magazine-21545 I just saw Issue 10864: Improve MyPy performance, which talks about MyPy's persistent cache and the challenges it creates. I'm volunteering some personal time to implement something similar for `astroid` (issue link), which powers `pylint`. The overall approach isn't dissimilar from `mypy`'s initial implementation (per-module JSON stored in a cache dir; look at byte-length+mtime to know whether to re-parse). Happy to take future requests to make it `pants`-compatible
👀 1
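A minimal sketch of the byte-length+mtime check described above. This is not astroid's or mypy's actual code: `load_or_parse`, the `parse` callback, and the one-JSON-file-per-module layout are all hypothetical names chosen for illustration.

```python
import json
from pathlib import Path


def _fingerprint(source: Path) -> dict:
    # Cheap staleness check: size and mtime only, no content hashing.
    st = source.stat()
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns}


def load_or_parse(source: Path, cache_dir: Path, parse):
    """Return (data, cache_hit).

    `parse` is a hypothetical callback that takes source text and returns
    JSON-serializable data (standing in for a serialized AST).
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Simplification: keyed by file stem, so same-named modules in different
    # packages would collide. A real cache would need a fuller key.
    cache_file = cache_dir / (source.stem + ".json")
    fingerprint = _fingerprint(source)

    if cache_file.exists():
        entry = json.loads(cache_file.read_text())
        if entry["fingerprint"] == fingerprint:
            return entry["data"], True  # unchanged: skip re-parsing

    data = parse(source.read_text())
    cache_file.write_text(json.dumps({"fingerprint": fingerprint, "data": data}))
    return data, False
```

The first call for a file parses and writes the entry; later calls return the cached data until the file's size or mtime changes.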

witty-crayon-22786

10/21/2021, 5:43 PM
nice!
we’ve come a long way on deciding how much flexibility to allow for caches, so i expect that we can make something work there.
even without its caches, mypy is a lot faster than pylint though, so it hasn’t been as much of a priority

bitter-ability-32190

10/21/2021, 5:46 PM
I currently have a very ugly WIP PR which handles the transformation from Python object to JSON: https://github.com/PyCQA/astroid/pull/1194 No code for the actual persisting logic, but that's easy enough to write (I have it offline) and it'll likely be in a follow-up PR
I suppose that's what happens when you have Guido-level visibility/development on a project 😂

witty-crayon-22786

10/21/2021, 5:47 PM
😃
the big question around mypy (and probably pylint) is whether the cache needs to be per-repository, or global. @enough-analyst-54434’s comment at the end of the thread about enabling the cache being global would make things easier for us.

bitter-ability-32190

10/21/2021, 5:48 PM
Also FWIW `astroid`, love-it-or-hate-it, has a harder job, as it tries to infer a shitton of info operating only on the AST. `mypy` AFAIK does actual importing so it has to infer much less
IIUC the cache key is absolute paths, so "global"

witty-crayon-22786

10/21/2021, 5:55 PM
yea. the issue (which I had mostly forgotten: quite a thread) is that Pants runs things in sandboxes
so the absolute path would end up being a sandbox path, and would then need fixing.
John’s comment about switching to digests avoids that issue. but unclear how much work it would be upstream.
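To illustrate why a digest key sidesteps the sandbox problem, here is a rough sketch. It assumes nothing about Pants internals or mypy's cache: `content_digest` and `path_key` are hypothetical helpers, and the two directories in the usage note just stand in for two sandboxes holding copies of the same file.

```python
import hashlib
from pathlib import Path


def content_digest(source: Path) -> str:
    # Key derived only from file content, so it is the same wherever the
    # file happens to be copied.
    return hashlib.sha256(source.read_bytes()).hexdigest()


def path_key(source: Path) -> str:
    # Key derived from the absolute path: changes with every new sandbox dir.
    return str(source.resolve())
```

Copying `mod.py` into two different temporary directories gives two different `path_key` values but one shared `content_digest`, so only the digest-keyed entry would be reusable across runs.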

bitter-ability-32190

10/21/2021, 5:56 PM
I'd be surprised if it became that much work, honestly.... but I've been surprised before 🙂
Do you have a doc page on Pants' sandboxing? I'm familiar with Bazel's but not Pants'

witty-crayon-22786

10/21/2021, 5:58 PM
the actual implementation isn’t very well described, but the public API is: https://www.pantsbuild.org/docs/rules-api-process
inputs and outputs are specified using digests (with the same underlying data structures as for Bazel, because Pants supports remote execution with the REAPI)

bitter-ability-32190

10/21/2021, 6:05 PM
Are the paths inside the temp folder copied or linked? Could the links be resolved by the underlying app (`mypy`/`pylint`)?

witty-crayon-22786

10/21/2021, 6:05 PM
copied

bitter-ability-32190

10/21/2021, 6:13 PM
Ah, yes. Digest would be a hard requirement then

witty-crayon-22786

10/21/2021, 6:24 PM
i expect that mypy’s sqlite schema could be adapted pretty easily (behind a flag even)… and yea, digest keyed would probably be a good first place to go for pylint
one challenge that i haven’t thought through is how dependencies are accounted for in that cache: it’s possible that mypy is relying on timestamps to bump cache entries when the dependencies have changed (despite their direct content being identical). and that complicates things cc @enough-analyst-54434
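One possible way to account for dependencies without timestamps is to fold each dependency's key into the module's own key. The sketch below is an assumption for illustration only, not how mypy or pylint behave: `transitive_key`, the `sources` mapping, and the `deps` mapping are hypothetical, and the dependency graph is assumed to be acyclic (real Python imports can be cyclic).

```python
import hashlib


def transitive_key(module, sources, deps, _memo=None):
    """Digest of a module's content plus the keys of everything it depends on.

    sources: module name -> source bytes (hypothetical input)
    deps:    module name -> list of direct dependency names (assumed acyclic)
    """
    if _memo is None:
        _memo = {}
    if module in _memo:
        return _memo[module]

    h = hashlib.sha256()
    h.update(sources[module])
    for dep in sorted(deps.get(module, ())):  # sorted for a stable key
        h.update(b"\0")
        h.update(dep.encode())
        h.update(b"\0")
        h.update(transitive_key(dep, sources, deps, _memo).encode())

    _memo[module] = h.hexdigest()
    return _memo[module]
```

With this scheme, editing a dependency changes the key of every module that imports it, even when that module's own content is byte-for-byte identical.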

bitter-ability-32190

10/21/2021, 6:48 PM
I believe that is true
FWIW I chose path+mtime (and I suspect mypy did as well) because that's what Python uses for pyc files
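For reference, CPython's default timestamp-based `.pyc` files (Python 3.7+) record the source mtime and size in a 16-byte header, and PEP 552 added an optional hash-based mode. A small sketch that reads the header back; `read_pyc_header` and `compile_and_inspect` are hypothetical helper names, and timestamp invalidation is requested explicitly so the result does not depend on the environment.

```python
import py_compile
import struct
from pathlib import Path


def read_pyc_header(pyc_path):
    # Header layout: 4-byte magic, then three little-endian uint32 fields:
    # flags, source mtime, source size (for timestamp-based pycs).
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    flags, mtime, size = struct.unpack("<III", header[4:16])
    return magic, flags, mtime, size


def compile_and_inspect(source: Path):
    # Force timestamp invalidation; the default can switch to hash-based
    # when SOURCE_DATE_EPOCH is set in the environment.
    pyc = py_compile.compile(
        str(source),
        cfile=str(source.with_suffix(".pyc")),
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    return read_pyc_header(pyc)
```

The mtime and size fields are each truncated to 32 bits, and the interpreter compares them against the source file's stat to decide whether to recompile.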