Pants #development

bitter-ability-32190

10/21/2021, 5:30 PM

FYI @witty-crayon-22786 @chilly-magazine-21545 I just saw Issue 10864: Improve MyPy performance, which talks about MyPy's persistent cache and the challenges it causes. I'm volunteering some personal time to implement something similar for `astroid` (issue link), which powers `pylint`. The overall approach isn't dissimilar from `mypy`'s initial implementation: per-module JSON stored in a cache dir, with byte-length+mtime checked to decide whether to re-parse. Happy to take future requests to make it `pants`-compatible.

👀 1
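
A minimal sketch of the per-module JSON cache described in the message above, assuming a size+mtime freshness check. The function names, the JSON layout, and the `parse` callback are hypothetical and are not taken from `astroid` or `mypy`.

```python
import json
from pathlib import Path
from typing import Any, Callable, Tuple


def cache_file_for(cache_dir: Path, module_name: str) -> Path:
    # One JSON file per module (hypothetical layout).
    return cache_dir / f"{module_name}.json"


def load_or_parse(
    source: Path,
    module_name: str,
    cache_dir: Path,
    parse: Callable[[str], Any],
) -> Tuple[Any, bool]:
    """Return (data, cache_hit). `parse` must return JSON-serializable data."""
    stat = source.stat()
    entry_file = cache_file_for(cache_dir, module_name)

    if entry_file.exists():
        entry = json.loads(entry_file.read_text())
        # Reuse the entry only if both byte length and mtime still match.
        if entry["size"] == stat.st_size and entry["mtime_ns"] == stat.st_mtime_ns:
            return entry["data"], True

    # Cache miss or stale entry: re-parse and persist the result.
    data = parse(source.read_text())
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry_file.write_text(
        json.dumps({"size": stat.st_size, "mtime_ns": stat.st_mtime_ns, "data": data})
    )
    return data, False
```

The check never reads the source file on a hit, which is what makes it cheap, but it also means the entry is tied to file metadata rather than to the content itself.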

witty-crayon-22786

10/21/2021, 5:43 PM

nice!

witty-crayon-22786

10/21/2021, 5:44 PM

we’ve come a long way on deciding how much flexibility to allow for caches, so i expect that we can make something work there.

witty-crayon-22786

10/21/2021, 5:45 PM

even without its caches, mypy is a lot faster than pylint though, so it hasn’t been as much of a priority

bitter-ability-32190

10/21/2021, 5:46 PM

I currently have a very ugly WIP PR which handles the transformation from Python object to JSON: https://github.com/PyCQA/astroid/pull/1194 No code for the actual persisting logic, but that's easy enough to write (I have it offline) and it'll likely be in a follow-up PR

bitter-ability-32190

10/21/2021, 5:47 PM

I suppose that's what happens when you have Guido-level visibility/development on a project 😂

witty-crayon-22786

10/21/2021, 5:47 PM

😃

witty-crayon-22786

10/21/2021, 5:48 PM

the big question around mypy (and probably pylint) is whether the cache needs to be per-repository, or global. @enough-analyst-54434’s comment at the end of the thread about enabling the cache being global would make things easier for us.

bitter-ability-32190

10/21/2021, 5:48 PM

Also FWIW `astroid`, love it or hate it, has a harder job, as it tries to infer a shitton of info operating only on the AST. `mypy` AFAIK does actual importing so it has to infer much less

bitter-ability-32190

10/21/2021, 5:50 PM

IIUC the cache key is absolute paths, so "global"

witty-crayon-22786

10/21/2021, 5:55 PM

yea. the issue (which I had mostly forgotten: quite a thread) is that Pants runs things in sandboxes

witty-crayon-22786

10/21/2021, 5:55 PM

so the absolute path would end up being a sandbox path, and would then need fixing.

witty-crayon-22786

10/21/2021, 5:56 PM

John’s comment about switching to digests avoids that issue. but unclear how much work it would be upstream.

bitter-ability-32190

10/21/2021, 5:56 PM

I'd be surprised if it became that much work, honestly.... but I've been surprised before 🙂

bitter-ability-32190

10/21/2021, 5:57 PM

Do you have a doc page on Pants' sandboxing? I'm familiar with Bazel's but not Pants'

witty-crayon-22786

10/21/2021, 5:58 PM

the actual implementation isn’t very well described, but the public API is: https://www.pantsbuild.org/docs/rules-api-process

witty-crayon-22786

10/21/2021, 5:59 PM

inputs and outputs are specified using digests (with the same underlying data structures as for Bazel, because Pants supports remote execution with the REAPI)

bitter-ability-32190

10/21/2021, 6:05 PM

Are the paths inside the temp folder copied or linked? Could the links be resolved by the underlying app (`mypy`/`pylint`)?

witty-crayon-22786

10/21/2021, 6:05 PM

copied

bitter-ability-32190

10/21/2021, 6:13 PM

Ah, yes. Digest would be a hard requirement then
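
To illustrate why copied sandbox files push toward digests, here is a small sketch comparing a path-based key with a content-based one. Both helper names are hypothetical, and SHA-256 is an assumption chosen for the example, not something the thread specifies.

```python
import hashlib
import tempfile
from pathlib import Path


def path_key(source: Path) -> str:
    # Key by absolute path: changes whenever the file is copied elsewhere.
    return str(source.resolve())


def digest_key(source: Path) -> str:
    # Key by content: identical bytes give the same key in any directory.
    return hashlib.sha256(source.read_bytes()).hexdigest()


if __name__ == "__main__":
    # Two temp dirs stand in for two separate sandboxes holding the same file.
    first = Path(tempfile.mkdtemp()) / "mod.py"
    second = Path(tempfile.mkdtemp()) / "mod.py"
    first.write_text("x = 1\n")
    second.write_text("x = 1\n")

    print(path_key(first) == path_key(second))      # keys differ per sandbox
    print(digest_key(first) == digest_key(second))  # keys agree across sandboxes
```

With path keys, every fresh sandbox would miss the cache; with digest keys, the location of the copy no longer matters.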

witty-crayon-22786

10/21/2021, 6:24 PM

i expect that mypy’s sqlite schema could be adapted pretty easily (behind a flag even)… and yea, digest keyed would probably be a good first place to go for pylint

witty-crayon-22786

10/21/2021, 6:26 PM

one challenge that i haven’t thought through is how dependencies are accounted for in that cache: it’s possible that mypy is relying on timestamps to bump cache entries when the dependencies have changed (despite their direct content being identical). and that complicates things cc @enough-analyst-54434
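
One possible way to account for dependencies without timestamps is to fold the digests of a module's transitive dependencies into its cache key. This is only a sketch under that assumption: the thread does not confirm how `mypy` handles this, and `sources`, `deps`, and both function names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, Set


def transitive_closure(module: str, deps: Dict[str, Iterable[str]]) -> Set[str]:
    # Collect the module plus everything reachable from it; tolerates cycles.
    seen: Set[str] = set()
    stack = [module]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(deps.get(current, ()))
    return seen


def cache_key(module: str, sources: Dict[str, bytes], deps: Dict[str, Iterable[str]]) -> str:
    # Hash names and content digests in sorted order so the key is deterministic.
    hasher = hashlib.sha256()
    for name in sorted(transitive_closure(module, deps)):
        hasher.update(name.encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(hashlib.sha256(sources[name]).digest())
    return hasher.hexdigest()
```

With this scheme, an entry is invalidated when a dependency's content changes even though the module's own content is identical, which is the case the message above worries about.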

bitter-ability-32190

10/21/2021, 6:48 PM

I believe that is true

bitter-ability-32190

10/21/2021, 6:49 PM

FWIW I chose path+mtime (and I suspect mypy did as well) because that's what Python uses for pyc files
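
For reference, a timestamp-based `.pyc` file records the source mtime and source size in its 16-byte header (per PEP 552), and the path is implied by where the `.pyc` sits. The sketch below reads that header and repeats the freshness check; `read_pyc_header` and `pyc_is_fresh` are hypothetical helpers, not CPython APIs.

```python
import importlib.util
import py_compile
from pathlib import Path


def read_pyc_header(pyc: Path) -> dict:
    # Header layout: magic (4 bytes), flags (4), source mtime (4), source size (4).
    header = pyc.read_bytes()[:16]
    return {
        "magic": header[:4],
        "flags": int.from_bytes(header[4:8], "little"),
        "mtime": int.from_bytes(header[8:12], "little"),
        "size": int.from_bytes(header[12:16], "little"),
    }


def pyc_is_fresh(source: Path, pyc: Path) -> bool:
    # Mirrors the timestamp-based check: flags == 0 means mtime+size validation.
    header = read_pyc_header(pyc)
    stat = source.stat()
    return (
        header["magic"] == importlib.util.MAGIC_NUMBER
        and header["flags"] == 0
        and header["mtime"] == int(stat.st_mtime) & 0xFFFFFFFF
        and header["size"] == stat.st_size & 0xFFFFFFFF
    )


def compile_with_timestamp(source: Path, pyc: Path) -> None:
    # Force timestamp invalidation so the header carries mtime and size.
    py_compile.compile(
        str(source),
        cfile=str(pyc),
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
```

Hash-based `.pyc` files also exist (non-zero flags), which is closer in spirit to the digest-keyed approach discussed earlier in the thread.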
