# development
b
Also, metrics from "intrinsic parsing" are IN: https://github.com/pantsbuild/pants/pull/18854
TL;DR:
• In the worst case, performance of dependency inference in the Pants repo went from 72.388 s ± 10.145 s down to 15.385 s ± 0.150 s. Larger repos = larger rewards.
• In the best case (super hot cache), performance didn't change.
  ◦ We're not consulting the cache anymore; we're just always re-parsing.
  ◦ Put another way, re-parsing the files is just as fast as computing cache keys and pulling from the cache.
🏎️ 2
🙌 1
🚀 3
CC @witty-crayon-22786 and @happy-kitchen-89482. This was a fun thing to get working on at PyCon ❤️
I wonder if we could do the same thing using tree-sitter-generated Rust code for any number of languages
w
very nice 😃
relatedly: i don’t think i’ve ever done a benchmark of “maximally tiny” process spawning… would be interesting to see what our overhead is in that case, for if we ever do want to put it out-of-process
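For reference, a minimal way to measure that spawn overhead might look like the sketch below (illustrative only; the no-op command and iteration count are arbitrary assumptions, and spawning a real Python interpreter would add interpreter startup on top of the raw fork/exec cost):

```rust
// Hedged sketch: time "maximally tiny" process spawning by repeatedly
// running a no-op command and waiting for it to exit.
use std::process::Command;
use std::time::Instant;

fn main() {
    // Arbitrary assumption: 100 spawns of the Unix `true` binary.
    let iterations: u32 = 100;
    let start = Instant::now();
    for _ in 0..iterations {
        Command::new("true")
            .status()
            .expect("failed to spawn process");
    }
    let elapsed = start.elapsed();
    println!(
        "spawned {} processes in {:?} ({:?} per spawn)",
        iterations,
        elapsed,
        elapsed / iterations
    );
}
```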
b
Yeah. I plan on putting a pretty bow on this in a different crate and exposing it as a CLI with a tiny Python shim. It's a useful utility outside of Pants
h
Neat!!
We probably still want to cache, for large repos?
b
We don't have to. That's the beautiful thing. The cache is just as slow as re-parsing, meaning we're only ever strictly faster. But that's not to say we couldn't sprinkle in a little local caching to make it EVEN FASTER
I should point out, the result is still memoized.
b
Would caching still be useful with a larger batch size? (i.e., not paying for overhead per file?)
b
Ummmm put another way?
This way doesn't have any overhead, realistically, for the number of files, because this is happening on rule threads
w
> The cache is just as slow as re-parsing.
or close enough as to not matter, probably… but can confirm by looking at the workunits
b
Yeah it'd be great to back that claim up with hard(er) numbers. I think sprinkling in the local cache (that the URL verifier uses) would make this an all-around win
h
Yeah, I guess the next perf frontier for import extraction would be running in batches
b
To be crystal clear here (because I'm getting a little confused): this thing does no caching whatsoever. It also doesn't run any processes. In rule threads in the Pants process, we just rip through Python files at lightning speed. So there is nothing to batch, at least for this implementation. If you're talking about other languages, then yeah, maybe. Or just do more of this, because the numbers are AMAZING
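As a toy illustration of that in-process shape (not the actual intrinsic: the naive line-scanning "parser" is a stand-in for real AST parsing, and rayon stands in for the engine's rule threads):

```rust
// Toy sketch: extract imports from many files entirely in-process,
// fanned out across a thread pool. No caching, no subprocesses.
use rayon::prelude::*;
use std::fs;

// Naive stand-in for real parsing: grab the first token after
// `import`/`from` on each line.
fn extract_imports(source: &str) -> Vec<String> {
    source
        .lines()
        .filter_map(|line| {
            let line = line.trim_start();
            line.strip_prefix("import ")
                .or_else(|| line.strip_prefix("from "))
                .map(|rest| rest.split_whitespace().next().unwrap_or("").to_string())
        })
        .collect()
}

fn main() {
    let files: Vec<String> = std::env::args().skip(1).collect();
    // `par_iter` plays the role of the engine's rule threads here.
    let results: Vec<(String, Vec<String>)> = files
        .par_iter()
        .map(|path| {
            let source = fs::read_to_string(path).unwrap_or_default();
            (path.clone(), extract_imports(&source))
        })
        .collect();
    for (path, imports) in results {
        println!("{path}: {imports:?}");
    }
}
```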
h
So now I'm confused too - this is an engine intrinsic? (I should look at the code...)
b
Yup. (You might want to reread the first sentence of the original post 😅)
h
Hmm, very cool, but does this not violate the plugin boundary? We now have very Python-specific code in the engine itself
So it's the first time we're letting a standard plugin do something a custom plugin author can't do
b
That's not technically true. Any plugin could embed a binary and do C calls into it instead of running a process. The reason it's OK enough in the Pants engine is that:
1. We already have a conveniently embedded binary.
2. We were already going to add the Python parsing library for the sake of parsing rule code in the engine (faster startup time).
So adding this intrinsic is negligible in size
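As a rough illustration of that "embed a binary and call into it" path (a hypothetical sketch, not anything Pants ships; the `parse_imports` name and the line-scanning logic are made up for brevity):

```rust
// Hedged sketch: a plugin-built cdylib exposing a C ABI, which Python code
// could load via ctypes/cffi instead of spawning a helper process.
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

#[no_mangle]
pub extern "C" fn parse_imports(source: *const c_char) -> *mut c_char {
    // SAFETY: the caller must pass a valid, NUL-terminated string.
    let src = unsafe { CStr::from_ptr(source) }.to_string_lossy();
    // Toy stand-in for real parsing: keep lines that look like imports.
    let imports: Vec<&str> = src
        .lines()
        .map(str::trim_start)
        .filter(|l| l.starts_with("import ") || l.starts_with("from "))
        .collect();
    CString::new(imports.join("\n")).unwrap().into_raw()
}

#[no_mangle]
pub extern "C" fn free_imports(ptr: *mut c_char) {
    // Hand the allocation back to Rust so it is freed by the same allocator.
    if !ptr.is_null() {
        unsafe { drop(CString::from_raw(ptr)) };
    }
}
```

A Python plugin could then load the resulting shared library with `ctypes.CDLL` and call `parse_imports` directly, with no process spawn at all.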
e
FWICT the important thing to keep track of is the latency / reliability of Python upgrade support. It looks like 3.12 only adds one new bit of supported syntax, for bindings in comprehensions, but that presumably would be an issue here for Pants supporting Python 3.12 in October. Since ruff uses this and is popular, it may be that this is all fine and upgrades will happen promptly.
b
The parser is autogenerated, for the most part 😉
e
Sure, but I don't believe that fully answers the concern. At any rate, I too am a bit concerned, from a broader view, that we're having to hack around fundamental slowness in the infra. This is probably a good win in the short to medium term - it's invisible and reversible without disrupting users if need be - but the latency issues with the model are troubling.
w
The alternative "infrastructure" in this case is a bare invocation of a Python interpreter. Yea, Python should be faster. And yes, we could hypothetically use nailgun with Python. And batching has plenty of challenges and complexity. But there also comes a point where it is simpler to use a new language. I'm in favor of doing that in this case. If people have an issue with it being an intrinsic, then it could be an external process instead, but I don't personally think that that is necessary, for the same reason we don't have swappable released versions of the dep-extraction logic currently.
b
Also doesn't have to be a process. We could just call a C API as well from rule code. That's my point: we're just putting it where it's convenient, because we're already shipping a binary which has (or was about to have) a Python AST visitor.
(although to be completely fair, the intrinsic gets free memoization)
Oh wow. So I'm also considering tree-sitter, and although the interface is kinda hard to grok (it uses string queries in its own query language), it would be able to support Python 3/Python 2/etc...
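For a sense of what those string queries look like, here is a hedged sketch using the `tree-sitter` and `tree-sitter-python` Rust crates (it assumes ~0.20-era crate APIs, which differ slightly in newer releases, and the query only captures plain dotted imports, not aliased ones):

```rust
// Hedged sketch of the tree-sitter query approach.
use tree_sitter::{Parser, Query, QueryCursor};

fn main() {
    let code = "import os\nfrom foo.bar import baz\n";

    let mut parser = Parser::new();
    parser
        .set_language(tree_sitter_python::language())
        .expect("grammar version mismatch");
    let tree = parser.parse(code, None).expect("parse failed");

    // Queries are written in tree-sitter's S-expression query language.
    let query = Query::new(
        tree_sitter_python::language(),
        r#"
        (import_statement name: (dotted_name) @import)
        (import_from_statement module_name: (dotted_name) @import)
        "#,
    )
    .expect("bad query");

    let mut cursor = QueryCursor::new();
    for m in cursor.matches(&query, tree.root_node(), code.as_bytes()) {
        for capture in m.captures {
            // Prints "os" and "foo.bar" for the sample source above.
            println!("{}", &code[capture.node.byte_range()]);
        }
    }
}
```

Covering another language with this approach is mostly a matter of swapping in that language's grammar crate and writing a new query.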