Has anyone thought about compiling Pants with `myp...
# development
b
Has anyone thought about compiling Pants with `mypyc`? 🤔
Pants is already typed, so it seems like it'd be a speedup (at the expense of compilation and all that entails)
Could be a fun project...
h
No, no one has 🙂 It would be cool. I wonder if it would be hard to get the native extension working properly, but that's only speculation
b
I wonder if you could eliminate the roundtrip through the runtime (Rust -> Python runtime -> compiled Python binary) and just go from Rust -> compiled code
➕ 1
It'd also be neat if Pants provided a plugin to compile with `mypyc`. Then Pants could just dogfood the compilation.
w
In progress. :)
🚀 2
🙌 2
👀 2
🎉 1
b
I wonder if you could eliminate the roundtrip through the runtime (Rust -> Python runtime -> compiled Python binary) and just go from Rust -> compiled code
I suppose this is approximately the overhead of an additional C function call though 🤔
Also @wide-midnight-78598 you've got me really hyped. 🙌
w
I'm pretty pumped too - if it works half decently, I can start dropping languages like flies. Working on getting a trivial prototype running today (https://github.com/sureshjoshi/pants-plugins/blob/2-mypyc-prototype/hellofib/BUILD.pants#L3-L14)
b
Godspeed
I've officially dropped this. I should've done it sooner, but I held onto hope. This killed it though, probably should've read that first! https://mypyc.readthedocs.io/en/latest/differences_from_python.html
😢 1
👍 1
(Could be useful to profile a `pants` run, and then try to `mypyc` a file or two, though)
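A minimal sketch of that profile-first idea, using only the standard library's `cProfile` and `pstats`. The `fib` function and the `profile_call` helper are made up for illustration (they are not Pants code); in practice the profiled call would be a real Pants entry point, and the slowest fully typed modules in the report would be the first candidates to try with `mypyc`.

```python
import cProfile
import io
import pstats


def fib(n: int) -> int:
    # Deliberately slow, fully typed function: a stand-in for a hot spot
    # that might be worth compiling on its own.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def profile_call(n: int) -> str:
    # Hypothetical helper: profile one call and return the text report.
    profiler = cProfile.Profile()
    profiler.enable()
    fib(n)
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and show only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()


report = profile_call(15)
print(report)
```

The report lists the most expensive functions by cumulative time, which is the shortlist you would compile first instead of the whole codebase.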
w
Mad respect for the effort. So, I had originally thought you were doing a profile and compile mission, then I found out it was the wholesale approach. Seemed monumental, but would have been dope
🙌 2
b
A surprising number of files were compilable. Although I imagine when it came time to execute, the real pitfalls would've exposed themselves (e.g. you can't `inspect` the caller's frame, which is how `collect_rules()` works)
I think it's totally doable; `mypy` is proof of that. You just have to limit yourself to the constraints `mypyc` places.
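For anyone unfamiliar with the caller's-frame trick, here is a rough sketch of the general pattern. It is not the real Pants implementation: `rule`, `collect_marked`, and the `is_rule` attribute are hypothetical names made up for this example. The point is only that the collector never receives the functions as arguments; it walks one frame up the stack and reads the caller's globals, which is the kind of introspection that reportedly does not survive compilation.

```python
import inspect
from typing import Any, Callable, List


def rule(func: Callable[..., Any]) -> Callable[..., Any]:
    # Hypothetical marker decorator; NOT the real Pants `@rule`.
    func.is_rule = True  # type: ignore[attr-defined]
    return func


def collect_marked() -> List[Callable[..., Any]]:
    # Look one frame up the stack and scan the caller's module globals.
    frame = inspect.currentframe()
    assert frame is not None and frame.f_back is not None
    caller_globals = frame.f_back.f_globals
    return [
        value
        for value in caller_globals.values()
        if callable(value) and getattr(value, "is_rule", False) is True
    ]


@rule
def compile_sources() -> str:
    return "compiled"


def helper() -> str:
    return "not a rule"


# No arguments passed: the collector discovers `compile_sources` by itself.
found = collect_marked()
print([f.__name__ for f in found])
```

An explicit alternative would be to pass the functions (or the module) in as an argument, which avoids frame inspection at the cost of a little boilerplate per module.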
w
Interestingly, I bet there is some way to auto-discover which files would be compilable - based on a set of rules about what the decorators in Pants do. Actually, I think the best improvement would come from being able to compile all the tooling that we import. Like, all the linters, formatters, etc - get compiled on installation, and then just native Python to bundle all those rules together. Much, much, much harder to do and maintain, but it's one of those pipe dreams. I'd also be curious if Pex'ing could be sped up, not that I find it slow - but I haven't ever looked at that codebase, so 🤷
🤔 1
b
Compiled `pants` uses a compiled `pex` which on installation of `black` compiles it?
w
Well, not compiled Pants in that case. But a compiled pex runner, with compiled Black, yes
Err, compiled pex... compiler? Bundler?
b
Why not Pants as well? 🤔 Gotta go fast
w
Lol, if possible, sure - turtles all the way down
b
I've also thought about the implication on testing. If you are compiling your source, you want to test with the compiled source to ensure the compiled version works as expected (the Py source acts as a fallback + a source of truth for the code)
(Mypy devs say they test with the compiled source)
w
for sure, testing on intermediate code is whack
This would definitely put a wrench in my "compile tools as downloaded" approach 🤷‍♂️
b
How so?
w
Well, let's say you download an sdist of a tool, which has great testing - then you mypyc it - and now you have a compiled binary that hasn't been tested that you're trying to use
If that was upstreamed, different story
b
Make it a config setting because YMMV?
💡 1
Anywho, I think these are unfortunately pipe dreams 😭
w
Agreed... One day... One day... [stares off into distance]
b
A step in the right direction would be to:
• Upgrade `mypy` in repo. Note that newer `mypy` is somewhat stricter
• Move towards full typing in-repo and `--strict`
👍 2
h
Bummer, thank you for exploring this though! +1 to the suggestions for moving forward.
Re: compiling, Pants runs PEXes in venv mode, which has near-native performance. https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode
But one remaining issue is that first-party code still has to be compiled into `.pyc`: https://github.com/pantsbuild/pants/issues/11339
w
@hundreds-father-404 "native" meaning, native python speeds? Or native machine code speeds?
b
The former (from reading the docs).
w
That's what I would have assumed - and it would line up with my anecdotal testing of pex'd projects
h
native python speeds
👍 1
b
FWIW the test-consuming-compiled-python wouldn't be tooooo hard in Pants, as Pants has cached codegened files 🤔
👆 1
h
the real pitfalls would've exposed themselves (e.g. you can't inspect the caller's frame, which is how collect_rules() works)
Would it be helpful to memorialize the concrete issues you encountered in a GitHub issue perhaps? Iirc it was decorators like `@frozen_after_init` and then `collect_rules`?
On par, I continue to personally prioritize end-user experience over plugin author experience. (Although of course it's important to balance that.) So I'd be curious to see all the things we'd have to "give up" for better perf.
To be clear, please don't go on a wild goose chase tho trying to catalog even more issues. Makes sense to not try pursuing this more for now. Rather, to memorialize the work you already did.
b
From what I remember:
• All properties must have return type annotations
• No inner classes or functions
• No dynamic attributes with normal attribute access, gotta use `getattr`/`setattr`
• No class decorators or any class funny-business aside from normal ones like `dataclass`
• Comparison operators must return bools
• ...probably much more but didn't inspect the compiler output in each case
It's not hard to recreate this list, so I think the first thing would be to move towards 100% typing, `--strict`, and minimizing `cast`/`# type: ignore`
👍 1
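To make a few of those remembered constraints concrete, here is a small sketch of code written in that restricted style. It is plain Python that has not been run through `mypyc` here, and the list above is from memory, so treat it as an illustration of the style and not as a statement of what the compiler accepts. `Target` and its fields are invented for the example.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class Target:
    # `dataclass` is one of the "normal" class decorators mentioned above.
    name: str
    priority: int

    @property
    def label(self) -> str:
        # Properties carry an explicit return type annotation.
        return f"//{self.name}"

    def __lt__(self, other: "Target") -> bool:
        # Comparison operators return a plain bool.
        return self.priority < other.priority


# Dynamic attributes go through setattr/getattr on an object that is
# meant to be dynamic, not through normal attribute access on a class.
extras = SimpleNamespace()
setattr(extras, "shard", 3)
shard = getattr(extras, "shard")

low = Target("lib", 1)
high = Target("app", 2)
print(low.label, low < high, shard)
```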
w
Wait, no inner functions?
👀 1
Or, no returning inner functions?
b
I think both, but I'm going on memory here
Maybe I'm wrong, or else decorators wouldn't work 😁
h
Yeah I'd be surprised if those didn't work, agreed
b
I'd honestly be surprised either way. If they don't work, then how do you do interesting things? If they do work, how do they handle captured variables in C-land?
w
I mean, I assume they'd just be handled as function pointers, like everything else in C 🙂 I just did a `mypy --strict` test, and using an inner function works fine, but returning an inner function requires some finesse methinks, as it complained about something that I don't think was true: https://mypy.readthedocs.io/en/stable/common_issues.html#narrowing-and-inner-functions
👍 1
b
Sorry, to be clear, I meant that `mypyc` may/may not allow inner functions with captured variables, as those can't be converted to native C functions easily
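A small plain-Python illustration of what "captured variables" means here. `make_counter` is a made-up example, and it says nothing about whether `mypyc` accepts it. The inner function keeps `count` alive after the outer call returns, so a bare C function pointer is not enough on its own: some per-closure environment has to travel with it.

```python
from typing import Callable


def make_counter(start: int) -> Callable[[], int]:
    count = start  # captured by the inner function below

    def increment() -> int:
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter(10)
first = counter()
second = counter()

# Each closure gets its own environment, independent of the others.
other_first = make_counter(0)()

# CPython stores the captured variable in a cell attached to the function.
captured = counter.__closure__[0].cell_contents
print(first, second, other_first, captured)
```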
w
ahh, okay okay.
h
Another approach to speeding up Pants a la PyOxidizer 😄 https://pantsbuild.slack.com/archives/C046T6TA4/p1646748241557059
b
Yeah I think the ultimate Pants endgame would be both:
• PyOxidized app. Benefits: Single-file installation, Pants-provided Python runtime (embedded via PyOxidizer), latest Python version (for features + speed)
  ◦ I think another lesser point here is a win of Pants-provided Python installations for the user
• `mypyc` compiled sources. Benefits: super speedy Python code
➕ 2
w
On my way home just now, I was just thinking about Python and PyOx. There are a few issues - not pants-specific, but workflow specific that I would love to see solved (can you tell I'm about to take on another project that I'm trying to convince the clients to let me use Python for?)
• Developer productivity - Python is a no brainer here
• Performance - Even with 3.11 optimizations, certain pieces of code might need to be sped up depending on how often they run, so getting `mypyc` and/or Cython up and running would be great. I currently use Cython for running Python on serverless deployments - pays itself off immediately, every call
• Single-file deployment to remove Docker as a dependency
  ◦ PyOx is only sometimes single-file deployment, depends on a bunch of factors - I'm trying to find ways to crack this one - so we can reliably assume single-file deployment, and having a folder of libs is some edge case
✅ 2
h
Have you tried PyPy btw? I've always wanted to try it with Pants but haven't really dabbled because of Pants's distribution model
w
Tried it a long time ago, never really stuck with any Py variants
So here's something I just don't have enough knowledge about... So, Pyox embeds a python interpreter, and then packages wheels or source for consumption later. Is there any reason that the embedded interpreter couldn't just directly run a .pex file? I mean, the pex already bundles all the resources, so ignoring any potential performance hits - is there any reason that wouldn't be possible?
I'm not sure if pexes may be run from within an in-memory context, or whether they necessarily have to be filesystem-based
h
Oh that's an interesting angle. I think you'd want to experiment with `venv` vs `zipapp` https://www.pantsbuild.org/v2.10/docs/reference-pex_binary#codeexecution_modecode
w
`venv` is the way to go methinks, I'm just trying to see if PyOxidizer will even run that as a module, and if I can bundle it as a resource. I'm just not sure, theoretically, if that should or shouldn't work. Reading through the docs to find out, hoping for something that works in any way
👍 1
Actually, is there any way to run a `.pex` as a module?
I feel like I've seen this before, and may have even asked this before
h
How do you mean "run as a module"?