# pex
a
I’m profiling some pex executions… It looks like about 75% of the overhead of running a pex right now is importing things from `pex.third_party` - does this sound about right, and do people have thoughts on how to improve that?
👀 1
e
Not sure if that's right but could be. There are two interesting questions there:
1. Is the PEP-302 import hook stuff too slow? I.e., is the vendoring approach taken inherently slow as a result?
2. Is the set of things being imported too large? I.e., is `pkg_resources` use hurting us? It's widely observed that `pkg_resources`, via its global `working_set` variable, scans the full classpath and does a lot of often un-needed work that can be slow.
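To make that cost concrete, here is a rough timing sketch (assuming a stock `pkg_resources` is importable) that measures the import plus the sys.path scan it performs as a side effect when building the global working set:
```python
# Rough timing sketch: importing pkg_resources builds its global working_set
# as a side effect, scanning every entry on sys.path for installed distributions.
import time

start = time.perf_counter()
import pkg_resources  # module-level init builds pkg_resources.working_set

elapsed_ms = (time.perf_counter() - start) * 1000
print("import pkg_resources took %.1f ms" % elapsed_ms)
print("distributions scanned onto the working set: %d"
      % sum(1 for _ in pkg_resources.working_set))
```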
a
If the latter, how awful would it be to re-implement what we use from `pkg_resources`?
e
Not sure.
It's not light stuff; it goes into the core of PEXEnvironment.
I suspect not bad though.
Perhaps the quick experiment with the latter is to hack up the vendored copy of pkg_resources to not have the global `working_set` variable. Pex does not use it.
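A minimal sketch of that experiment, assuming we are free to edit the vendored copy (the `working_set()` accessor below is hypothetical, not a Pex or setuptools API): instead of building the master `WorkingSet` at module import time, expose it lazily so a pex that never touches it never pays for the sys.path scan.
```python
from pkg_resources import WorkingSet  # stand-in for the vendored copy's own class

_working_set = None


def working_set():
    """Build the master WorkingSet on first use instead of at import time."""
    global _working_set
    if _working_set is None:
        _working_set = WorkingSet()  # with no arguments this scans sys.path
    return _working_set
```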
👍 1
a
So far I’m at, replacing https://github.com/pantsbuild/pex/blob/b6681fbafe30b36b40349f4869bada4ff757f152/pex/pex_builder.py#L54-L55 with:
• printing a line and exiting: 144ms
• `import pkg_resources` and exiting: 214ms
• `import pex.third_party.pkg_resources` and exiting: 303ms
`pkg_resources` appears to do a bunch of expensive initialisation, e.g. compiling a bunch of large-ish regexes, on import, which we then don’t actually use when running a pex.
The vendored importer appears to also be pretty expensive
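As an aside, one low-effort way to see where that import time goes (assuming Python 3.7+) is CPython's built-in per-module import timing report:
```python
# Run the candidate import in a child interpreter; -X importtime makes CPython
# print a per-module breakdown of cumulative import time (including
# pkg_resources' module-level initialisation) on stderr.
import subprocess
import sys

subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import pkg_resources"],
    check=True,
)
```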
e
Have you compared the vendored importer to importing the vendored code via standard imports? That's the useful difference. We can't avoid the imports unless we jettison the code we import, but we can import it faster if the PEP-302 hook adds too much overhead over the standard import mechanism.
More broadly, at runtime pex has to do a lot of calculation to set up an isolated venv. If we hashed more things about the PythonInterpreter selected, a more generic warm-run win might be to cache the calculations against the cached interpreter. On a re-run with the same interpreter, the needed modifications to sys.path would be read off disk as already known.
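A minimal, purely illustrative sketch of that caching idea (every name below is hypothetical, not part of Pex): fingerprint the selected interpreter, store the computed sys.path modifications under the pex cache on a cold run, and read them straight off disk on warm runs with the same interpreter.
```python
import hashlib
import json
import os
import sys


def _interpreter_fingerprint():
    # Hash the interpreter properties that affect the computed sys.path.
    ident = json.dumps({"executable": sys.executable, "version": sys.version}, sort_keys=True)
    return hashlib.sha1(ident.encode("utf-8")).hexdigest()


def load_or_compute_sys_path_entries(pex_root, compute):
    """Return cached sys.path entries for this interpreter, computing them on a miss."""
    cache_file = os.path.join(pex_root, "sys_path_cache", _interpreter_fingerprint() + ".json")
    if os.path.exists(cache_file):
        with open(cache_file) as fp:
            return json.load(fp)
    entries = compute()  # the expensive pkg_resources-backed calculation, cold run only
    os.makedirs(os.path.dirname(cache_file), exist_ok=True)
    with open(cache_file, "w") as fp:
        json.dump(entries, fp)
    return entries
```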
a
It looks like importing the vendored code from disk is comparable to using the vendored importer
e
That's what I guessed. Python itself dogfoods the PEP-302 mechanism on modern Pythons.
a
Interestingly, it looks like it ends up getting imported twice… I wonder why…
e
I'd dig into the 2x proof a bit 1st.
a
I just added a print statement to the top of our vendored `pkg_resources/__init__.py` to verify the right thing was being imported, and when I run a pex file I get that line printed twice
That’s not expected, right?
Aah I’ll keep digging…
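One way to dig into that (a hedged suggestion, not necessarily what was done in the thread) is to print the import stack instead of a bare line, so each import of the vendored module shows who triggered it:
```python
# Drop this at the top of the vendored pkg_resources/__init__.py while debugging:
# every time the module is (re)imported, print the stack of the importing code.
import traceback

print("pkg_resources imported from:")
traceback.print_stack()
```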
e
a
If the no-op cost of running `pex` is of the order of 150ms, that feels like pretty expensive reuse…
e
I'd be interested in the no-op cost of 1.6
Iff it's significantly better, I agree. If not, I don't.
And it's not a wanton re-use. Both cases fix isolation bugs, so some careful substitute is needed.
a
I’m seeing about 120ms for 1.6.12, so comparable
e
And note 1 and 2 above use the pex cache. It's only a hit on run 1 of a pex on a machine (if the pex cache (PEX_ROOT) is not disabled or ephemeral).
a
To make sure my rough feel for what running a no-deps pexfile does, compared to just running a file in a python interpreter, is right: I think the only things it additionally does are (at a high level):
• Probe interpreters, select one
• Munge sys.path to isolate from system packages
Is that about right, or are there extra important things?
e
No, just those two broad steps.
1 is cached so it should only be a cold run hit. 2 is not cached at all.
2 uses pkg_resources.
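As a rough illustration of what step 2 ("munge sys.path to isolate from system packages") amounts to, and explicitly not Pex's actual implementation: drop site-packages style entries so only the stdlib plus the pex's own code and dependencies remain importable.
```python
import site
import sys


def scrub_sys_path():
    """Remove site-packages entries from sys.path, returning what was dropped."""
    site_dirs = set(site.getsitepackages()) if hasattr(site, "getsitepackages") else set()
    if hasattr(site, "getusersitepackages"):
        site_dirs.add(site.getusersitepackages())
    dropped = [entry for entry in sys.path if entry in site_dirs]
    sys.path[:] = [entry for entry in sys.path if entry not in site_dirs]
    return dropped
```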
a
Cool, thanks 🙂 I will keep poking