I think `export` is much slower with a pex-based l...
# general
h
I think `export` is much slower with a pex-based lockfile. Is this expected?
Before I switched over to generating pex lockfiles, I was seeing an export time of about 12 seconds on fresh runs and less than a second on successive runs. Now I'm getting anywhere from 44 seconds to over a minute.
h
No, I would not expect that. Note that we made a change in 2.11 to include pip in the virtual environment. Do you have a checkpoint where you are using 2.11 but with poetry lockfiles? That would confirm whether this is about pex versus poetry rather than 2.10 versus 2.11.
h
Actually. It might be because I was testing previously with the VCS requirement commented out. I know pip doesn't cache vcs installs, but does pants do anything intelligent with that? We specify the full commit hash so there's definitely enough information there.
Removing the VCS dep from `requirements.txt`, regenerating the lockfile, and running export didn't seem to change things. It still takes ~45 seconds in the best case.
I might have been working with 2.10 when testing previously
Will get some more data
h
In 2.10 we "exported" a symlink directly into the pex_root cache, which was dangerous for a few reasons, and also meant that the resulting virtualenv didn't include pip and couldn't be easily used in various ways
In 2.11 we actually create a "proper", separate virtualenv, so I can see how that would take longer
Although the resolve and downloads and all should still be used from cache
h
Okay here's my findings:
• v2.10.1rc0, no VCS, lockfile from pip-tools: export takes 43 seconds. Running immediately after and it takes 20 seconds. Ran again and it takes 1 second each successive time.
• v2.10.1rc0, with VCS, lockfile from pip-tools: export takes 2m30s. Running immediately after and it takes 22 seconds. Running again takes 1 second and all successive runs take that long.
Okay so VCS isn't the issue (though weird that it takes three runs to reach its best caching).
• v2.11.0, with VCS, lockfile from pants with pex: export takes 58 seconds. Running again takes 41 seconds. Again at 38 seconds. Again up to 1m2s. Again 46, so it doesn't seem like it's converging like it used to.
• v2.11.0, no VCS, lockfile from pants with pex: export takes 1m7s. Running again takes 48 seconds. Again at 46 seconds. Doesn't show signs of converging.
Okay so 2.11 changes things but unclear if it's 2.11 or pex lockfile. Revert back to pip-tools controlled lockfile and see what happens.
• v2.11.0, no VCS, lockfile from pip-tools: export takes 2m50s (a lot of time installing constraints file). Running again is 47 seconds and that seems to continue indefinitely.
Saving sanity and going to assume adding back VCS doesn't make it better. So I'm fairly confident it's just the change from 2.10 to 2.11.
I'd take a little danger for 40x speed improvement 😉
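For anyone wanting to reproduce this kind of comparison, a minimal sketch of a timing harness follows. It runs one command several times back to back and records the wall-clock time of each run, which makes it easy to see whether successive runs converge. The helper name `time_runs` is hypothetical and not part of Pants.

```python
import subprocess
import time


def time_runs(cmd, runs=5):
    """Run `cmd` `runs` times back to back; return wall-clock seconds for each run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a failing run raise instead of recording a misleading time.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations
```

Calling `time_runs(["./pants", "export", "::"])` from the repository root would give one duration per run for the command used in this thread.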
h
I do wonder why it is taking so long, when you already have a lockfile and all the wheels should already be downloaded/built and cached
@enough-analyst-54434 Any thoughts?
How many dists in the resulting virtualenv?
h
Somewhere in the neighborhood of 200
e
No clue from the Pex side. There is a lot of complex stuff going on above in Pants too, though. As always, it would be best to get a shareable repro if possible.
I mean, the change you mentioned, Benjy, explains it all I think. Since there is no symlink, a VCS clone is happening every time. Pex does not trust any VCS URL to be stable, so Pip clones the repo and checks out the given branch, tag or commit every time, and then Pex sha256-hashes that repo tree every time to confirm there is no float.
To speed this up, Pex would need to relax a bit and trust certain VCS URLs, like commit IDs for cryptographic VCS (git commit IDs but not svn, for example), but not tags or branches.
That's assuming the git clone time here roughly aligns with 1 minute, explaining the non-convergence. Is that the case @high-yak-85899?
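The two ideas in this message can be sketched roughly as below. This is an illustration only, not Pex's actual code: `is_pinned_git_commit` and `sha256_tree` are hypothetical names, the URL shape assumed is pip's `git+<url>@<rev>` form, and the tree hash is just one plausible way to fingerprint a checkout.

```python
import hashlib
import re
from pathlib import Path

# A full git commit ID is 40 lowercase hex characters (SHA-1).
_FULL_GIT_SHA = re.compile(r"[0-9a-f]{40}")


def is_pinned_git_commit(requirement_url: str) -> bool:
    """True if a pip-style VCS URL (git+<url>@<rev>) pins a full git commit ID."""
    if not requirement_url.startswith("git+"):
        return False  # e.g. svn+..., where a revision is not a content hash
    # Drop any "#egg=..." fragment, then take whatever follows the last "@".
    without_fragment = requirement_url.split("#", 1)[0]
    _, sep, rev = without_fragment.rpartition("@")
    # Branches, tags and abbreviated hashes can float, so only a full hash counts.
    return bool(sep) and _FULL_GIT_SHA.fullmatch(rev) is not None


def sha256_tree(root: Path) -> str:
    """Hash relative paths and file contents under `root` in sorted order, skipping .git."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue
        digest.update(rel.as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()
```

Under this sketch, a tool could skip the clone and the tree hash whenever `is_pinned_git_commit` returns true and a cached result already exists, while still re-verifying branches and tags every time.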
h
I think this is slow even without the VCS requirement though?
e
I don't know, I'm having a hard time following. That, at least, is the only thing I can think of on the Pex side of the world.
h
It's possible I'm misunderstanding how the VCS dependency gets included. I was running `./pants export ::` with and without a VCS dependency in my `requirements.txt` (and implicitly the lockfile), but maybe it still got included?
e
`export`, as most things in Pants, deals in the transitive closure of dependencies.
So if the VCS dep is in the lockfile, that implies it is in fact a transitive dep and so is being included in the venv export IIUC.
h
It wouldn't be included transitively. I can confirm that when I take it out of `requirements.txt`, it is also removed from the lockfile as expected.
Okay, I confirmed. With my VCS dependency in `requirements.txt` and the lockfile, it shows up in the exported venv. When I remove it and export again, it is not in the venv as expected.
e
OK. I'll admit that at this point I'm not at all tracking what's going on here. If you're continuing to see an unexpected slowdown @high-yak-85899 could you perhaps summarize, maybe in an issue?
h
I think the general summary is just that I'm surprised it takes, at a minimum, 45 seconds to export a venv (especially when none of the requirements have changed). All the noise above was to figure out if it was related to pants versions, the presence of vcs dependencies, and so on. I'm fairly confident that it's not related to having vcs dependencies and that @happy-kitchen-89482 correctly called out the difference between 2.10 and 2.11 since it's constructing a proper venv instead of symlinking.
h
> since it's constructing a proper venv instead of symlinking.
Given that change, are you still surprised?
h
Yes
If there are no changes, it would be great if it didn't rebuild
But I have the luxury of treating caching like it's an easy problem because I don't understand it 🙂
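The "don't rebuild if nothing changed" idea can be sketched as a fingerprint check. This is a deliberately naive illustration, not how Pants decides anything: the marker file name and all three helpers are hypothetical, and a real check would also need to account for the interpreter, the platform, and the tool version, not just the lockfile bytes.

```python
import hashlib
from pathlib import Path

# Hypothetical marker file name; Pants does not write this file.
MARKER_NAME = ".lockfile.sha256"


def lockfile_fingerprint(lockfile: Path) -> str:
    """Fingerprint the lockfile by the sha256 of its bytes."""
    return hashlib.sha256(lockfile.read_bytes()).hexdigest()


def needs_rebuild(lockfile: Path, venv_dir: Path) -> bool:
    """True if the venv has no recorded fingerprint or it differs from the lockfile's."""
    marker = venv_dir / MARKER_NAME
    if not marker.is_file():
        return True
    return marker.read_text().strip() != lockfile_fingerprint(lockfile)


def record_build(lockfile: Path, venv_dir: Path) -> None:
    """Store the lockfile fingerprint inside the venv after a successful build."""
    venv_dir.mkdir(parents=True, exist_ok=True)
    (venv_dir / MARKER_NAME).write_text(lockfile_fingerprint(lockfile))
```

An export step built this way would call `needs_rebuild` first, build the venv only when it returns true, and then call `record_build`.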
h
Makes sense. FYI I'm pretty sure this is the culprit https://github.com/pantsbuild/pants/pull/15068 -- definitely curious if there is a better way to solve the original problem
h
Well, that has to happen and is not the reason
I doubt it takes 45 seconds just to materialize, it sounds like it's repeating actual resolve work?
w
another thread was just started about this with additional context on performance: https://pantsbuild.slack.com/archives/C046T6T9U/p1653336125258069
it seems like the primary thing that influenced this no longer being a symlink to an immutable venv was the desire to be able to `pip install` in the venv. but if that is going to come at this much of a performance cost, it doesn’t seem like it should be the default. “symlink to immutable venv” is pretty safe otherwise.