# general
r
The stricter you get with Python package versions, i.e. pinning exact versions vs. using something like
~major.minor
, the longer it takes to generate lockfiles. What's the recommendation here?
Even switching from something like
~major.minor
to
~major.minor.patch
leads to never-ending lockfile generation.
I wish it would fail faster 😕
c
I'd love an opinionated Pants guide on how best to constrain/manage/debug 3rd party Python resolution. A very large fraction of my time "getting pants setup" is really "hmm it has been half an hour /again/, I guess these lockfiles are not going to resolve with these versions". My naive assumption would be that narrowing the possibilities with stricter pinning makes the resolution easier/faster, but perhaps it does the opposite!
e
The only narrowing that really helps speed is narrowing interpreter constraints. The default is almost certainly not what you want. That's
>=3.7,<4
and is a huge range to solve for. When setting an IC though, beware: https://github.com/pantsbuild/pants/issues/17978 You currently must repeat yourself.
In bleeding edge Pants you can set
[python] pip_version = "22.3"
to get generally faster resolves.
The only real solution is here though: https://github.com/pantsbuild/pex/issues/2044
c
The only narrowing that really helps speed is narrowing interpreter constraints
With interpreter constraint versions of the form
x.y.z
, would narrowing the
z
be expected to speed it up meaningfully? (I know there are "fun" cases where pypi packages are picky about the
z
)
e
The fun cases will fail the resolve, so those are required narrowings in those cases. In general you should not need to care about the patch version. If you only use 3.10 though, an IC of
==3.10.*
should speed up lock resolves, with the caveat from the issue above: it's not that easy to set the IC!
If you have an old lock today, you can read it and find the IC it used.
If it's not what you expect, follow https://github.com/pantsbuild/pants/issues/17978 closely to get it right.
c
I see
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---
// {
//   "version": 3,
//   "valid_for_interpreter_constraints": [
//     "CPython==3.10.*"
//   ],
which matches what I expect (one 'version' which is 3.10)
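For anyone wanting to check this without opening the lockfile by hand, here is a minimal Python sketch that pulls `valid_for_interpreter_constraints` out of a header like the one above. It assumes the header lines are `//` comments wrapping a JSON object and that the block ends with an `--- END PANTS LOCKFILE METADATA ---` marker. The end marker, the function name and the shortened sample text are assumptions for illustration, not something shown in this thread.

```python
from __future__ import annotations

import json

# The begin marker is copied from the header quoted in the thread.
BEGIN_MARKER = "--- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---"
# Assumption: the metadata block is closed by a matching end marker.
END_MARKER = "--- END PANTS LOCKFILE METADATA ---"


def read_interpreter_constraints(lockfile_text: str) -> list[str]:
    """Return the interpreter constraints recorded in a lockfile header."""
    json_lines = []
    inside = False
    for raw in lockfile_text.splitlines():
        stripped = raw.strip()
        if not stripped.startswith("//"):
            continue  # only comment lines can belong to the header
        content = stripped[2:].strip()
        if content == BEGIN_MARKER:
            inside = True
        elif content == END_MARKER:
            break
        elif inside:
            json_lines.append(content)
    if not json_lines:
        raise ValueError("no lockfile metadata header found")
    metadata = json.loads("\n".join(json_lines))
    return metadata.get("valid_for_interpreter_constraints", [])


# Hypothetical, shortened header: a real one carries more keys than this.
SAMPLE = """\
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---
// {
//   "version": 3,
//   "valid_for_interpreter_constraints": [
//     "CPython==3.10.*"
//   ]
// }
// --- END PANTS LOCKFILE METADATA ---
{"locked_resolves": []}
"""

constraints = read_interpreter_constraints(SAMPLE)
print(constraints)
```

If the list that comes back is wider than the interpreters you actually use, that is the thing to narrow before the next `generate-lockfiles` run.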
e
Ok, then the only possible short-term remedy is 2.16.0.dev5+ and pip_version.
Even if you're not ready for that Pants upgrade, you might try it just to get a timing comparison.
c
Maybe stepping back to mental models and expectations. As I have my work in progress right now,
pip install -r all-the-deps.txt
takes about 2 minutes, while
./pants generate-lockfiles --resolve=default
(2.16.0.dev5+ and pip_version; single interpreter version) is still spinning at 15 minutes, and based on the last time I tried something similar I don't really expect it will ever finish. I know pants/pex is doing "more" in a sense (the lockfile should work for both macOS and Linux, checking interpreter constraints) and it is fine if it takes somewhat longer to be stricter/better. But the difference in runtime seems excessive for what is naively a similar operation. Is that magnitude of a difference in resolution typical, a nonviable comparison, or something that I should clean up into a ticket? @refined-addition-53644 Sorry to hijack your thread!
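To put numbers on a comparison like this, and to make a runaway resolve "fail faster", one option is to wrap each command in a small timer with a hard timeout. The sketch below uses only the Python standard library. The helper name `time_command` and the 30-minute cutoff in the usage comment are illustrative assumptions, not Pants or Pex features.

```python
import subprocess
import sys
import time


def time_command(argv, timeout_s=None):
    """Run argv and return (elapsed_seconds, returncode).

    returncode is None when the command was killed after timeout_s seconds.
    """
    start = time.perf_counter()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        return time.perf_counter() - start, proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        return time.perf_counter() - start, None


# Harmless self-check: time a no-op Python process.
elapsed, returncode = time_command([sys.executable, "-c", "pass"])
print(f"no-op took {elapsed:.2f}s, exit code {returncode}")

# Usage with the two commands from the thread (left commented out on purpose):
# time_command(["pip", "install", "-r", "all-the-deps.txt"])
# time_command(["./pants", "generate-lockfiles", "--resolve=default"],
#              timeout_s=30 * 60)
```

A `None` return code then means "gave up after the cutoff", which is at least a definite answer to record in a ticket.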
e
I think the Pex ticket addresses your questions: https://github.com/pantsbuild/pex/issues/2044 In short, Pants is hermetic. Each resolve starts from a clean slate today. The Pex ticket outlines a technique that makes this faster.
c
I might be misunderstanding the bounds of the hermetic seal. I thought that
generate-lockfiles
would still use a shared
.cache/pants/named_caches/pex_root/pip_cache/
which was akin to
.cache/pip
. My mental model is that in the case of both
pip install
in a fresh virtualenv and
./pants generate-lockfiles
(today) the CPU-burning resolve starts from scratch but some sdists/wheels/tarballs are cached from previous runs. In other words, if I naively expect those operations to take the same order of magnitude of time, the difference isn't explained by downloading artifacts afresh each time. (#2044 looks like it would make the 2nd resolve much faster -- which is great! -- but I'd still need the 1st to finish.)
e
It does use that cache, but it's as if you created a fresh venv to run Pip in with 0 deps pre-installed.
@curved-manchester-66006 if you want me to take a look at your case for a 1st-time lock, it would be great to have your data: requirements list + ICs, original lockfile if there was one, requirements diff if there was one, etc. This would best be filed as a Pex issue, not a Pants one. In contrast to https://github.com/pantsbuild/pex/issues/2036 (which is an excellent bug report for perf concerns), it would presumably be about a non-pinned, non-complete set of input requirements, since that issue already covers the opposite case.
c
Thank you! I appreciate it. I actually spent the morning trying to make a more minimal shareable case... which showed
pex lock
was dramatically faster than
pip install
¯\_(ツ)_/¯ I suspect I'm running into some pip resolver fun where multiple `-r`s, or the ordering of dependencies within files, tickles the resolver's heuristics. But that's not a pants/pex problem yet!