a
Hi @happy-kitchen-89482, our constraints.txt has been resolving really, really slowly lately. Is there any way to speed it up? This is probably mostly a pip resolver issue, but wondering if you happen to know some tricks. Thanks! We are using Pants version 2.6.0 and pip version 19.3. cc @dazzling-diamond-4749
m
you can start by upgrading pip 🤔
a
we intentionally downgraded to this version bc it was faster than the latest: https://github.com/pypa/pip/issues/9187
🤔 1
h
Hmm, first thing to check is that Pants isn't making the performance worse somehow. Do you see the same slowness if you run pip directly?
👍 1
a
which pip command should I use that's equivalent to Pants' "resolving constraints"?
Or just pip install -r constraints.txt?
h
basically
you can see the exact command if you run Pants with --no-process-execution-local-cleanup
it'll log every sandbox it creates
And you can find the one for the pex run for resolving constraints.txt
(pex invokes pip)
And in that sandbox is a script containing the exact command line that was run
but actually what we want to compare it to is the raw pip install -r constraints.txt
So yeah, just that 🙂
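Roughly, that inspection flow looks like the sketch below; the sandbox path and wrapper-script name are illustrative, the real ones show up in the log output:
# keep sandboxes around so we can inspect them (use whichever goal you normally run)
./pants --no-process-execution-local-cleanup test ::
# the log prints each preserved sandbox dir; set SANDBOX_DIR to the one for the
# constraints resolve, then look at its wrapper script for the exact pex/pip command
SANDBOX_DIR="<dir printed in the Pants log for the constraints resolve>"
cat "$SANDBOX_DIR"/__run.sh   # __run.sh is the usual wrapper script name, iirc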
a
do i need to force reinstall?
h
To bypass the cache?
a
raw pip install -r constraints.txt (without force reinstall) took 342s; Pants resolving constraints usually takes 300-400s
h
Try --no-process-execution-local-cache to force stuff to rerun
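e.g. (a sketch; swap in whichever goal you actually run):
# bypass the local process cache so the constraints resolve actually reruns
./pants --no-process-execution-local-cache test ::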
OK, so Pants is not making things worse
a
yah
h
It's literally just that your constraints.txt takes a long time to resolve
In fact Pants should make things better on average, thanks to caching
I don't really know of any tips to speed things up, other than to figure out if there is one requirement that is particularly expensive and see if you can remove it
😅 2
You can possibly get verbose logs from pip to see where time is being spent
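For example, something like this captures a verbose log you can dig through (pip's -v flag can be repeated, e.g. -vv, for more detail):
# run the same resolve with verbose output and keep the log
pip install -r constraints.txt -v 2>&1 | tee pip.log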
a
gotcha.. thanks Benjy!
h
A good optimization is seeing if you have any sdist requirements, because those need to be built manually, compared to bdist wheels, which are prebuilt for you and only need to be downloaded. Sdists are often particularly painful to build and require compiling native code. To find them, look at your pip log for when it says something like "building PyYAML". Sometimes you can upgrade to a newer version of the dependency if they started releasing .whl files, which you can see on the PyPI page: https://pypi.org/project/requests/#files. If they still don't have wheels, it may be worth prebuilding them and hosting them internally
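If you kept a log like the one above, a rough way to spot the source builds (the exact wording varies across pip versions, so treat the patterns as a starting point):
# sdists show up as wheel-building steps in pip's output
grep -inE "building wheel for|running setup.py bdist_wheel" pip.log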
🙏 1
a
we don't currently have sdist in the pants repo
also didn't find "building PyYAML" in the pip logs, but thanks a lot for the suggestion @hundreds-father-404
h
To clarify, sdist is not the name of a dependency. It's a type of dependency, and stands for "source distribution". Unlike with a "bdist" (built distribution), all you get is the raw source code, like .py files, plus instructions to build it, which often means compiling C code. https://medium.com/ochrona/understanding-python-package-distribution-types-25d53308a9a Do you still have the pip log from the earlier pip install? I can help see if you have any sdists
a
Yah, understood.. I was saying we don't build any sdist in our pants repo, since the python_distribution target would also support that.
h
I was saying we don't build any sdist in our pants repo
Got it. It's fairly common for your third-party requirements to only be published with sdists. Even if your own python_distribution target isn't built with sdist, the third-party requirements you use might be. Sometimes it's from a transitive dependency. It is totally possible all your deps are released as bdist wheels, but that's also not very common fwict. It looks like this in your pip logs when you need to build an sdist:
Building wheels for collected packages: setproctitle
  Building wheel for setproctitle (PEP 517) ... done
  Created wheel for setproctitle: filename=setproctitle-1.2.1-cp39-cp39-macosx_11_0_arm64.whl size=10724 sha256=2b8228033e093d0c07e5b3bac321274e11aa51f15a435606fe95671c7fe5964d
  Stored in directory: /Users/ericarellano/Library/Caches/pip/wheels/c3/e4/78/85a456b48a3f8ecd33b4cd1b1dfd3ec0ac25ae6d498a86bf65
Successfully built setproctitle
a
this is saying it's a wheel though? it's kind of confusing. What is the unique identifier of building an sdist? just when it says building xxx?
if that's the case then I have a ton of them, 24 in total
h
Definitely agreed on it being confusing! When you depend on an sdist, pip first has to build that sdist into a wheel. The end result is the same: a .whl file. The difference is whether that .whl was already built for you vs. whether you have to build it locally yourself. With a prebuilt wheel you skip an entire step of the install stage.
To figure out why a dependency doesn't have a prebuilt wheel, you can go to its PyPI page. In my example log, I was installing setproctitle 1.2.1. First I go to https://pypi.org/project/setproctitle, which I found via PyPI's search bar. The version at the top of the page is 1.2.2, but I want to look at 1.2.1, so I click "Release history", choose 1.2.1, and then the "Download files" tab to get to https://pypi.org/project/setproctitle/1.2.1/#files. There, I can see there are a couple of .whl files and also setproctitle-1.2.1.tar.gz, which is the sdist. For some dependencies, there won't be any .whl files. Other times, there will be some but not the ones you need - here, setproctitle only releases .whl files that work with Linux, and there's no macos or osx in the file names, which is why I have to build a wheel on my mac.
For each of those 24 sdists you depend on, your options are:
1. Stop depending on it
2. Do nothing. Some sdists are much slower to build than others
3. See if a newer release of that dependency has .whl files on its PyPI page. If so, upgrade
4. Pre-build that dependency into .whl files and host the files internally, such as setting up a server or checking them into Git. You can instruct Pants and pip to install both from PyPI and your prebuilt wheels (see the sketch below)
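For option 4, a minimal sketch of the pre-building side, assuming a plain directory of wheels is enough to start with (the directory name is arbitrary, and hosting them on an internal server is a separate step):
# build wheels for everything in constraints.txt once
pip wheel -r constraints.txt --wheel-dir prebuilt_wheels/
# point later installs at that directory before falling back to PyPI
pip install -r constraints.txt --find-links prebuilt_wheels/
iirc, on the Pants side there's a [python-repos] option for pointing resolves at extra wheel locations like this; worth double-checking the docs for your version.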
🙌 1
a
🙏 This is immensely helpful. Thank you @hundreds-father-404!
❤️ 1
w
also, fwiw: when you hit Pants' process cache (either local or remote), you won't invoke pip at all
in https://www.pantsbuild.org/docs/using-pants-in-ci#directories-to-cache, the local process cache is the one named ~/.cache/pants/lmdb_store
a
Actually it looks like on my local machine it's always resolving constraints, which takes forever
not sure what's going on that makes it miss cache
d
Would you recommend just caching the wheels in LFS? That way we can lock packages better than a constraint file, and resolve fast.
w
you could, yea. but between the two, caching is probably more beneficial to have working, because it also caches everything else, including your test runs
@ambitious-student-81104: to debug processes running when you don't expect them locally, you can run with -ldebug, which will dump some information each time a process is invoked
you'll get a line like "spawned local process as Some(47375) for Process", which contains ~everything that goes into the cache key for the process: if for two different runs you see different processes being used to build from your constraints file, that would be very fishy!
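Something along these lines (the goal and grep pattern are just examples):
# capture debug-level logs for a run, then pull out the process descriptions
./pants -ldebug test :: 2>&1 | tee pants-debug.log
grep "spawned local process" pants-debug.log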
d
Ya, because we parallelize tests, since we can't bump up the core count on our Buildkite agents. We probably need a remote cache. :( We will have to figure that out later. (Something something security review)
w
in the meantime, folks have had some luck using their constraints.txt as a cache key in their CI providers (GitHub Actions in particular, not sure what Buildkite's API looks like) and storing the entire local cache under it
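Expressed as plain shell so it maps onto any CI provider, that approach is roughly the sketch below; the key scheme and tarball handling are illustrative, and the cached directories are the ones from the docs page linked above:
# key the cache on the contents of constraints.txt
KEY="pants-cache-$(sha256sum constraints.txt | cut -d' ' -f1)"
# restore: fetch "$KEY.tar.gz" from your CI's cache/artifact store first, then unpack it
tar -xzf "$KEY.tar.gz" -C ~ 2>/dev/null || true
# ... run your usual ./pants goals ...
# save: re-pack the local caches and upload them under the same key
# (named_caches holds the pip/pex caches; it's listed on that docs page too, iirc)
tar -czf "$KEY.tar.gz" -C ~ .cache/pants/lmdb_store .cache/pants/named_caches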