# general
h
hoping to get some insight here: running a script with 21 dependencies and pants is 20+ minutes and counting to build the pex file. i’ve deleted pants and pip cache, and also asked someone else to try on their machine, and they seem to be facing the same problem.
⠓ 1462.31s Building main.pex with 21 requirements: arn<0.2.0,>=0.1.5, backoff<2.0.0,>=1.10.0, boltons<21.0.0,>=20.2.1, boto3<2.0.0,>=1.15, botocore<2.0.0,>=1.21.60, click<9.0.0,>=8.1.3, fsspec<2022.0.0,>=2021.4.0, orjson<3.7.0,>=3.6, pandas<2.0.0,>=1.5.3, protobuf<4,>=3.19
here’s the list of 3rd party dependencies
./pants dependencies --transitive src/my_script:main
//:poetry#arn
//:poetry#backoff
//:poetry#boltons
//:poetry#boto3
//:poetry#botocore
//:poetry#click
//:poetry#fsspec
//:poetry#orjson
//:poetry#pandas
//:poetry#protobuf
//:poetry#psycopg2-binary
//:poetry#pulumi
//:poetry#pulumi-aws
//:poetry#pydantic
//:poetry#ratelimit
//:poetry#redis-py-cluster
//:poetry#requests
//:poetry#s3fs
//:poetry#sentry-sdk
//:poetry#setuptools
//:poetry#tomli
//pyproject.toml:poetry
...
is there something i can/should do differently for pants to run this in a more reasonable timeframe?
it seems like the issue might be coming from `//:poetry#fsspec` and/or `//:poetry#s3fs`, which were added as dependencies of pandas
c
I would look a bit closer at what exactly pants is doing. Possibly it is compiling a wheel for `pandas` or the underlying dep `numpy`. I find that can take forever. Is this the first time you have run it? It will cache these dependencies, and memoize previously run tasks that are relevant
h
this was running in a normal amount of time until i added
poetry_requirements(
    name="poetry",
    overrides={
        "pandas": {
            "dependencies": ["//:poetry#fsspec", "//:poetry#s3fs"],
        },
    },
)
to our root BUILD file
and actually just noticed that the tests we have in CI are also still building the pex file after 25 minutes 😅
r
It's most probably boto3, botocore and s3fs. Make sure all of them have more or less matching transitive dependency versions
🫡 1
e
@cool-yacht-37128 has the right idea here. You really want to pre-build wheels and host them yourself when possible to save all developers the time of building platform-specific sdists into wheels. Not only can that be slow, but if you multiply by CIs + devs, it's ~needlessly slow.
h
hmm, that makes sense
fwiw, poetry is able to resolve the dependencies of our entire repo (including these packages) in about a minute
e
With a blown away cache and no venvs? That's the apples-to-apples.
h
fair point, not sure
e
When you do that, folks tend to find Pex (really Pip) is faster. To level the playing field though I have https://github.com/pantsbuild/pex/issues/2044
g
`boto3` and `s3fs` don’t play very nicely together for pip resolving - it downloads and tries every version of boto before finding one that works. Instead, specify `s3fs` and `aiobotocore[boto3]` and remove `boto3` entirely https://pantsbuild.slack.com/archives/C046T6T9U/p1668528451161329?thread_ts=1668522633.080329&cid=C046T6T9U
❤️ 2
🧠 1
h
thanks, i’ll give that a try 🙂
g
I’ve run into that problem so many times, I use S3FS and boto3 on most of my projects. I forget where I found that solution originally - but it’s worked great for me. Let me know how that works
Actually, even easier, you can just specify `s3fs[boto3]` which uses `aiobotocore[boto3]` under the hood and will easily resolve which version of boto3/botocore to use https://github.com/fsspec/s3fs/blob/56235491bdc3a89af22fe73a5440b90d1a65ce01/setup.py#L39
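A minimal sketch of how that could look in `pyproject.toml`, assuming Poetry's `[tool.poetry.dependencies]` table. The `*` version for `s3fs` is a placeholder (the real constraint isn't shown in this thread); the `fsspec` and `pandas` ranges are copied from the build output at the top.

```toml
[tool.poetry.dependencies]
# boto3 is no longer listed on its own: the `boto3` extra of s3fs pulls in
# aiobotocore[boto3], which constrains which boto3/botocore versions are valid.
# "*" is a placeholder; keep whatever s3fs constraint the project already uses.
s3fs = { version = "*", extras = ["boto3"] }
fsspec = ">=2021.4.0,<2022.0.0"
pandas = ">=1.5.3,<2.0.0"
# Assumption: an explicit botocore pin, if kept, must agree with what
# aiobotocore allows, otherwise the resolver has nothing to pick.
```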
h
btw, i had to kill the job in CI after over 2 hours of pants trying to build the PEX file
i know it’s not an apples-to-apples comparison, but poetry was able to create the lock file in around a minute
happy to dig in to help figure out what’s going on if i can
e
@high-energy-55500 https://github.com/pantsbuild/pants/pull/17555 introduced the ability to say:
[python]
pip_version = "22.3"
That tells Pex to use Pip 22.3 instead of its default 20.3.4. That Pip is 2 years newer and much faster for many lock resolves. You'll have to use Pants 2.16.x to get it though, latest of which is https://pypi.org/project/pantsbuild.pants/2.16.0.dev6/ Besides that, and tightening / adding `interpreter_constraints` (Pants default is `>=3.7,<4` which is a huge range that can slow down lock solutions) the only real solution is to invest in https://github.com/pantsbuild/pex/issues/2044
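A rough sketch of both settings together in `pants.toml`. The `pip_version` value is the one quoted above; the interpreter range is only an illustrative assumption and should match the Python versions the repo actually targets.

```toml
[python]
# Requires a Pants 2.16.x release, per the message above.
pip_version = "22.3"
# Hypothetical narrower range than the default; use what the repo really supports.
interpreter_constraints = [">=3.9,<3.10"]
```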
🙌 2
happy to dig in to help figure out what’s going on if i can
Thank you. This figuring out has already been done and https://github.com/pantsbuild/pex/issues/2044 is the solution.
👍 2