# general
a
For some reason building pex files on my machine (Intel macOS) is prohibitively slow. Building requirements.pex or pytest.pex can take a few minutes, and it happens relatively frequently. Example -
19:24:15.74 [INFO] Preserving local process execution dir /private/var/folders/1r/vwvwzp312ngc1kfys_7g6x_c0000gp/T/pants-sandbox-uTOWh9 for Building requirements.pex with 15 requirements: av, aws_lambda_powertools, boto3, botocore, ffmpeg_python, frozendict, grpcio, mediapipe, numpy, opencv_python, protobuf, pyarrow, pyzipper, requests, spring_config_client
19:25:31.39 [INFO] Completed: Building requirements.pex with 15 requirements: av, aws_lambda_powertools, boto3, botocore, ffmpeg_python, frozendict, grpcio, mediapipe, numpy, opencv_python, protobuf, pyarrow, pyzipper, requests, s... (19 characters truncated)
Did anyone encounter such slowness? Any ideas on how it can be improved?
h
I'm assuming you're using a lockfile? What are your interpreter constraints?
a
I am. The cost occurs after an update to requirements.lock, so even though it's less frequent, it still takes a lot of time. It would be great to expedite it. Is there a way to avoid running as a pex while developing locally? Interpreter constraints -
interpreter_constraints = ["==3.9.*"]
And I also wonder why it rebuilds pytest_runner.pex for every test. Aren't they the same?
h
The pytest_runner.pex is a "shell" pex that ties together pytest itself and your tests' requirements
So it changes across tests, because the requirements change
but note that it is just a thin wrapper venv; it doesn't actually contain anything, so it is trivial to build
When you say "an update to the requirements.lock", are you manually updating that file? Or is generate-lockfiles doing so?
And is the cost you're talking about incurred during that generate-lockfiles run, or on some subsequent, say, test run?
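For context, the resolve-based lockfile setup that generate-lockfiles maintains looks roughly like this in pants.toml - a minimal sketch, with the resolve name and lockfile path assumed for illustration:

[python]
enable_resolves = true

[python.resolves]
# Assumed resolve name and path; regenerate the lockfile with `./pants generate-lockfiles`.
python-default = "third_party/python/default.lock"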
a
got it.. sometimes building the pytest_runner.pex can take more than 30s for me, so I wonder if I'm doing something wrong. I mean updating through generate-lockfiles, and the cost is incurred during the subsequent test run
h
Hmm, so that cost is in the subsetting, which should be fast. And so should pytest_runner.pex, so I'm not sure what's going on here.
You mentioned "on my machine". Are you not seeing this on other machines?
a
Never tried it on another machine.. :) What would be the best way to debug it? Any way I can generate a performance report?
h
Are you able to publish a repo to github that reproduces this? It would need your real requirements, and pants.toml, and BUILD files, but could have a dummy test that uses requirements but doesn't expose your real code.
a
sure, thanks!
I'll do it in the next few hours
the problem reproduces here when trying to run
./pants test tests::
h
I get a 404 on that repo
Can you make it public?
a
h
OK, I can see the "Building X requirements for requirements.pex from the third_party/python/default.lock" processes taking a long time on a linux box too.
Oddly, if I run the tests one at a time (e.g., ./pants --no-pantsd --no-local-cache test tests/a_test.py etc.) then that requirements.pex build is fast.
Hmm, it's only the first run that was slow. Subsequent runs were much faster (with --no-pantsd --no-local-cache, so the work was still really being done)
OK, looks like your requirements.pex files are really huge. For test_a for example, it's ~250M. And we're creating separate pexes per test (or, more precisely, per unique subset of requirements).
These are your top offenders, for test_a:
-rw-rw-r-- 1 benjy benjy  65M Feb  5 17:55 opencv_contrib_python-4.7.0.68-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-rw-rw-r-- 1 benjy benjy  59M Feb  5 17:55 opencv_python-4.7.0.68-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-rw-rw-r-- 1 benjy benjy  35M Feb  5 17:55 pyarrow-10.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-rw-rw-r-- 1 benjy benjy  32M Feb  5 17:55 mediapipe-0.9.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-rw-rw-r-- 1 benjy benjy  30M Feb  5 17:55 av-10.0.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-rw-rw-r-- 1 benjy benjy  17M Feb  5 17:55 numpy-1.24.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
-rw-rw-r-- 1 benjy benjy  12M Feb  5 17:55 matplotlib-3.6.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
One thing you can do is set
[python]
run_against_entire_lockfile = true
in pants.toml
This will create one big requirements.pex for your entire lockfile, and run all the tests against that, instead of subsetting
It means of course that all your tests will invalidate if any requirement changes, even if the test doesn't care about that requirement
That's the tradeoff
a
got it! thanks so much for looking into this. Is there an option to set run_against_entire_lockfile just for tests? I would not want all my deployments / Docker images etc. to become huge because of this flag
h
The documentation of that option states:
This option does not affect packaging deployable artifacts, such as PEX files, wheels and cloud functions, which will still use just the exact subset of requirements needed.
So you're good!
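Putting the pieces from this thread together, the relevant [python] section of pants.toml would end up roughly like this - a sketch that combines the interpreter constraint from earlier with the new option:

[python]
interpreter_constraints = ["==3.9.*"]
# Tests reuse one big requirements.pex built from the whole lockfile;
# packaged artifacts (PEXes, wheels, etc.) still get only their exact subset.
run_against_entire_lockfile = true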
a
cool, thx!
a
With these heavy packages: https://pantsbuild.slack.com/archives/C046T6T9U/p1675620054923909?thread_ts=1675531860.346739&cid=C046T6T9U Is there no way to make this run faster? Each file edit always tries to rebuild the pex:
./pants --loop package ::
And it takes on the order of 250 seconds for me. Requirements I am using:
ansicolors==1.1.8
setuptools>=56.2.0,<57
types-setuptools>=56.2.0,<58
pytest==7.1.3
fire==0.5.0
requests==2.28.2
types-requests==2.28.11.8
coloredlogs==15.0.1
pandas
transformers==4.26.0
datasets==2.9.0
torch==1.13.1
torchvision==0.14.1
firebase-admin==6.1.0
And the corresponding BUILD targets:
python_requirements(name="reqs")

python_requirement(
    name="fire",
    requirements=["fire==0.5.0"],
)

python_requirement(
    name="ansicolors",
    requirements=["ansicolors==1.1.8"],
)

python_requirement(
    name="requests",
    requirements=["requests==2.28.2"],
)

python_requirement(
    name="pandas",
    requirements=["pandas"],
)

python_requirement(
    name="transformers",
    requirements=["transformers==4.26.0"],
)

python_requirement(
    name="datasets",
    requirements=["datasets==2.9.0"],
)

python_requirement(
    name="torch",
    requirements=["torch==1.13.1"],
)

python_requirement(
    name="torchvision",
    requirements=["torchvision==0.14.1"],
)

python_requirement(
    name="firebase-admin",
    requirements=["firebase-admin==6.1.0"],
)
h
I assume most of that time is due to re-zipping the pexes, since they must contain both your code (which you're editing) and all the large third-party deps
./pants --loop package :: is an unusual idiom though
Usually people use --loop with test or fmt
And then package once at the end when satisfied
a
I was thinking of using ./pants --loop package :: to create main.pex so that I can avoid running ./pants run and instead run main.pex directly, since building is so slow.
Is there any way to cache the large third-party deps? Bazel seems to be able to cache them, but Pants cannot?
h
What are you doing with the resulting pex in that loop?
Pants is caching everything
but creating a new zipfile takes time, if you have large requirements
a
after the resulting pex is created, I try to run it: ./main.pex
h
How many pex_binary targets do you have?
So why not do ./pants --loop run path/to/main.py?
That should be much, much faster, since it doesn't have to build a pex
a
I just have two pex_binary targets.
pex_binary(
    name="main",
    entry_point="main.py",
    dependencies=[
        "pipelines:a",
        "pipelines:b",
        "pipelines:c",
        "pipelines:d",
        "3rdparty/python:reqs#pandas",
        "3rdparty/python:reqs#fire",
        "3rdparty/python:reqs#ansicolors",
        "3rdparty/python:reqs#coloredlogs",
    ],
)
ok yea, it looks like it is much faster. Thx!
h
Great!