# general
a
I wonder if there are any updates / things in the works for reducing build times of large requirements files. I know this issue has come up a few times in this channel. I, for instance, use Pants in a large project that has 60 external requirements (some of them are huge, like pytorch). Building a subset requirements.pex per target can take 5-15 mins. I acknowledge this is a lot of requirements, but it is still very slow (compared to pip, for example). A similar question applies to generate-lockfiles.
👀 2
f
joining the question
a
vice versa:)
or, is there a way to cache these using a remote cache solution?
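(On the remote-cache angle: Pants can read and write process results to a remote store. A minimal pants.toml sketch — the store address is a placeholder, and option names are as of Pants 2.x:)

```toml
[GLOBAL]
# Enable reading and writing cached process results from a remote store.
remote_cache_read = true
remote_cache_write = true
# Placeholder address -- point this at your REAPI-compatible cache server.
remote_store_address = "grpc://cache.example.com:9092"
```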
f
are you also using pex with docker env?
a
nope, running plain python_source targets on macOS
e
The 1st question is more difficult to guess at since there are two major factors: Pants sandbox materialization time and Pex subsetting time. The 2nd question is cleaner: the issue is with Pex and the solution is known: https://github.com/pantsbuild/pex/issues/2044
a
Thanks! Re the first - the time is spent predominantly on Pex subsetting (judging by the logs, and by running the pex command manually)
e
Ok. For the 1st, an actual lockfile plus the subset requirements list (basically the full input to the Pex command) would be useful, filed as a Pex issue, to explore speed-ups. When last I checked, the subset calculation was ~500ms for large lockfiles and large subsets (worst case), and the time was dominated by the IO of creating the subsequent PEX from that. Now, Pants uses --layout packed, which should no-op for every "wheel" in the subset and get it from the cache, but that's worth drilling in on with a real example.

A thing that is critical to note will be the locations of sandboxes (default /tmp) and named_caches (default ~/.cache/pants/named_caches). If these are on different filesystems, then the default hard links fall back to copies, which, of course, is expensive. Either way, though, packed PEX construction is not currently parallelized, and could be.
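(A sketch of why the sandbox/named_caches filesystem placement matters. The general pattern is: try the cheap hard link from the cache into the build location; if the link would cross filesystems it fails, and a full copy happens instead. Paths here are illustrative stand-ins, not the real cache layout.)

```shell
cache=$(mktemp -d)    # stand-in for ~/.cache/pants/named_caches
sandbox=$(mktemp -d)  # stand-in for the Pants execution sandbox
echo "wheel bytes" > "$cache/example.whl"
# Try the cheap hard link first; fall back to an expensive copy on failure
# (e.g. EXDEV when cache and sandbox live on different filesystems).
ln "$cache/example.whl" "$sandbox/example.whl" 2>/dev/null \
  || cp "$cache/example.whl" "$sandbox/example.whl"
cmp -s "$cache/example.whl" "$sandbox/example.whl" && echo "linked or copied OK"
```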
a
Pex command (lockfile attached) -
.../bin/python ./pex --tmpdir .tmp --jobs 8 --pip-version 22.3 --python-path /Users/vova/.pyenv/versions/3.9.7/bin:/usr/local/Cellar/pyenv-virtualenv/1.1.5/shims:/Users/vova/.pyenv/shims:/Users/vova/.pyenv/bin:/opt/pycharm-2021.2.3/bin:/opt/idea-IU-212.5284.40/bin:/Users/vova/go/bin:/Users/vova/.local/lib/go/bin:/Users/vova/.local/bin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin --output-file main.py.pex --no-emit-warnings --no-strip-pex-env --requirements-pex local_dists.pex --venv --seed verbose --no-venv-site-packages-copies --python /usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/bin/python3.9 --entry-point nl_python.main --sources-directory="source_files" av aws_lambda_powertools==1.25.7 boto3 botocore==1.29.50 datapane==0.15.4 deepdiff dirsync fastparquet ffmpeg_python flask flask_cors flask_socketio frozendict gitpython grpcio jinja2 joblib jsonpath_ng jupyter jupyterlab matplotlib mediapipe==0.8.9.1 mss mypy_boto3_s3 numpy opencv_python pandas persist_queue pillow plotly protobuf psutil pyarrow pygments pynput 'pyobjc_framework_cocoa; platform_system == "Darwin"' pytest python_daemon python_dateutil pytorch_lightning pytz pyyaml pyzipper remodnav requests sagemaker scikit-learn scikit_learn scipy seaborn setuptools slack_sdk spring_config_client streamlit streamlit-ext toml 'torch>=1.13' torchvision tqdm watchdog xgboost --lock third_party/python/default.lock --no-pypi --index="https://pypi.org/simple/" --manylinux manylinux2014 --layout packed
e
I don't have access to local_dists.pex, so I removed that option entirely. I also don't have sources, so I removed that option plus the entrypoint.
1st run after rm -rf ~/.pex for a cold cache:
...
Needed cp39-cp39-manylinux_2_35_x86_64 compatible dependencies for:
 1: nvidia-cuda-nvrtc-cu11==11.7.99; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cuda-nvrtc-cu11', normalized='nvidia-cuda-nvrtc-cu11') distributions.
 2: nvidia-cuda-runtime-cu11==11.7.99; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cuda-runtime-cu11', normalized='nvidia-cuda-runtime-cu11') distributions.
 3: nvidia-cuda-cupti-cu11==11.7.101; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cuda-cupti-cu11', normalized='nvidia-cuda-cupti-cu11') distributions.
 4: nvidia-cudnn-cu11==8.5.0.96; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cudnn-cu11', normalized='nvidia-cudnn-cu11') distributions.
 5: nvidia-cublas-cu11==11.10.3.66; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cublas-cu11', normalized='nvidia-cublas-cu11') distributions.
 6: nvidia-cufft-cu11==10.9.0.58; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cufft-cu11', normalized='nvidia-cufft-cu11') distributions.
 7: nvidia-curand-cu11==10.2.10.91; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-curand-cu11', normalized='nvidia-curand-cu11') distributions.
 8: nvidia-cusolver-cu11==11.4.0.1; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cusolver-cu11', normalized='nvidia-cusolver-cu11') distributions.
 9: nvidia-cusparse-cu11==11.7.4.91; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-cusparse-cu11', normalized='nvidia-cusparse-cu11') distributions.
 10: nvidia-nccl-cu11==2.14.3; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-nccl-cu11', normalized='nvidia-nccl-cu11') distributions.
 11: nvidia-nvtx-cu11==11.7.91; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='nvidia-nvtx-cu11', normalized='nvidia-nvtx-cu11') distributions.
 12: triton==2.0.0; platform_system == "Linux" and platform_machine == "x86_64"
    Required by:
      torch 2.0.1
    But this pex had no ProjectName(raw='triton', normalized='triton') distributions.
Failed to resolve requirements from PEX environment @ /home/jsirois/.pex/unzipped_pexes/f4fdcfb4fe4751b027192c5811503b32ee7743be.

real    19m42.430s
user    9m29.931s
sys     0m49.984s
So that's ~20 minutes cold, with the post-processing validation failing. I treated that post-processing as inconsequential for these purposes and used --ignore-errors for a 2nd run:
$ time ./repro.sh
{"pex_root": "/home/jsirois/.pex", "python": "/home/jsirois/.pyenv/versions/3.9.17/bin/python3.9", "pex": "/home/jsirois/.pex/venvs/46e3bb631036d8dfab394bac2eaff7abdc121019/532d53e33a68b7f477bce2fd4c5178ae3308162f/pex"}

real    0m5.745s
user    0m4.871s
sys     0m0.696s
This is typical, always ~6s. With PEX_VERBOSE=3, the breakdown shows as:
pex:     Resolving requirements from lock file default.lock: 4567.8ms
pex:       Parsing requirements: 9.6ms
pex:       Resolving urls to fetch for 61 requirements from lock default.lock: 80.2ms
pex:       Hashing pex: 17.8ms
pex:       Isolating pex: 0.1ms
pex:       Downloading 248 distributions to satisfy 61 requirements: 120.4ms
pex:       Categorizing 248 downloaded artifacts: 1.2ms
pex:       Building 8 artifacts and installing 248: 4335.0ms
That's as expected - the time is dominated by the install (creation of the PEX after the resolve / download / build-sdists steps, which are all cache-hit near-no-ops).
So ... I'm interested in the --requirements-pex local_dists.pex bit, which seems to be the major difference here.
h
You can also sidestep the subsetting entirely by setting this option to true: https://www.pantsbuild.org/docs/reference-python#run_against_entire_lockfile
Of course then your tests all depend on the entire lockfile, so any lockfile change will invalidate even tests that didn't actually use the changed requirements
But it may be interesting to at least try that out
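(The option mentioned above, in pants.toml form — this is the Pants [python] scope option, with the trade-off noted:)

```toml
[python]
# Build one PEX from the entire lockfile instead of subsetting per target.
# Trade-off: any lockfile change invalidates every dependent test.
run_against_entire_lockfile = true
```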
n
@enough-analyst-54434 Have you had the chance to resolve that issue?
e
https://github.com/pantsbuild/pex/issues/2044? Clearly not - it's still open, and I haven't started work on it yet (I religiously assign myself to an issue and label it in progress when I'm working on it).
👍 1