# general
f
Hi. I'm hitting a problem where adding `albumentations==1.4.8` into requirements.txt is resulting in lockfile generation hanging forever. Anyone else seen this or any tips? `-ldebug` doesn't show what particular step it's hanging on.
b
Sorry for the trouble. As you're aware, the first step might be trying to reproduce this with the "raw" command that pants is invoking, without the pants layers. One approach for doing this might be:
1. `pants --keep-sandbox=always generate-lockfiles ...`
2. Once it seems to be running a hung process, find it. I'm guessing this'll be a pex process... You might have to do a bit of detective work like `ps aux | grep [p]ex` to find which it is (or maybe start with `grep "[p]ex.*lock.*create"`, on the assumption the hung process is running the `pex3 lock create` lockfile creation command)
3. Induce a failure, e.g. `kill <pid>`
4. Jump into the sandbox and work with the `__run.sh` to narrow down the issue
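For reference, steps 2 and 3 could be sketched roughly like this. The helper name `lock_create_pids` is made up for illustration, and it assumes the PID is the second column of `ps aux` output, which is the usual layout on Linux but worth checking on your machine.

```shell
# Step 1 (first terminal): keep the sandbox around so it can be inspected later.
#   pants --keep-sandbox=always generate-lockfiles ...

# Step 2 (second terminal): filter `ps aux` output down to the PIDs of
# processes that look like a pex lock creation.
# The [p] bracket trick stops the grep process itself from matching
# when this is fed from a live `ps aux`.
lock_create_pids() {
  grep -E '[p]ex.*lock.*create' | awk '{ print $2 }'
}

# Usage:
#   ps aux | lock_create_pids     # prints one PID per line
# Step 3: induce a failure with one of the printed PIDs.
#   kill <pid>
# Step 4: cd into the preserved sandbox and run ./__run.sh by hand.
```

If nothing is printed, the hung process may not match the pattern, and falling back to the broader `ps aux | grep [p]ex` is probably the next thing to try.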
f
Thanks for that! Ok I've done those steps and I'm now running the command in the `__run.sh` which is:
env -i CPPFLAGS= HOME=/home/matt LANG=en_GB.UTF-8 LDFLAGS= PATH=$'/home/matt/.vscode-server/cli/servers/Stable-5437499feb04f7a586f677b155b039bc2b3669eb/server/bin/remote-cli:/home/matt/.local/bin:/home/matt/miniconda3/envs/base3.10/bin:/home/matt/miniconda3/condabin:/usr/local/cuda-12.1/bin:/usr/local/cuda/bin:/opt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/imanol/.local/bin:/snap/bin:/usr/local/cuda-11.8/bin' PEX_IGNORE_RCFILES=true PEX_PYTHON=/home/matt/.cache/nce/3d6643e46b53e4cc0b2a0d5c768866226ddce3de1f57f80c4a02d8d39800fa8e/bindings/venvs/2.18.0/bin/python3.9 PEX_ROOT=.cache/pex_root PEX_SCRIPT=pex3 SSL_CERT_DIR=/usr/lib/ssl/certs SSL_CERT_FILE=/usr/lib/ssl/certs/ca-certificates.crt /home/matt/.cache/nce/3d6643e46b53e4cc0b2a0d5c768866226ddce3de1f57f80c4a02d8d39800fa8e/bindings/venvs/2.18.0/bin/python3.9 ./pex lock create --tmpdir .tmp --python-path $'/home/matt/.vscode-server/cli/servers/Stable-5437499feb04f7a586f677b155b039bc2b3669eb/server/bin/remote-cli:/home/matt/.local/bin:/home/matt/miniconda3/envs/base3.10/bin:/home/matt/miniconda3/condabin:/usr/local/cuda-12.1/bin:/usr/local/cuda/bin:/opt/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/imanol/.local/bin:/snap/bin:/usr/local/cuda-11.8/bin' $'--output=lock.json' --no-emit-warnings $'--style=universal' --pip-version 23.1.2 --resolver-version pip-2020-resolver --target-system linux --target-system mac $'--indent=2' --no-pypi $'--index=<https://pypi.org/simple/>' $'--index=<https://download.pytorch.org/whl/>' --manylinux manylinux2014 --interpreter-constraint $'CPython>=3.10' Cython $'GPUtil~=1.4.0' accelerate $'albumentations==1.4.8' $'av==10.0.0' $'aws_secretsmanager_caching==1.1.1.5' boto3 $'certifi>=2020.12.5' $'confluent-kafka==1.6.0' $'dacite==1.6.0' $'diffusers==0.28.0' docker $'einops~=0.3.0' $'fastavro==1.4.0' $'ffmpeg-python~=0.2.0' $'ftfy==6.2.0' fvcore h5py $'hydra-core>=1.3.2' imgaug 
$'jinja2~=3.0.3' $'jsonschema==3.2.0' kornia $'matplotlib~=3.7.1' $'mean_average_precision==2021.4.26.0' $'metaflow==2.7.19' $'mlflow==2.10.0' mmcv-lite $'moto==4.2.4' natsort $'numpy~=1.24.4' nvidia-cublas-cu12 nvidia-cuda-cupti-cu12 nvidia-cuda-nvrtc-cu12 nvidia-cuda-runtime-cu12 nvidia-cudnn-cu12 nvidia-cufft-cu12 nvidia-curand-cu12 nvidia-cusolver-cu12 nvidia-cusparse-cu12 nvidia-nccl-cu12 nvidia-nvtx-cu12 omegaconf $'onnx==1.15.0' $'onnxruntime-gpu==1.15.1' $'opencv-python~=4.7.0.72' $'openmim==0.3.7' $'openpyxl==3.0.10' $'oyaml~=0.9' $'pandas~=1.5.3' parameterized pillow $'progressbar2~=3.51.0' $'psycopg2-binary==2.9.6' pyarrow $'pycuda==2024.1' $'pydantic==2.7.1' pymongo pynamodb pytest-cov pytest-mock $'pytest-reportlog==0.1.2' $'pytest==7.4.0' $'python-json-logger~=0.1.11' $'regex==2023.12.25' requests s3fs $'scikit_image~=0.21.0' $'scikit_learn==1.4.2' $'scipy==1.12.0' $'seaborn~=0.12.2' $'segmentation-models-pytorch==0.3.3' $'semver==2.13.0' $'sentry-sdk==1.45.0' $'setuptools~=69.0.0' $'sqlalchemy==2.0.13' $'tabulate~=0.8.10' $'tensorrt==8.6.1' $'testing.postgresql==1.3.0' timm $'torch==2.1.1+cu121' torch_optimizer $'torchvision==0.16.1+cu121' $'tqdm~=4.45.0' transformers $'triton==2.1.0' $'ujson==5.8.0' $'weasyprint==51' webcolors xformers xlsxwriter
That command hangs. Using -vvv shows another command that it appears to hang on. I might investigate that, but I'm not seeing any errors, so whatever is going wrong is being hidden by something.
I can't run the command used by pex as it complains about `ModuleNotFoundError: No module named '_pex_pip_patches'`, which I guess means I'm in the wrong folder for that. So essentially this seems to be pex hanging on this installation.
The lockfile generation worked before adding the albumentations requirement, so it seems to be something about that. Repeating the pex command above with only albumentations does work and generates a lockfile. I'm now trying the pex command with a larger timeout but only 1 retry. If that fails I will try without the torch requirement, as that's large and, together with albumentations, may be causing some sort of timeout.
After some trial and error I managed to find some requirements that had to be removed, then it worked. Nothing in the logs to help much with that process though.
At least this is resolved.
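The trial-and-error removal described above could be partly automated along these lines. This is only a sketch: `find_suspects` is a made-up helper, and the command you pass to it is assumed to be your own wrapper around the `pex lock create` invocation from `__run.sh`, run under coreutils `timeout` so a hang turns into a failure. It only tests each requirement paired with the pinned one, so it would miss a hang that needs three or more requirements together.

```shell
# find_suspects TRY_CMD PINNED
# Reads requirement strings from stdin, one per line, and prints each one
# for which `TRY_CMD PINNED REQ` fails (non-zero exit, e.g. a timeout).
find_suspects() {
  try_cmd=$1
  pinned=$2
  while IFS= read -r req; do
    # Skip blank lines in the requirements list.
    [ -n "$req" ] || continue
    # </dev/null keeps the command from swallowing the remaining input lines.
    if ! "$try_cmd" "$pinned" "$req" </dev/null >/dev/null 2>&1; then
      printf '%s\n' "$req"
    fi
  done
}

# Usage (hypothetical wrapper; fill in the real arguments from __run.sh):
#   try_lock() { timeout 600 <pex lock create command from __run.sh> "$@"; }
#   find_suspects try_lock 'albumentations==1.4.8' < requirements.txt
```

`timeout` exits with status 124 when the time limit is hit, so a hung resolve is reported the same way as an outright failure. Pick a limit comfortably longer than a normal resolve takes, since large wheels such as torch can be slow to download.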
b
Sorry for the trouble. I agree debugging slow lockfile generation can be hard!