quaint-oil-51257
10/29/2023, 2:24 PM
pytorch==2.1.0 and its required dependencies in a pex_binary to run on a Linux machine with CUDA installed.
# src/x.py
import torch
# src/BUILD
pex_binary(
name="test",
entry_point="x.py",
layout="packed",
)
# 3rdparty/BUILD
python_requirement(
name="torch",
requirements=["torch==2.1.0"],
dependencies=[
"3rdparty:reqs#setuptools",
"3rdparty:reqs#triton",
"3rdparty:reqs#nvidia-cublas-cu12",
"3rdparty:reqs#nvidia-cuda-cupti-cu12",
"3rdparty:reqs#nvidia-cuda-nvrtc-cu12",
"3rdparty:reqs#nvidia-cuda-runtime-cu12",
"3rdparty:reqs#nvidia-cudnn-cu12",
"3rdparty:reqs#nvidia-cufft-cu12",
"3rdparty:reqs#nvidia-curand-cu12",
"3rdparty:reqs#nvidia-cusolver-cu12",
"3rdparty:reqs#nvidia-cusparse-cu12",
"3rdparty:reqs#nvidia-nccl-cu12",
"3rdparty:reqs#nvidia-nvjitlink-cu12",
"3rdparty:reqs#nvidia-nvtx-cu12",
],
)
>>> pants run src:test
ImportError: <SHARED OBJECT NAME>.so.12: cannot open shared object file: No such file or directory
The error indicates that I need to include every shared object in the LD_LIBRARY_PATH.
However, if I manually create a venv with the specified deps and run python3 -c "import torch; print(torch.cuda.is_available())", the script runs as expected and picks up the GPU.
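One way to check the LD_LIBRARY_PATH theory is to list every directory that actually holds a shared object and join them into a candidate value. The sketch below is only a diagnostic aid, not a confirmed fix: the helper names `find_shared_object_dirs` and `build_ld_library_path` are hypothetical, and the root directory to scan (for example the site-packages of the working venv, or wherever the PEX unpacks its dependencies) is an assumption you have to supply yourself.

```python
import os
from pathlib import Path


def is_shared_object(path):
    """True for names like libfoo.so or libfoo.so.12 (plain or versioned)."""
    name = path.name
    return name.endswith(".so") or ".so." in name


def find_shared_object_dirs(root):
    """Return the sorted directories under `root` that contain shared objects."""
    dirs = set()
    for path in Path(root).rglob("*"):
        if path.is_file() and is_shared_object(path):
            dirs.add(str(path.parent))
    return sorted(dirs)


def build_ld_library_path(root, existing=""):
    """Join the discovered directories, keeping any existing value at the end."""
    parts = find_shared_object_dirs(root)
    if existing:
        parts.append(existing)
    return os.pathsep.join(parts)
```

Calling `build_ld_library_path(root, os.environ.get("LD_LIBRARY_PATH", ""))` on the venv's site-packages and on the unpacked PEX, then comparing the two results, should show whether the nvidia-* libraries are present in both places. Note that the dynamic loader reads LD_LIBRARY_PATH when the process starts, so the value would need to be exported before launching the binary rather than set from inside x.py.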
I can also provide the full error message if that helps.
broad-processor-92400
10/29/2023, 11:52 PM
cannot open shared object file
in Slack reveals https://pantsbuild.slack.com/archives/C046T6T9U/p1682999088726209 that might be relevant. Does that help?