# general
c
Hello 👖 Pants community! What’s the recommended workflow for requirements which have different dependencies in different environments? We develop on Mac, but our CI (GitHub Actions) runs on Linux. PyTorch is a tough customer in that it requires NVIDIA packages on Linux, but not on Mac. We’d like to generate lockfiles locally, and have these subsequently work on CI. The NVIDIA dependencies don’t show up in the lockfiles when they’re generated locally. I’ve tried adding them explicitly, as well as hacking them into the lockfile etc., but can’t get CI to happily install PyTorch. Is there a recommended workflow for this? Do we need to have multiple resolves? Is there some other workflow or hack we can use to keep a single resolve (would be preferable!)? Thanks in advance 🙏🙏
b
h
Ugh pytorch is a repeat offender. Searching this slack for
pytorch
should yield some workarounds.
👍 1
b
Summary of my solution (I'd like to post the whole code but I have to check with my company, for copyright reasons). We have a
base_requirements.txt
file with the loose requirements (
numpy  >= 1.21.5, < 2.0.0
). Our installation script exports it to a lockfile
base_requirements.lock
by calling
pex
directly if any of those requirements has changed (or if the script was called with a
--force
flag). This can be done in any environment, so we make no assumption about which version of torch ends up in it. To check whether the requirements changed, I read the current lockfile, filter out the header, load the rest with json into a dictionary
lock_d
, compare
lock_d["requirements"]
with
base_requirements.txt
. Our installation script then extracts the pinned version of every package (including transitive dependencies) by accessing
lock_d["locked_resolves"][0]["locked_requirements"]
. The versions of torch and torchvision need to be parsed because they may contain a "+cpu" suffix. The script then writes three requirements files
linux_cpu.txt
,
linux_gpu.txt
and
macos_cpu.txt
by dumping the pinned requirements and only changing the versions for
torch
and
torchvision
. We then use
pex
to generate the lockfiles for these three requirements files (if they have changed since the last time the lockfiles were generated). This part looks like this:
requirements_path = REQUIREMENTS_DIR / f"{resolve_name}.txt"
lockfile_path = LOCKFILES_DIR / f"{resolve_name}.lock"

# Build the `pex3 lock create` command for one resolve.
pex_cmd = [
    "pex3", "lock", "create",
    "--index-url", "https://pypi.org/simple/",
    "--style", "universal",
    "--resolver-version", "pip-2020-resolver",
    "--interpreter-constraint", python_constraint,
    "-r", str(requirements_path),
    "-o", str(lockfile_path),
    "--pex-root", str(pex_root),
    "--jobs", "6",
]
# One --target-system flag per entry in target_systems.
for target_system in target_systems:
    pex_cmd += ["--target-system", target_system]

# Each extra index is passed as another --index-url flag.
for index_url in extra_index_urls:
    pex_cmd += ["--index-url", index_url]

pex_cmd_str = " ".join(pex_cmd)
LOG.info(f"Run {pex_cmd_str}")
subprocess.run(pex_cmd)
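For readers who want something concrete for the steps that come before this command (change detection, extracting the pins, rewriting the torch versions), here is a rough sketch. It is an illustration written from the description above, not the script itself. All function names are hypothetical. It assumes that header lines start with "//" (as in Pants-generated lockfiles), that each entry in "locked_requirements" has "project_name" and "version" keys, and that the local-version suffix to use per platform is passed in by the caller. The requirement normalization is deliberately crude; the `packaging` library would be a more robust choice.

```python
import json
import re

# Projects whose version gets rewritten per platform (from the description above).
TORCH_PROJECTS = {"torch", "torchvision"}


def load_lock(text: str) -> dict:
    """Parse lockfile text, skipping header lines.

    Assumption: header lines start with "//".
    """
    body = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("//")
    )
    return json.loads(body)


def normalize(requirement: str) -> str:
    """Crude normalization: drop whitespace, lowercase the name, sort specifiers.

    Does not handle extras or environment markers.
    """
    compact = re.sub(r"\s+", "", requirement)
    match = re.match(r"([A-Za-z0-9._-]+)(.*)", compact)
    if match is None:
        raise ValueError(f"Cannot parse requirement: {requirement!r}")
    name, specifiers = match.group(1).lower(), match.group(2)
    return name + ",".join(sorted(s for s in specifiers.split(",") if s))


def requirements_changed(lock_d: dict, requirements_txt: str) -> bool:
    """Compare lock_d["requirements"] with the loose requirements file."""
    wanted = {
        normalize(line)
        for line in requirements_txt.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    locked = {normalize(r) for r in lock_d["requirements"]}
    return wanted != locked


def pinned_versions(lock_d: dict) -> dict:
    """Map every locked project (transitive ones included) to its version."""
    locked = lock_d["locked_resolves"][0]["locked_requirements"]
    return {entry["project_name"]: entry["version"] for entry in locked}


def render_requirements(pins: dict, local_suffix: str) -> str:
    """Dump pins as `name==version` lines, swapping the suffix on torch projects."""
    lines = []
    for name, version in sorted(pins.items()):
        if name in TORCH_PROJECTS:
            # Drop any existing local version such as "+cpu", then add ours.
            version = version.split("+", 1)[0] + local_suffix
        lines.append(f"{name}=={version}")
    return "\n".join(lines) + "\n"
```

Under these assumptions, something like render_requirements(pins, "+cpu") would produce the content for linux_cpu.txt and render_requirements(pins, "") the content for a file whose torch wheels carry no local version. Which suffix belongs to which platform depends on the indexes you install from.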
We automatically create symbolic links mapping
default.txt
to either
linux_cpu.txt
,
linux_gpu.txt
or
macos_cpu.txt
depending on the platform detected by python and the availability of CUDA according to
nvidia-smi
. Similarly, the script links
default.lock
to one of the three lockfiles. If those links already exist, we don't overwrite them (so a developer who has CUDA but wants to work with CPU only needs to change the symbolic link once). The paths of those two symbolic links are added to
.gitignore
to avoid sharing them in git. There may be a better way to do it (I'm still kinda new to Pants) and I certainly hope there will be a cleaner/standard way to do it in the near future.
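The symbolic-link step described above could look roughly like the following. Again this is a sketch with hypothetical function names, not the actual script. It assumes that finding an nvidia-smi executable on the PATH is a good enough signal for CUDA (the real check may run the tool instead), and that anything that is not macOS is treated as Linux.

```python
import platform
import shutil
from pathlib import Path


def detect_resolve(system: str, has_nvidia_smi: bool) -> str:
    """Pick one of the three resolve names from the platform and CUDA signal."""
    if system == "Darwin":
        return "macos_cpu"
    return "linux_gpu" if has_nvidia_smi else "linux_cpu"


def current_resolve() -> str:
    """Detect the resolve for this machine.

    Assumption: nvidia-smi being on the PATH means CUDA is available.
    """
    return detect_resolve(platform.system(), shutil.which("nvidia-smi") is not None)


def link_default(directory: Path, suffix: str, resolve_name: str) -> Path:
    """Create default<suffix> -> <resolve_name><suffix> unless it already exists.

    An existing link (or file) is left untouched so a manual choice survives.
    """
    link = directory / f"default{suffix}"
    if link.is_symlink() or link.exists():
        return link
    # Relative target, so the link stays valid if the directory is moved.
    link.symlink_to(f"{resolve_name}{suffix}")
    return link
```

Calling link_default(REQUIREMENTS_DIR, ".txt", current_resolve()) and link_default(LOCKFILES_DIR, ".lock", current_resolve()) would then set up both links on a fresh checkout and do nothing on later runs.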
🙏 2
h
Thanks for the summary! We should probably document this on our docs site
👍 1
Since people keep hitting it
c
Thanks for the 🧠 dump @boundless-ambulance-11161!! That makes sense. I’ll look to implement something similar tomorrow…
b
Here is the script we use to deal with pytorch (it keeps different lockfiles for different platforms). I got permission from my CEO to share it under the MIT license. A high-level description can be found in my earlier comment in this thread. When solving this problem, I found several high-level descriptions of solutions, but never actual code. I hope it saves some of you time. If you find bugs or improve the script, I'd appreciate it if you'd share back. In particular, there is probably a better way to edit the lockfiles or to call
pex
.
🙏 2
😻 2
🚀 1
❤️ 1
b
Thank you for contributing this!
c
@boundless-ambulance-11161 — thank you!!