Hello pants Pants community What s the recommended workflow Pants #general

colossal-pilot-94395

05/22/2023, 3:47 PM

Hello 👖 Pants community! What’s the recommended workflow for requirements which have different dependencies in different environments? We develop on mac, but our CI (GitHub Actions) runs on Linux. Pytorch is a tough customer in that it requires nvidia packages on linux, but not on mac. We’d like to generate lockfiles locally, and have these subsequently work on CI. The nvidia dependencies don’t show up in the lockfiles when they’re generated locally. I’ve tried adding them explicitly, as well as hacking them into the lockfile etc, but can’t get CI to happily install Pytorch. Is there a recommended workflow for this? Do we need to have multiple resolves? Is there some other workflow or hack we can do to use a single resolve (that would be preferable!)? Thanks in advance 🙏🙏

boundless-ambulance-11161

05/22/2023, 4:21 PM

This may be useful to you https://pantsbuild.slack.com/archives/C046T6T9U/p1681901930042989?thread_ts=1664233267.501069&cid=C046T6T9U

🙏 1

happy-kitchen-89482

05/22/2023, 4:26 PM

Ugh pytorch is a repeat offender. Searching this slack for `pytorch` should yield some workarounds.

👍 1

boundless-ambulance-11161

05/22/2023, 4:45 PM

Summary of my solution (I'd like to post the whole code, but I have to check with my company for copyright reasons).

We have a `base_requirements.txt` file with the loose requirements (e.g. `numpy >= 1.21.5, < 2.0.0`). Our installation script exports it to a lockfile, `base_requirements.lock`, by calling `pex` directly if any of those requirements has changed (or if we called the script with a `--force` flag). This can be done in any environment, so we make no assumption about which version of torch is in it.

To check whether the requirements changed, I read the current lockfile, filter out the header, load the rest with json into a dictionary `lock_d`, and compare `lock_d["requirements"]` with `base_requirements.txt`.

Our installation script then extracts the pinned version of every package (including transitive dependencies) by accessing `lock_d["locked_resolves"][0]["locked_requirements"]`. The versions of torch and torchvision need to be parsed because they may contain a "+cpu" suffix. It writes three requirements files, `linux_cpu.txt`, `linux_gpu.txt` and `macos_cpu.txt`, by dumping the pinned requirements and only changing the versions for `torch` and `torchvision`.

We then use `pex` to generate the lockfiles for these three requirements files (if they have changed compared to the last time they were generated). This part looks like this:

    requirements_path = REQUIREMENTS_DIR / f"{resolve_name}.txt"
    lockfile_path = LOCKFILES_DIR / f"{resolve_name}.lock"

    pex_cmd = [
        "pex3", "lock", "create",
        "--index-url", "https://pypi.org/simple/",
        "--style", "universal",
        "--resolver-version", "pip-2020-resolver",
        "--interpreter-constraint", python_constraint,
        "-r", str(requirements_path),
        "-o", str(lockfile_path),
        "--pex-root", str(pex_root),
        "--jobs", "6",
    ]
    for target_system in target_systems:
        pex_cmd += ["--target-system", target_system]

    for index_url in extra_index_urls:
        pex_cmd += ["--index-url", index_url]

    pex_cmd_str = " ".join(pex_cmd)
    LOG.info(f"Run {pex_cmd_str}")
    subprocess.run(pex_cmd)
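The lockfile-reading steps described before the command could look roughly like the sketch below. This is not the script from this thread: the function names are made up for illustration, the header is assumed to be comment lines starting with `//`, and each locked requirement is assumed to carry `project_name` and `version` keys. The requirements comparison is deliberately naive (whitespace-insensitive string equality), so it would report a change if the same specifiers were stored in a different order.

```python
import json
import re


def load_lock(text: str) -> dict:
    """Drop header lines (assumed to start with '//') and parse the rest as JSON."""
    body = "\n".join(
        line for line in text.splitlines() if not line.lstrip().startswith("//")
    )
    return json.loads(body)


def _normalize(requirement: str) -> str:
    # Remove all whitespace and lowercase, so "numpy >= 1.0" equals "numpy>=1.0".
    return re.sub(r"\s+", "", requirement).lower()


def requirements_changed(lock_d: dict, requirements_txt: str) -> bool:
    """Naively compare the loose requirements file with lock_d["requirements"]."""
    wanted = {
        _normalize(line)
        for line in requirements_txt.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    locked = {_normalize(req) for req in lock_d["requirements"]}
    return wanted != locked


def pinned_versions(lock_d: dict) -> dict:
    """Map each locked package to its version, without a local suffix such as '+cpu'."""
    pins = {}
    for locked in lock_d["locked_resolves"][0]["locked_requirements"]:
        pins[locked["project_name"]] = locked["version"].split("+", 1)[0]
    return pins


def render_requirements(pins: dict, overrides: dict) -> str:
    """Render 'name==version' lines, replacing the versions given in overrides."""
    merged = {**pins, **overrides}
    return "".join(f"{name}=={version}\n" for name, version in sorted(merged.items()))
```

A per-platform file such as `linux_cpu.txt` would then be the output of `render_requirements(pins, {...})` with only the `torch` and `torchvision` entries overridden.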

We automatically create a symbolic link mapping `default.txt` to one of `linux_cpu.txt`, `linux_gpu.txt` or `macos_cpu.txt`, depending on the platform detected by python and the availability of CUDA according to `nvidia-smi`. Similarly, the script creates `default.lock` as a link to one of the three lockfiles. If those links already exist, we don't overwrite them (in case a developer has cuda but wants to work with cpu only, they only need to change the symbolic link once). The paths of those 2 symbolic links are added to `.gitignore` to avoid sharing them in git.

There may be a better way to do it (I'm still kinda new to pants) and I certainly hope there will be a cleaner/standard way to do it in the near future.
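A minimal sketch of the link-selection step described above, under a few assumptions: the function names are hypothetical, CUDA is considered available when `nvidia-smi` is on the PATH and exits with status 0, and the links are relative links created next to the platform-specific files. It is an illustration only, not the script shared in this thread.

```python
import platform
import shutil
import subprocess
from pathlib import Path


def has_cuda() -> bool:
    """Return True if nvidia-smi is on the PATH and exits with status 0."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        return subprocess.run([exe], capture_output=True, timeout=30).returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return False


def pick_variant(system: str, cuda: bool) -> str:
    """Choose a resolve name from platform.system() output and CUDA availability."""
    if system == "Darwin":
        return "macos_cpu"
    if system == "Linux":
        return "linux_gpu" if cuda else "linux_cpu"
    raise RuntimeError(f"unsupported platform: {system}")


def link_default(directory: Path, suffix: str, variant: str) -> Path:
    """Create default<suffix> -> <variant><suffix>, never replacing an existing link."""
    link = directory / f"default{suffix}"
    if not link.is_symlink() and not link.exists():
        # Relative target, so the link still works if the directory is moved.
        link.symlink_to(f"{variant}{suffix}")
    return link


if __name__ == "__main__":
    variant = pick_variant(platform.system(), has_cuda())
    print(f"selected resolve: {variant}")
```

Calling `link_default(directory, ".txt", variant)` and `link_default(directory, ".lock", variant)` would create the two links. Because an existing link is left alone, a developer who manually points `default.txt` at `linux_cpu.txt` keeps that choice on later runs.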

🙏 2

happy-kitchen-89482

05/22/2023, 4:47 PM

Thanks for the summary! We should probably document this on our docs site

👍 1

happy-kitchen-89482

05/22/2023, 4:47 PM

Since people keep hitting it

colossal-pilot-94395

05/22/2023, 5:18 PM

Thanks for the 🧠 dump @boundless-ambulance-11161!! That makes sense. I’ll look to implement something similar tomorrow…

boundless-ambulance-11161

05/22/2023, 7:47 PM

Here is the script we use to deal with pytorch (it maintains different lockfiles for different platforms). I got permission from my CEO to share it under the MIT license. A high-level description can be found in my earlier comment in this thread. When solving this problem, I found several high-level descriptions of solutions, but never actual code. I hope it can save some people time. If you find bugs or improve the script, I'd appreciate it if you'd share back. In particular, there is probably a better way to edit the lockfiles or to call `pex`.

update_lockfiles.py

🙏 2

😻 2

🚀 1

❤️ 1

busy-vase-39202

05/22/2023, 8:56 PM

Thank you for contributing this!

colossal-pilot-94395

05/23/2023, 8:19 AM

@boundless-ambulance-11161 — thank you!!
