# general
b
Hi! I'm adding lockfiles to my project but I'm running into a problem. My machine is a Linux machine, as is the prod environment. Most of my colleagues run on Mac. When I generate lockfiles, for `torch` I get some `linux_x86_64.whl` wheels, which don't work for my colleagues' machines when they're running under Docker, which expects `manylinux_2_31_aarch64.whl` wheels. What can we do?
g
Do those wheels even exist? As far as I can tell the closest match is manylinux2014_aarch64, or manylinux_2_17_aarch64 -- I'd expect the latter to work; but they only seem to exist on https://download.pytorch.org/whl/torch/. You might also want to poke around with `pex3 interpreter inspect -m -t` inside that container and see what shows up. And check whether you can install it without involving pants/pex.
b
The problem seems to be that our dev environments aren't uniform. My machine, like the prod machines, runs on x64 Linux, but most of my colleagues run on Mac using Docker. And because of the M chips, my lockfile (requiring x64) doesn't work for them. I guess I have to convince everyone to switch to Linux 🙂
g
But that shouldn't be a problem, Pants lockfiles aren't tied to whoever created them. For example, I generate our locks on a Linux machine, but for torch we have `manylinux2014_aarch64`, `manylinux1_x86_64`, `macos_10_9_x86_64`, and `macos_11_0_arm64`.
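For reference, a multi-platform setup like that comes from Pants' named-resolves feature. A sketch of what the config might look like, hedged: the resolve names and lockfile paths here are illustrative, not from the conversation — check the Pants docs for the exact options in your version:

```toml
# pants.toml (sketch -- resolve names and paths are made up for illustration)
[python]
enable_resolves = true

[python.resolves]
base = "3rdparty/python/base.lock"          # mac + generic linux
linux_gpu = "3rdparty/python/linux_gpu.lock"  # the +cu... torch variant
```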
b
Did you need to do anything special? Because when I created the lockfiles for Linux they worked fine on my machine and the prod servers, but not for my colleagues running docker compose on Mac. The error message complained about an incompatible package (the lockfile had a Linux x64 wheel, while their computer wanted a Linux aarch64 wheel).
g
Hmm, what files does it list for torch? Is this one of the +abi variants or just base?
b
This is the error we got when using the Linux lock I generated, on Docker for Mac.
If I'm reading it correctly, the lock selected an x64 wheel, which is incompatible with ARM.
g
I think what it's saying is that your lockfile doesn't contain anything it could select, because the only thing it can select is x86_64. Which is true: there's no aarch64 variant with a +cpu local version specifier.
These files exist, but aren't selected because of the +cpu constraint:
They're even in the `/cpu/` index you're using, just... without the tag. Because if there's one thing Torch isn't, it's consistent.
b
Urgh... Thanks... So if I can force it to use the `/cpu/` index it will work for all Linux distributions? 🤞
g
Nope! More resolves is what you need, unfortunately. Because a local version specifier has the highest priority (per PEP 440), anytime pex (really pip) sees one it'll use it. And because pex (via Pants) requires the exact same version on all platforms, you'll get a +cpu inferred for Mac too. https://github.com/pantsbuild/pants/issues/18965 So for this to work on aarch64 you need a resolve which does see that index (or another index which has a manylinux+aarch64 wheel) but isn't allowed to pick any local version specifier.
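Both halves of that rule can be checked directly with the `packaging` library (pip's vendored PEP 440 implementation) — a small sketch:

```python
# Per PEP 440, a local version ("+cpu") sorts *after* the corresponding
# public version, so a resolver picking the highest matching candidate
# prefers 1.12.0+cpu over plain 1.12.0.
from packaging.version import Version
from packaging.specifiers import SpecifierSet

assert Version("1.12.0+cpu") > Version("1.12.0")

# And "==1.12.0" (no local label) still *matches* 1.12.0+cpu, which is
# how the +cpu pin leaks into the mac side of a shared resolve.
assert SpecifierSet("==1.12.0").contains("1.12.0+cpu")
```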
b
Another resolve? That will be the 4th: linux, mac, linux_gpu, and now linux_arm? I hate pytorch 😂
g
This is what we use to set that up at my work:
`torch!=1.12.0+cpu,!=1.12.0+cu116,==1.12.0`
specifically excluding all local version specs it could pick
I think mac + linux_arm can be folded, we call it base
Pretty much "everything where torch requires special handling"
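The way that constraint behaves under PEP 440 matching can be demonstrated with the `packaging` library — a sketch using the versions from the example above:

```python
# "!=<ver>+<local>" excludes exactly that local variant, while "==<ver>"
# (no local label) matches every local variant of that version.
from packaging.specifiers import SpecifierSet

spec = SpecifierSet("!=1.12.0+cpu,!=1.12.0+cu116,==1.12.0")

assert spec.contains("1.12.0")            # the plain wheel: allowed
assert not spec.contains("1.12.0+cpu")    # explicitly excluded
assert not spec.contains("1.12.0+cu116")  # explicitly excluded

# Note the catch: every local variant an index offers has to be listed.
# An unlisted one (say a hypothetical 1.12.0+cu113) still matches "==1.12.0":
assert spec.contains("1.12.0+cu113")
```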
b
Ah, I see! That's a great idea!
g
This is what that constraint gets us for aarch64, which I'd imagine would be valid in your docker env too:
(As well as a base set of mac wheels)
Funnily enough, the wheels you get from torch's own website and the ones from PyPI also have different sha256s despite the wheel names being identical.
b
> Funnily enough, the wheels you get from torch's own website and the ones from PyPI also have different sha256s despite the wheel names being identical.
Lol. Torch is a mess...
Thanks a lot for your help, I think I can now make resolves work for us. The reduction in build time is substantial!
g
Nice! GLHF 😄 Please let me know if you run into more issues. And feel free to open issues etc. if you find actionable things... I've filed a few, but it's hard to find fixes that aren't very explicitly hacks around Torch.
b
I will, thanks a lot again!