Hm, thought this was obvious but while I've found a workaround... (Pants #general)

echoing-farmer-15630

05/05/2022, 12:55 PM

Hm, thought this was obvious but while I've found a workaround I'd like clarification. We use one package, `jax`, that needs some extras in most contexts (`jax[cpu]`) and in one particular context needs different extras (`jax[mystical_cuda_requirements]`). Honestly this should be true for `torch` as well but the cuda/cpu builds there are hellacious (and this is why we can't have nice things like sub-2GB containers). Anyway. We have pinned versions of packages in a constraints file, but at least a while ago you couldn't put extras in there (haven't tried again under the new resolvers). I generate a `python_requirements` based on that constraints.txt file, but if I have `jax` in the constraints file, it creates a target without any extras which does no one any good. I can't figure out how to have extras in a `python_requirements` rule, so I have a separate requirement as:

python_requirement(
    name="jax", requirements=["jax[cpu]==0.3.10"], modules=["jax", "jaxlib"]
)

...and my guess for when I need the cuda version is to create a `name="jaxcuda"` which also provides the same modules and resolve in the necessary target. Maybe. Is there a clean way to handle this sort of situation? I scanned the documentation but got mildly confused... I'd really prefer a "use the CPU unless specifically declared otherwise" rather than two separate "jax" targets which have to be resolved for every client target if we can...
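A rough BUILD-file sketch of the two-target idea described above, for illustration only. The `jaxcuda` name and the `jax[mystical_cuda_requirements]` extra are the placeholders from the message; the `gpu_lib` target is hypothetical, and the sketch assumes both requirements can live in the same lockfile.

```python
# BUILD (sketch; not taken from a working repository)

# Default CPU build, as in the message above.
python_requirement(
    name="jax",
    requirements=["jax[cpu]==0.3.10"],
    modules=["jax", "jaxlib"],
)

# Second target claiming the same modules. The extra name is the
# placeholder from the thread, not a real jax extra.
python_requirement(
    name="jaxcuda",
    requirements=["jax[mystical_cuda_requirements]==0.3.10"],
    modules=["jax", "jaxlib"],
)

# Hypothetical target that needs the CUDA build: depend on it explicitly
# and ignore the CPU target with the "!" prefix.
python_sources(
    name="gpu_lib",
    dependencies=[":jaxcuda", "!:jax"],
)
```

The drawback is the one the question anticipates: once two targets own the `jax` module, Pants can no longer infer which one an `import jax` refers to, so every other target that imports it would also need an explicit dependency.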

bitter-ability-32190

05/05/2022, 12:57 PM

Are you able to upgrade to Pants 2.11 and try out the PEX-based lockfiles? I can confirm they handle extras.

bitter-ability-32190

05/05/2022, 12:57 PM

https://blog.pantsbuild.org/introducing-pants-2-11/

echoing-farmer-15630

05/05/2022, 1:10 PM

Hm... am using 2.11 now actually. Excellent and missed that in the patch notes. Will give it a go shortly.

echoing-farmer-15630

05/05/2022, 1:28 PM

Yep, that did the trick; many thanks. Any suggestions on the "changed extra" there when I need the cuda requirements in only some cases? Do I make a secondary target with a separate `python_requirement` and list that in the dependencies for the one target which needs cuda?

bitter-ability-32190

05/05/2022, 1:28 PM

(The PEX lockfiles are also much faster to consume, enjoy the extra perf 😉 )

bitter-ability-32190

05/05/2022, 1:29 PM

Any suggestions on the "changed extra" there when I need the cuda requirements in only some cases?

I'm told that's a key feature for `parametrize` but admittedly I haven't put all the pieces together in my head.

echoing-farmer-15630

05/05/2022, 1:30 PM

That performance part has been nicer, yes. We were actually already using the pex resolver, but with a "cleaned" constraints.txt that didn't have the extras. I'll see if I understand `parametrize`; the description doesn't quite match what I need (although I don't need it yet).

bitter-ability-32190

05/05/2022, 1:30 PM

Yeah let me ping @hundreds-father-404 they're wiser than me here

echoing-farmer-15630

05/05/2022, 1:33 PM

ah well, damn; I forgot that later on in the cycle we install via `pip ... -c constraints.txt` and so pip still doesn't handle that. Eh, I'll do my workaround for now (I can't use pexes for docker builds because some of my folks use macs and I don't have the bandwidth right now to generate requirements.txt files via pants for each dockerfile to use). Thanks, though, will make notes that this is handled.
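The "cleaned" constraints.txt mentioned earlier could be produced with something like the sketch below. It is a minimal illustration, not the poster's actual script: `strip_extras` and `to_constraints` are hypothetical helper names, and the regex only handles simple `name[extras]==version` lines, leaving comments and pip options untouched.

```python
import re

# Matches a project name at the start of a line followed by an extras
# marker such as "[cpu]". Only the marker right after the name is removed.
_NAME_WITH_EXTRAS = re.compile(r"^(\s*[A-Za-z0-9][A-Za-z0-9._-]*)\s*\[[^\]]*\]")


def strip_extras(line: str) -> str:
    """Return the requirement line with its extras marker removed, if any."""
    return _NAME_WITH_EXTRAS.sub(r"\1", line, count=1)


def to_constraints(lines):
    """Strip extras from every line so the result is usable with `pip -c`."""
    return [strip_extras(line) for line in lines]


if __name__ == "__main__":
    pinned = ["jax[cpu]==0.3.10", "numpy==1.22.3"]
    for line in to_constraints(pinned):
        print(line)
```

The extras themselves then have to be requested on the install line or in a requirements file, since the constraints file only pins versions.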

bitter-ability-32190

05/05/2022, 1:34 PM

PEX itself allows you to transform to/from `.txt` pip-style files.

bitter-ability-32190

05/05/2022, 1:34 PM

`pex3 lock create` I think

echoing-farmer-15630

05/05/2022, 1:36 PM

Fair and good to know!

hundreds-father-404

05/05/2022, 2:56 PM

The solution for toggling between GPU vs CPU is not super fun 😞 especially if you want to change everything at a global level, rather than per-binary/per-project basis. Naively, this is where "multiple resolves" (aka lockfiles) comes in. You will have two resolves that are identical in every way except for CPU vs GPU. Then, every target that should work with both will look like this:

python_sources(
   resolve=parametrize("python-gpu", "python-cpu"),
)

-- There is another workaround which I honestly might suggest you do...just thought of this one. In your BUILD file, have a `python_requirement` target for the CPU version and a different one for GPU. Comment out whichever one you are not using. You can maintain two lockfiles in a kind of hacky way: update `[python].resolves` to point to `python.cpu.lock` or `python.gpu.lock`, for example. When you want to change, you'll comment out the target you don't want, and update `[python].resolves` in `pants.toml`.

There are some ways we could make that slightly less hacky, like you write a target generator that will inspect the `[python].resolves` option and decide based on that whether to give you the CPU or GPU version. I don't love how hacky this all is, but might honestly be better than multiple resolves. Multiple resolves are super powerful, but do have additional cognitive overhead: https://www.pantsbuild.org/docs/python-third-party-dependencies#multiple-lockfiles
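For reference, the multiple-resolves setup described at the start of this message might look roughly like the sketch below. It is an assumption-laden illustration: the resolve names come from the `parametrize` example above, the lockfile names from this message, the `jax-cpu`/`jax-gpu` target names are hypothetical, and the CUDA extra is the placeholder used earlier in the thread. The `pants.toml` settings are shown as comments because they are TOML rather than BUILD syntax.

```python
# pants.toml (sketch):
#   [python]
#   enable_resolves = true
#   default_resolve = "python-cpu"
#
#   [python.resolves]
#   python-cpu = "python.cpu.lock"
#   python-gpu = "python.gpu.lock"

# BUILD (sketch): one requirement target per resolve.
python_requirement(
    name="jax-cpu",
    requirements=["jax[cpu]==0.3.10"],
    modules=["jax", "jaxlib"],
    resolve="python-cpu",
)

python_requirement(
    name="jax-gpu",
    # Placeholder extra from the thread, not a real jax extra.
    requirements=["jax[mystical_cuda_requirements]==0.3.10"],
    modules=["jax", "jaxlib"],
    resolve="python-gpu",
)

# Source code that should work with either build is generated once per resolve.
python_sources(
    resolve=parametrize("python-gpu", "python-cpu"),
)
```

With `default_resolve` set to the CPU resolve, targets that never mention `resolve` stay on the CPU build, which is close to the "CPU unless declared otherwise" behaviour asked for. The other shared requirements would still have to be present in both resolves.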

echoing-farmer-15630

05/05/2022, 4:05 PM

Interesting. Multiple resolves ARE super powerful. I actually don't need a "global switch" -- what I'm wanting to do is have a "global default except for this dockerfile which uses the GPU version" idea. Since I'm not actually generating the Dockerfiles through Pants (ie not using the pex mechanism) I'm probably going to brute-force this by having a line in the dockerfile which just installs the GPU version, which will work if we need it--good to know what other options are available. What I was trying to avoid is "two different jax target definitions and now every time something needs jax you need to specify which one"

✅ 2

hundreds-father-404

05/05/2022, 4:07 PM

"two different jax target definitions and now every time something needs jax you need to specify which one"

That is a major benefit of multiple resolves: no more "ambiguous dependency" warnings! Pants only infers deps on things that share the same resolve. But yeah, users would need to mark whether something is `resolve=parametrize("cpu", "gpu")` vs just one of those two, which is a pain. (My priority this week is improving error message for that, at least)
