# general
p
Hi all, I'm trying to work with some third-party Python dependencies that provide sdists but cannot be built in isolation, as they require undeclared build dependencies. The main problem is with detectron2, which requires some `torch` version to be installed in the user's venv in order to link against it. Besides radical solutions like patching the sdist, is there a way to tell Pants to inject a dependency into the build environment of a package? At runtime, anything missing can be specified using `python_requirement(..., dependencies=[...])` (which is a huge help due to other problematic packages!), but there is no `build_dependencies` or similar argument where I could specify `":requirements#torch"`. Does any solution/workaround come to mind?
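For context, a minimal sketch of the runtime-side injection in a BUILD file (target names here are illustrative, assuming a `python_requirements(name="requirements")` macro elsewhere):

```python
# BUILD file sketch: torch injected as an extra *runtime* dependency of
# detectron2 -- there is no analogous field for *build*-time dependencies.
python_requirement(
    name="detectron2",
    requirements=["detectron2>=0.6"],
    dependencies=[":requirements#torch"],
)
```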
l
What about creating a venv separately with torch already installed, installing detectron2 there, and just copying the resulting wheel into the repository? You would then need to configure `python-repos.find_links`.
p
That is definitely possible, but each user would have to do it themselves because of other linked libraries, unless we centrally build a `manylinux` wheel for it. However, that's a bit cumbersome and in the general case requires automating such wheel builds. Our current workaround is a small patch to the sdist that adds the build dependency (PEP 517). Even though the patched sdist works for everyone, since it's not yet linked against any libraries, it's still an additional step.
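The patching step can be sketched roughly like this, assuming a gzipped sdist and using `tarfile` to inject a PEP 518 `pyproject.toml` (the `requires` list and all paths are illustrative):

```python
# Sketch of the sdist-patching workaround: unpack the sdist, drop a
# pyproject.toml declaring the missing build dependency into its root,
# and repack it under the same file name.
import tarfile
import tempfile
from pathlib import Path

PYPROJECT = """\
[build-system]
requires = ["setuptools", "wheel", "torch"]
build-backend = "setuptools.build_meta"
"""


def patch_sdist(sdist: Path, out_dir: Path) -> Path:
    """Return a copy of `sdist` with a pyproject.toml injected at its root."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(sdist) as tf:
            tf.extractall(tmp)
        # An sdist unpacks to a single <name>-<version>/ directory.
        (root,) = Path(tmp).iterdir()
        (root / "pyproject.toml").write_text(PYPROJECT)
        patched = out_dir / sdist.name
        with tarfile.open(patched, "w:gz") as tf:
            tf.add(root, arcname=root.name)
    return patched
```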
l
Yes, in that case the patch sounds like the way to go.
p
By the way, I had a similar discussion some time ago with @enough-analyst-54434 about the horovod package, which also has undeclared build dependencies, but causes more subtle errors there. Although it should be the packages' responsibility to declare dependencies correctly, users don't really have much leverage if the maintainers don't accept or merge pull requests, so it would be nice to be able to handle this with an option in Pants.
e
@purple-umbrella-89891 have you considered and rejected forking and leveraging PEP 518 by plopping in a `pyproject.toml` with a `[build-system]` table (`requires = [...]`)? You'd then just switch to a VCS requirement.
You'd maintain a two-line delta fork.
FWIW this could not be done in Pants; it would have to be done in Pex.
The Rust / Cargo world is very used to this! The Python ecosystem is not yet, but it's very useful.
p
You're right that a fork with a `pyproject.toml` would also work. My sdist patch does the same thing, so the two solutions are more or less functionally equivalent. I opted to use sdists with a local version identifier, e.g. `0.6+2c6c380f94`, since detectron2 has not updated its version in years and it's useful to have the git hash of the public repo, rather than the changed hash after applying the patch in the fork.
e
OK, you definitely should fork and send a PR, but unblock yourself by using the VCS requirement until they finally accept the patch.
Win / win, and you're doing the right thing by everybody.
This is the pattern in the Rust / Cargo world. Pants has done it a bunch and the patches eventually get in that ecosystem.
If Facebook wants to be a bad OSS citizen but you just can't quit them this is the path.
p
I'll do that, but I have no hope of the PR getting merged; it takes months to get even an answer on much more important PRs.
e
Yup!
p
Do you have a link to an example of how this pattern works in Cargo / Rust?
By the way, would it be possible to get the pinned version of torch into `--config-settings`? Currently my `requirements.txt` needs to look like this:
```
detectron2 >= 0.6 --config-settings torch_version=1.12.1+cu113
torch == 1.12.1+cu113
```
If I set a range for `torch` and regenerate the lockfile, the versions will mismatch. Could I somehow supply a Pants target there (`:requirements#torch`)?
Otherwise I have to keep it pinned, or lock, update `--config-settings`, and relock to make the versions match.
e
How about this, in Python:
1. Fork the repo and add a `pyproject.toml` with:
```toml
[build-system]
requires = ["torch", "setuptools"]
build-backend = "setuptools.build_meta"
```
2. Now use a requirement like `detectron2 @ git+https://github.com/my-fork/detectron2@abcd1234`.
Yeah, sorry: for the torch req you must deal with managing that yourself.
It looks like they have a loose `>=1.8` req they check and fail fast on.
Just fixed the VCS req above; I'd forgotten the `git+` prefix.
p
It already fails on the import in `setup.py` before that... I've worked around that using a small local build backend that also gets the exact version from `--config-settings`.
e
Up to you. I'd guess just using "torch" without shenanigans will actually work in practice.
p
```python
# A thin PEP 517 wrapper around setuptools' legacy build backend that injects
# the torch build requirement, pinned to the version passed via
# --config-settings torch_version=...
from functools import partial

from setuptools.build_meta import __legacy__ as legacy


def _get_requires(config_settings=None, legacy_func=None):
    if config_settings:
        return ["setuptools>=40.8.0", "wheel", f"torch=={config_settings['torch_version']}"]
    # No config settings: defer to the legacy backend's own requirements.
    return legacy_func()


# PEP 517
build_wheel = legacy.build_wheel
build_sdist = legacy.build_sdist
prepare_metadata_for_build_wheel = legacy.prepare_metadata_for_build_wheel
get_requires_for_build_sdist = partial(_get_requires, legacy_func=legacy.get_requires_for_build_sdist)
get_requires_for_build_wheel = partial(_get_requires, legacy_func=legacy.get_requires_for_build_wheel)

# PEP 660
build_editable = legacy.build_editable
prepare_metadata_for_build_editable = legacy.prepare_metadata_for_build_editable
get_requires_for_build_editable = partial(_get_requires, legacy_func=legacy.get_requires_for_build_editable)
```
(in case anyone wants to experiment until I submit a PR and it is merged)
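A quick standalone check of the dynamic-requires logic (an illustrative copy of `_get_requires` with an inlined fallback; the real wrapper defers to the legacy backend instead):

```python
# Mirrors the wrapper's behavior: with config settings present, pin torch to
# the requested version; the fallback list here is illustrative only.
def get_requires(config_settings=None):
    if config_settings:
        return ["setuptools>=40.8.0", "wheel",
                f"torch=={config_settings['torch_version']}"]
    return ["setuptools>=40.8.0", "wheel"]

print(get_requires({"torch_version": "1.12.1+cu113"}))
# → ['setuptools>=40.8.0', 'wheel', 'torch==1.12.1+cu113']
```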
Thanks again for the really fast help!
e
Do you change that `--config-settings torch_version=1.12.1+cu113` bit of `requirements.txt` often? If not, you could kill it, kill all the shenanigans in the custom build backend wrapper above, and just say `requires = ["torch==1.12.1+cu113", "setuptools"]` and be done with it.
p
It changes every now and then. Optimally it would not be pinned at all and would instead be specified by the lockfile, so it could potentially change every time the lockfile is regenerated.
e
Yeah, right now VCS requirements work in lockfiles, but pinning build requirements - VCS or normal sdist - does not.
IIUC no locker in the Python ecosystem handles this today.
p
I know :)
That one I hadn't seen, I just subscribed
a
I would like to shill for a tool I wrote to pin build requirements: https://github.com/zmanji/reproducible-wheel-builder
e
@purple-umbrella-89891 you should pressure @ancient-breakfast-45303 to take on https://github.com/pantsbuild/pex/issues/2100
@ancient-breakfast-45303 if I ship `--par` by EOD, will you do it?
OSS betting
p
Wow, that fits really well. I'm still trying to understand the details of @ancient-breakfast-45303's repo.
a
@enough-analyst-54434 alas, I have very little time this summer for that kind of coding.
@purple-umbrella-89891 it uses Pex to create a venv, which is the build environment. Then it uses the "build" project from PyPA to call setuptools etc. to make the wheel. The lockfile is used to build the venv.
"build" is what pip uses under the hood to read the pyproject file and make the wheel.
p
Yup, I got it now. Nice and compact