When building a wheel using `python_distribution` ...
# general
p
When building a wheel using `python_distribution`, is it expected that the created wheel by default only requires (depends on) things that are directly imported?
The use case: we have a Python file importing `pandera`, which in turn depends on `pydantic`. The requirements file looks like this:
```
pandera==0.11.0
pydantic==1.10.13
```
Metadata of the created wheel:
```
Metadata-Version: 2.1
Name: main
Version: 1.0.0
Author: Pantsbuild
Requires-Python: >=3.8.0
Requires-Dist: pandera (==0.11.0)
```
Normally I wouldn't include pydantic at all, but we want to pin it to 1.x instead of the 2.x that gets installed by default, even though we never use pydantic directly in our codebase.
So it looks like I have to manually specify pydantic as a dependency of `python_distribution`, otherwise it's not automatically included in the created wheel.
```python
python_distribution(
    name="wheel",
    dependencies=[
        "src/main.py",
    ],
    sdist=False,
    wheel=True,
    provides=python_artifact(
        name="main",
        author="Pantsbuild",
        python_requires=">=3.8.0",
        version="1.0.0",
    ),
)
```
This is what I have in the BUILD file so far, and main.py looks like this:
```python
import pandera as pa

def main():
    print("Hello World!")

if __name__ == "__main__":
    main()
```
Installing `pandera==0.11.0` in a fresh venv produces this:
```
$ pip list
Package           Version
----------------- ------------
annotated-types   0.6.0
mypy-extensions   1.0.0
numpy             1.26.3
packaging         23.2
pandas            2.1.4
pandera           0.11.0
pip               23.0.1
pyarrow           14.0.2
pydantic          2.5.3
pydantic_core     2.14.6
python-dateutil   2.8.2
pytz              2023.3.post1
setuptools        66.1.1
six               1.16.0
typing_extensions 4.9.0
typing-inspect    0.9.0
tzdata            2023.4
wrapt             1.16.0
```
So clearly pydantic is installed with pandera, but Pants has no way of figuring this out if it's not imported directly, correct?
Thanks a lot for the clarification 🙂
s
You want to pin the transitive dependency in `install_requires` of your Python distribution, but you don't actually import or use it directly, right?
p
Yes, we only use pandera directly; it requires pydantic, which is never called directly in our code, but I want to pin the pydantic version.
s
Ok, so why do you want to control it on the package distribution side? Usually this is controlled on the library consumer side with something like constraints.txt.
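For example, a rough sketch (the file path and library name here are just placeholders):
```
# constraints.txt -- pins the transitive dep without making it a direct requirement
pydantic<2
```
and then install with something like `pip install my-util-library -c constraints.txt`.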
If you don't really need a Python distribution, but just want to run your code built by Pants, then take a look at the `pex_binary` target: https://www.pantsbuild.org/docs/reference-pex_binary It will only use dependencies from your Pants lockfile, which will guarantee `pydantic<2`.
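A minimal sketch of what that could look like, assuming the entry point is the same src/main.py (the target name is made up):
```python
# src/BUILD -- a sketch, not verified against your repo
pex_binary(
    name="main-bin",
    entry_point="main.py",
    # dependencies are inferred from main.py's imports and resolved
    # against the lockfile, so pydantic stays at the pinned 1.x
)
```
Then something like `pants package src:main-bin` (goal spelling depends on your Pants version) builds a self-contained PEX.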
p
Long story short: I want to build a single wheel with utils for our data science team, installed on Databricks. PEX is not an option, even though it would solve this... We use mlflow and pandera. Both seem to use pydantic and neither specifies a version constraint, but mlflow fails when working with pydantic 2.x, which is why I wanted to pin it. I did that in the requirements, only to find out that Pants doesn't translate that requirement into the wheel's dependencies automatically.
So that's really something I shouldn't have to fix in the first place; the real issue is somewhere between the Databricks runtime, mlflow, and pydantic 🙂
s
Does mlflow declare pydantic<2 in its dependencies?
p
No, that's the thing, it happily installs 2.x 🙂
s
Then you can try to add it manually when you install your library. Something like this in requirements.txt:
```
my-util-library==0.10.3
pydantic<2
```
Then install with `pip install -r requirements.txt`.
p
Anyway, the only question I had was whether there's a way for Pants to also include transitive dependencies of 3rd-party packages in the wheel's dependencies, as it does in the lockfile, for example.
> Then you can try to add it manually when you install your library
I added it to the dependencies in the BUILD file for the `python_distribution` target and that solves it, but I wondered if there's a better way; it may not be the only pinned requirement to add there.
s
You can add something like this to your `python_distribution` to set the dependency manually:
```python
python_distribution(
    ...
    dependencies=["requirements-target:pydantic"],
)
```
(Replace the requirements target with the actual path.) This should add it manually. But again, logically I would add the constraint on the install side.
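For reference, that address would typically point at a target generated from your requirements file, roughly like this (the directory and target names are assumptions, and the exact address format depends on your Pants version):
```python
# 3rdparty/python/BUILD -- a sketch; adjust paths and names to your repo
python_requirements(
    name="reqs",
    source="requirements.txt",  # the file pinning pandera==0.11.0 and pydantic==1.10.13
)
```
so the `python_distribution` would reference something like `dependencies=["3rdparty/python:reqs#pydantic"]` (older Pants versions address generated requirement targets in the `3rdparty/python:pydantic` style).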
p
Thanks!
s
You're welcome! Feel free to ask questions if it still doesn't work