# general
f
Hey all, I’m hoping for some advice re: the new resolves feature. I’m a huge fan, and it will be very useful for us, but I’m running into issues. Our monorepo has a common python_sources target called `src/common:lib` which depends on some 3rd-party python packages, and many other modules which each have their own additional 3rd-party dependencies. These other modules follow a pattern of `python_requirements` -> `python_sources` -> `pex_binary` -> `docker_image` -> `python_tests`. Their `python_sources` all declare a dependency on `src/common:lib`. My intention is to have all the dependent modules inherit the python dependencies of `src/common:lib`, but I can’t do that without multiple resolves. When I try that, I get errors trying to reference those targets. Example:
python_requirements(resolve="align")

python_sources(
    name="lib",
    sources=["src/*.py"],
    dependencies=[
        'src/common:lib',
        '#pandas',
        '#pysam'
    ],
    resolve=parametrize("align", "common")
)


pex_binary(
    name="exe",
    entry_point="src/align.py",
    dependencies=[':lib'],
    shebang="/usr/bin/env python3",
    platforms=[
        "current",
        "manylinux2014_x86_64-cp-39-cp39",
    ],
    resolve=parametrize("align", "common")
)


docker_image(
    name="docker",
    image_tags=["{build_args.GIT_BRANCH}", "{build_args.GIT_COMMIT}", "{build_args.RELEASE_VERSION}"],
    dependencies=[
        "src/common/docker:python_base",
        ":exe",
    ]
)


python_tests(
    name="tests",
    sources=["tests/test_*.py"],
    dependencies=[
        ":lib",
        "src/test_framework:lib",
        ":test_data",
    ],
    runtime_package_dependencies=[":docker_test"],
    tags=["docker_required"],
    resolve=parametrize("align", "common")
)
When I test this, I get:
ValueError: The address `src/stages/align:exe` was not generated by the target `src/stages/align:exe`, which only generated these addresses:

  * src/stages/align:exe@resolve=align
  * src/stages/align:exe@resolve=common

Did you mean to use one of those addresses?
Any idea how to work with this? I tried setting the `:exe` targets to `:exe@resolve=align` and `:exe@resolve=common`, and even tried both at once - but each time they errored out because neither resolve has the full set of dependencies.
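For reference, the error above reflects how `parametrize` changes addressing: the plain address stops existing and one address per value is generated instead. Below is a toy Python model of that naming scheme, using the addresses from the error message. The helper `parametrized_addresses` is hypothetical and is not part of Pants.

```python
from typing import List


def parametrized_addresses(address: str, field: str, values: List[str]) -> List[str]:
    """Hypothetical helper: mimic the `address@field=value` names shown in the error."""
    return [f"{address}@{field}={value}" for value in values]


# The two addresses Pants reported for the parametrized pex_binary.
generated = parametrized_addresses("src/stages/align:exe", "resolve", ["align", "common"])
for generated_address in generated:
    print(generated_address)
```

Under this model the bare `src/stages/align:exe` is not among the generated names, which matches the complaint about the `docker_image` dependency on `:exe`.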
e
Can you provide just a bit more surrounding detail? For "When I test this, I get", can you include the command line you ran and its full output? Ditto for "but each time they errored out because neither resolve has the full set of dependencies": the full command line plus all output?
f
Sure, here’s an example from the pex_binary target:
pex_binary(
    name="exe",
    entry_point="src/align.py",
    dependencies=[
        ":lib",    
    ],
    shebang="/usr/bin/env python3",
    platforms=[
        "current",
        "manylinux2014_x86_64-cp-39-cp39",
    ],
    resolve=parametrize("align", "common")
)
./pants package src/stages/align:exe
19:33:09.61 [ERROR] 1 Exception encountered:

  ValueError: The explicit dependency `src/stages/align:lib` of the target at `src/stages/align:exe@resolve=align` does not provide enough address parameters to identify which parametrization of the dependency target should be used.
Target `src/stages/align:lib` can be addressed as:
  * src/stages/align:lib@resolve=align
  * src/stages/align/src/__init__.py:../lib@resolve=align
  * src/stages/align/src/align.py:../lib@resolve=align
  * src/stages/align/src/bowtie2.py:../lib@resolve=align
  * src/stages/align/src/fastp.py:../lib@resolve=align
  * src/stages/align:lib@resolve=common
  * src/stages/align/src/__init__.py:../lib@resolve=common
  * src/stages/align/src/align.py:../lib@resolve=common
  * src/stages/align/src/bowtie2.py:../lib@resolve=common
  * src/stages/align/src/fastp.py:../lib@resolve=common
When I try adding the resolves:
pex_binary(
    name="exe",
    entry_point="src/align.py",
    dependencies=[
        ':lib@resolve=align',
        ':lib@resolve=common'
    ],
    shebang="/usr/bin/env python3",
    platforms=[
        "current",
        "manylinux2014_x86_64-cp-39-cp39",
    ],
    resolve=parametrize("align", "common")
)
./pants package src/stages/align:exe
19:30:08.59 [ERROR] 1 Exception encountered:

  NoCompatibleResolveException: The target src/stages/align:exe@resolve=common uses the `resolve` `common`, but some of its dependencies are not compatible with that resolve:

  * src/stages/align/src/__init__.py:../lib@resolve=align (align)
  * src/stages/align/src/align.py:../lib@resolve=align (align)
  * src/stages/align/src/bowtie2.py:../lib@resolve=align (align)
  * src/stages/align/src/fastp.py:../lib@resolve=align (align)
  * src/stages/align#pandas (align)
  * src/stages/align#pysam (align)

All dependencies must work with the same `resolve`. To fix this, either change the `resolve=` field on those dependencies to `common`, or change the `resolve=` of the target src/stages/align:exe@resolve=common. If those dependencies should work with multiple resolves, use the `parametrize` mechanism with the `resolve=` field or manually create multiple targets for the same entity.

For more information, see https://www.pantsbuild.org/v2.12/docs/python-third-party-dependencies#multiple-lockfiles.
Similarly, if I limit it to just one resolve or the other, I get the same error messages with different sets of dependencies
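The rule behind `NoCompatibleResolveException` is that a target and everything it depends on must share one resolve. A minimal Python sketch of that check, fed with a subset of the addresses from the output above, is shown here. The function `incompatible_dependencies` is a hypothetical illustration and not Pants code.

```python
from typing import Dict, List


def incompatible_dependencies(target_resolve: str, dep_resolves: Dict[str, str]) -> List[str]:
    """Return (sorted) the dependency addresses whose resolve differs from the target's."""
    return sorted(
        address for address, resolve in dep_resolves.items() if resolve != target_resolve
    )


# Dependencies of the pex_binary when both `:lib` parametrizations are listed.
dep_resolves = {
    "src/stages/align:lib@resolve=align": "align",
    "src/stages/align:lib@resolve=common": "common",
    "src/stages/align#pandas": "align",
    "src/stages/align#pysam": "align",
}

# Neither parametrization of the binary is satisfied by this dependency list.
bad_for_common = incompatible_dependencies("common", dep_resolves)
bad_for_align = incompatible_dependencies("align", dep_resolves)
print(bad_for_common)
print(bad_for_align)
```

Because the dependency list mixes both resolves, each parametrization of `:exe` always finds at least one mismatched dependency, whichever resolve it uses.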
e
Why is the `pex_binary` parametrized? Generally a root node like that will just have 1 resolve. It's things like shared libs (common) that will (need to) work with multiple different roots, each with potentially different single resolves, depending on them.
f
Ah, I misunderstood - so the common library should be parametrized for every resolve that wants to depend on it.
I will give that a try. Thank you!
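A sketch of how the BUILD files might look after that change, assuming `src/common` has its own `python_requirements` target (the thread does not show that file) and that the `align` lockfile is regenerated afterwards. Field values are taken from the earlier snippets; anything else is illustrative.

```python
# src/common/BUILD (assumed contents): the shared library and its
# requirements are parametrized for every resolve that consumes them.
python_requirements(resolve=parametrize("align", "common"))

python_sources(
    name="lib",
    resolve=parametrize("align", "common"),
)

# src/stages/align/BUILD: root targets use exactly one resolve.
python_requirements(resolve="align")

python_sources(
    name="lib",
    sources=["src/*.py"],
    # Pick the parametrization of the shared library that matches this resolve.
    dependencies=["src/common:lib@resolve=align"],
    resolve="align",
)

pex_binary(
    name="exe",
    entry_point="src/align.py",
    dependencies=[":lib"],
    resolve="align",
)
```

With a single resolve on `pex_binary`, the plain `:exe` address exists again, so the `docker_image` dependency on `:exe` should no longer need an `@resolve=` suffix.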
e
Yeah. One way to think of this is a resolve == a lockfile. Generally only binaries use lockfiles and libraries do not (they need to support a range to be useful to many consumers).
When I say generally, I mean outside Pants and across languages. It's a general piece of common advice about when to check in a lock file.
f
Would you be open to a feature request along the lines of `resolve=None`, meaning “I don’t use resolves, but if another target depends on me and it does use resolves, include my dependencies in that resolve”?
I only ask because this approach seems like it flips the dependency model on its head - my understanding is that before this, targets didn’t have to know anything about the targets that depend on them
e
I don't know enough about the design of multiple resolves to say whether that feature request makes sense, or whether it is already covered by some feature or configuration I'm ignorant of. @witty-crayon-22786 or @hundreds-father-404 should be able to give more concrete feedback on that after the 4th (they're both observing a US holiday).
Ideally you'd only label conflicting requirements with different resolve names, and then, also ideally, you'd only label targets that depend on the conflicting requirements; the rest would be calculated, or fail if there is no solution. I'm guessing there were either perf issues with that, or else we took an incremental approach (from the dev side) and did the simplest thing first, hoping to make it more ergonomic later.
h
Hi @few-arm-93065, maybe this is a helpful way to think about it: if some library of yours consumes third-party code, it has to declare which version(s) of that third-party code it's compatible with. That's what the resolves are. Targets don't know anything about the targets that depend on them - your intuition is correct on that score. But targets do need to say "my code is compatible with these versions of this external requirement". That is a property of its own code, not of any of its dependees.
Does that make sense?
e
I think to his point, in a different world it would make sense that if libA depends on requests which is declared singly as requests>2, then there is no ambiguity and Pants should be able to figure out the resolve.
Right now there is a ton of boilerplate.
It's not obvious (to me) there needs to be.
I'm pretty sure this comes down to performance: it is either infeasible or at least a great deal of effort to make it work reasonably fast.
h
Would you be open to a feature request along the lines of resolve=None, meaning “I don’t use resolves, but if another target depends on me and it does use resolves, include my dependencies in that resolve”?
So the tricky thing is that normally the `python_source` target is only used as a dependency and not a "root". But MyPy and linting throw that off. See https://github.com/pantsbuild/pants/issues/12714 for the backstory on that.
e
I think the problem is that something like https://github.com/pantsbuild/pants/issues/12714#issuecomment-909651819 doesn't exist yet to reduce boilerplate.
h
sort of -- there's `[python].default_resolve`, and then Andreas recently added a flagship feature `__defaults__()` that lets you set the default for a whole subtree etc https://github.com/pantsbuild/pants/pull/15836
e
But the crux is the field itself is not plural. We went a different path with parametrize. So it seems that reducing boilerplate currently requires a macro.
Maybe `__defaults__` accommodates `parametrize(...)` values? That would work if so.
f
I’m in over my head here, just wanted to say thank you for considering it, that would be very useful
w
yea, `__defaults__` + `parametrize` should allow you to say “all of the things in these subdirectories need to build with multiple resolves”. as mentioned, that’s not currently possible with the `[python].default_resolve` option.
as Eric said: things like `mypy` are why even libraries need to know what they will eventually run with
f
Just to reinforce something John said earlier, a declaration of a 3rd party dependency is very different from a lockfile. Before the resolves feature, pants and mypy already knew how to resolve dependencies and typecheck against them without a lockfile. The same codepath should be available for libraries that have dependencies but don’t use resolves, correct?
w
One way or another, there is at least one "resolve", even when the `resolves` feature is disabled (that's maybe not the best name for the flag). And yes: dependency inference works the same way with one resolve (with or without multiple resolves enabled).
If you have multiple resolves declared, each target needs to know which resolve(s) it should use (one, both). But it can't use "neither" resolve: it must get its dependencies from somewhere.
There is an existing way to set the "default" resolve, and it works if you want the default to be exactly one resolve. If you want code to default to using both resolves, you'd need the new `__defaults__` feature.
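A rough sketch of what that could look like in a BUILD file, assuming a Pants version that ships `__defaults__` and accepts `parametrize` values there. The thread only says this should work, and the file location below is hypothetical.

```python
# src/common/BUILD (hypothetical location)
# Assumption: `__defaults__` accepts `parametrize(...)` as a field value.
__defaults__({
    python_sources: dict(resolve=parametrize("align", "common")),
    python_tests: dict(resolve=parametrize("align", "common")),
})

# Targets of those types in this directory and below would then default
# to both resolves without repeating the field on every target.
python_sources(name="lib")
```

A single default resolve for the whole repo would instead go through the `[python].default_resolve` option mentioned above, which accepts only one resolve name.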