# general
r
Suppose I have a shared library called `shared` that's a dependency for two projects called `A` and `B`. The downstream dependent projects `A` and `B` also have conflicting third-party dependencies (for the sake of this example, suppose `A` depends on `pandas<2.0`, `B` depends on `pandas>=2.0`, and `shared` does not use `pandas` at all). This means that `A` and `B` must use different resolves (let's say those are called `resolve_A` and `resolve_B`) and `shared` must use multiple resolves with `resolve=parametrize('resolve_A', 'resolve_B')`. The `python_sources` target should wind up looking something like:
```python
python_sources(
    name="lib",
    sources=["**/*.py"],
    resolve=parametrize("resolve_A", "resolve_B")
)
```
Now let's say `shared` should also be distributed as a `python_distribution`. `resolve` is not a parameter for the `python_distribution`, and the `python_sources` require a `resolve` value. What should this be? Should it look something like this?
```python
python_distribution(
    name="dist",
    dependencies=[
        ":lib@resolve=resolve_A",
        ":lib@resolve=resolve_B"
    ]
)
```
That seems to work for our purposes, but it's strange to duplicate the same item in `dependencies` for different resolves.
c
the python_sources use the `resolve=parametrize()` as you showed, it's the dependency from the distribution that needs to depend on one of them, so you can do that explicitly (not sure if there is a better way)
r
Sorry, I hit enter prematurely earlier.
c
yes, but I would only use one resolve in there
me too, I was about to write an example as well 😛
in the deps for the dist
r
That works unless `A` and `B` also have `python_distribution` targets.
c
oh.. hmm
r
For example, say `shared` has this `BUILD` file:
```python
python_sources(
    name="lib",
    sources=["**/*.py"],
    resolve=parametrize("resolve_A", "resolve_B")
)

python_distribution(
    name="dist",
    dependencies=[
        ":lib@resolve=resolve_B"
    ]
)
```
`A` has this `BUILD` file:
```python
python_sources(
    name="lib",
    sources=["**/*.py"],
    resolve="resolve_A"
)

python_distribution(
    name="dist",
    dependencies=[":lib"]
)
```
Building the package for `A` fails with an error saying that there is no distribution that owns `shared/__init__.py@resolve=resolve_A`, which is true: there is only one for `resolve=resolve_B`.
c
maybe this is a new unresolved case? (not sure) it’s out of my area of knowledge..
I guess what you’d have to do is have two dist targets, one for each resolve of shared
r
There’s more context in this (very long) thread
c
think there may be a ticket covering this.. /searching…
r
This seems to work
```python
python_distribution(
    name="dist",
    dependencies=[
        ":lib@resolve=resolve_A",
        ":lib@resolve=resolve_B"
    ]
)
```
but it seems weird and may lead to other problems
c
yea, I think also the requirements for such a dist would be kind of broken. does it not work to split it into two distinct targets, one for each: `dist-A`, `dist-B`?
r
What should be the name passed in `provides`? Should each one get a different name?
When you run `./pants publish`, will that publish 1 package or many?
c
it should publish two, so when you want to install it you have a dependency to the correct one
related issue, that would error in your case with a single dist for both resolves if implemented https://github.com/pantsbuild/pants/issues/14322
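To make the two-target suggestion concrete, here is a rough sketch of what the `shared` `BUILD` file might look like with one distribution per resolve. The artifact names `shared-resolve-a` / `shared-resolve-b` and the version are made up for illustration, and this has not been tested against the toy case in this thread:

```python
# src/shared/BUILD (sketch; artifact names and version are hypothetical)
python_sources(
    name="lib",
    sources=["**/*.py"],
    resolve=parametrize("resolve_A", "resolve_B"),
)

# One distribution per parametrized copy of :lib, each published under its own name.
python_distribution(
    name="dist-A",
    dependencies=[":lib@resolve=resolve_A"],
    provides=python_artifact(name="shared-resolve-a", version="0.1.0"),
)

python_distribution(
    name="dist-B",
    dependencies=[":lib@resolve=resolve_B"],
    provides=python_artifact(name="shared-resolve-b", version="0.1.0"),
)
```

With this layout each downstream distribution would end up requiring the artifact that matches its own resolve, which is also why the number of published packages grows with the number of resolves.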
r
I realize in this example there is only A and B, but in practice we have 5+ downstream dependents. Maintaining 5+ versions of shared with different names is definitely not what we want.
c
so, maybe have a “union resolve” that just loosely captures the 3rdparty libs that satisfies the constraints of both resolves A and B that you may use for the shared dist
and the shared code uses all three
doesn’t really have the right feeling to it though.. I hope there’ll be a better suggestion coming in for you 😉
r
But the reason we have different resolves is because of conflicting dependencies. In this example, I intended to imply that `A` has a dependency on `pandas<2.0`, `B` has a dependency on `pandas>=2.0`, and `shared` does not use `pandas` at all. If `shared` were a 3rd-party dependency, then this would not be a problem because `A` and `B` can independently have a dependency on `shared`. … which I suppose is another workaround: treat `shared` like a 3rd-party dependency, but that would require deploying `shared` and regenerating a lockfile every time it's changed.
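For reference, a minimal sketch of the "treat `shared` like a 3rd-party dependency" workaround, assuming `shared` has already been published somewhere the resolves can see it. The file location and the version pin are hypothetical:

```python
# 3rdparty/python/BUILD (sketch; the path and the version pin are made up)
python_requirement(
    name="shared",
    requirements=["shared==0.1.0"],
    # Make the published package available in both resolves.
    resolve=parametrize("resolve_A", "resolve_B"),
)
```

Every change to `shared` would then mean publishing a new version, bumping the pin, and regenerating the lockfiles for both resolves, which is the overhead mentioned above.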
c
oh, I missed the part that shared didn’t use the conflicting deps..
if you’re not publishing shared, why do you need a python dist for it?
r
Yeah, I’ve updated the OP multiple times after prematurely hitting send the first time 🤦‍♂️
c
I’ll re-read again 😉
r
No, we are publishing shared
We are publishing shared, A and B
c
well that’s kind of a hybrid situation with shared being both first party code and third party. I think this would become easier if we had a small demo repo showing what you’re trying to do that we could work with to find a good solution (at least for me, as I don’t see a good answer straight up).
h
Hmm, the way this should work, I think, is that a python_distribution can only depend on code in a single resolve, because that distribution can only have one set of requirements in its metadata.
What could it mean to have a `python_distribution` be part of multiple resolves?
r
In this example, A and B have conflicting dependencies, therefore they must have different resolves: resolve_A and resolve_B. Both A and B depend on shared, therefore shared must use parameterized resolve for resolve_A and resolve_B. Shared is also released as a python_distribution. At this point, setting either shared@resolve=resolve_A or shared@resolve=resolve_B as dependencies for the shared python_distribution works. This is what we have right now. However, now we want to release A and B as python distributions too and this is where we’re running into a problem. I can create a toy repo example later tonight.
s
it would be sweet if you could have a resolve with somewhat loose requirements that a shared package and dependencies could use, then provide a constraint file in python_distributions which could dial in exactly what version you want to use for that distribution. then you could have multiple distributions using different versions of the same requirement using the same resolve. but I wouldn't be surprised if there's a technical reason that wouldn't work, I haven't really thought it all the way through
r
Ok, I've created this toy repo to demonstrate the case I described at the top of this thread.
• `src/a:lib` is a `python_sources` target with:
  ◦ Dependency on `python_requirement` target `src/a:pandas` with `requirements=["pandas<1.5"]`
  ◦ Dependency on `src/shared:lib`
  ◦ Resolve `resolve_a`
• `src/b:lib` is a `python_sources` target with:
  ◦ Dependency on `python_requirement` target `src/a:pandas` with `requirements=["pandas>=1.5"]`
  ◦ Dependency on `src/shared:lib`
• `src/shared:lib` is a `python_sources` target with no dependencies
  ◦ Since `src/a:lib` and `src/b:lib` have conflicting `pandas` dependencies, this `python_sources` target uses `resolve=parametrize("resolve_a", "resolve_b")`
• `src/shared:dist` is a `python_distribution` target for `src/shared:lib`

1. At commit fe06c0b, the only `python_distribution` target is `src/shared:dist` and this is successfully built with `./pants package ::`
2. At commit bef3bc1, the `python_distribution` target `src/a:dist` is added. Since `src/shared:dist` has a dependency on `src/shared:lib@resolve=resolve_a` and `src/a:dist` has a dependency on `src/a:lib` in the resolve `resolve_a`, this works. `./pants package ::` will successfully build 2 distributions.
3. At commit 55df11f, the `python_distribution` target `src/b:dist` is added. Now `./pants package ::` fails with the error: `NoOwnerError: No python_distribution target found to own src/shared/__init__.py:lib@resolve=resolve_b.`
4. At commit 6391a54, `src/shared:dist` has its `dependencies` changed to `[":lib@resolve=resolve_a", ":lib@resolve=resolve_b"]` and now `./pants package ::` successfully builds all 3 distributions.
For what it's worth, duplicating the dependency with multiple resolves fits our needs; it's just very bizarre.
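The behaviour at commits 3 and 4 can be mimicked with a small toy model. This is only a sketch of the idea that every first-party target a distribution pulls in needs some `python_distribution` that lists it; it is not how Pants actually computes ownership, and the `Dist`, `find_owner`, `owners`, and `NEEDS` names are invented for this example:

```python
from dataclasses import dataclass


class NoOwnerError(Exception):
    """No distribution lists the target among its dependencies."""


@dataclass(frozen=True)
class Dist:
    address: str
    dependencies: frozenset  # addresses this distribution depends on directly


def find_owner(target: str, dists: list) -> str:
    """Return the address of the first distribution that depends on `target`."""
    for dist in dists:
        if target in dist.dependencies:
            return dist.address
    raise NoOwnerError(f"No python_distribution target found to own {target}.")


# First-party targets each packaged project pulls in (taken from the toy repo description).
NEEDS = {
    "src/a:dist": ["src/a:lib", "src/shared:lib@resolve=resolve_a"],
    "src/b:dist": ["src/b:lib", "src/shared:lib@resolve=resolve_b"],
}


def owners(shared_deps: set) -> dict:
    """Find an owner for every needed target, given src/shared:dist's dependencies."""
    dists = [
        Dist("src/a:dist", frozenset({"src/a:lib"})),
        Dist("src/b:dist", frozenset({"src/b:lib"})),
        Dist("src/shared:dist", frozenset(shared_deps)),
    ]
    return {t: find_owner(t, dists) for needed in NEEDS.values() for t in needed}
```

Calling `owners({"src/shared:lib@resolve=resolve_a"})` raises `NoOwnerError` for the `resolve_b` copy (the commit 3 situation), while passing both parametrized addresses gives every target an owner (the commit 4 situation).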
c
@happy-kitchen-89482 I think a python_distribution could support multiple resolves, given that the subset of 3rdparty deps used from each resolve is identical.
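Roughly, the condition being described could be checked like this. It is a sketch only: the function names are invented, the pinned versions are made-up sample data, and Pants does not expose anything like this today:

```python
def used_requirements(resolve_pins: dict, used: set) -> dict:
    """Restrict a resolve's pins to the packages the distribution's code uses."""
    return {name: pin for name, pin in resolve_pins.items() if name in used}


def can_share_one_dist(resolves: dict, used: set) -> bool:
    """True when every resolve yields the same pins for the used packages."""
    subsets = [used_requirements(pins, used) for pins in resolves.values()]
    return all(subset == subsets[0] for subset in subsets)


# Hypothetical pins for the two resolves in this thread.
RESOLVES = {
    "resolve_a": {"pandas": "1.4.4", "requests": "2.31.0"},
    "resolve_b": {"pandas": "2.1.0", "requests": "2.31.0"},
}
```

With this sample data, a `shared` that uses no third-party packages (`used=set()`) or only `requests` passes the check, while one that imports `pandas` does not.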
h
Ah, so it could figure that out I suppose. But it's a very special case.
c
yea maybe. I guess if you develop a reusable library and also a project consuming said library in the same repo. ah ok, and also multiple times with different resolves. Kind of edge case, but ought to work ?
h
I will take a closer look, it feels like we should have a better solution
r
I think a python_distribution could support multiple resolves, given that the subset of 3rdparty deps used from each resolve is identical.
In this toy example, the 3rd-party dependencies are the same, but this is not the case in our actual use case. I'm surprised that this is considered an edge case. Doesn't this happen all the time if you have a shared library of simple utilities (e.g. organization-specific path construction logic)?
c
not if your repo is using a single resolve, this is not an issue. I think this may be(come?) more common than we perhaps realize.. but given the fairly young state of multiple resolves not all of these cases are likely to have been discovered yet. Hence we agree it should be properly supported and I may have been wrong calling it an edge case too 😉 🤷
s
yeah I ran into this exact scenario, I solved it by just not publishing my shared code by itself, only when it is included in a consumer library. publishing the shared code seemed like it could be nice and I initially tried to figure out how to support it but ran into the same issue, but it wasn't a requirement in my case. but yeah just chiming in that I also thought it seemed like it could be a relatively common thing people want to do