<@U02M2Q21AJF>: followup re: <multiple resolves>:
# general
@nice-florist-55958: followup re: multiple resolves:
3. [The bad]: If
A@resolve=Y
depends on
B@{resolve=X,resolve=Y}
and
B
explicitly lists the
python_requirement
3rdparty/python:RequirementsOfX#SomeRequirement
in its dependencies, then Pants still complains that it depends on said requirement and has incompatible resolves in play. I thought one of the fixes was to auto-infer that when
Y
is in play, an explicit dependency on a
python_requirement
in
X
would be checked for membership in
Y
so it doesn’t have to itself be parametrized?
no: if target
B
has an explicit dep on a thirdparty target, then the thirdparty target must be in both of the resolves for `B`… we don’t implicitly put targets in multiple resolves for you.
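To make that rule concrete, here is a toy model in plain Python. It is not Pants' implementation; the function name `missing_resolves` is made up for illustration, and it only encodes the statement above: an explicitly listed third-party target has to exist in every resolve that the depending target is parametrized with.

```python
# Toy model only: NOT Pants code, just the rule described in the reply above.
def missing_resolves(dependent_resolves, dependency_resolves):
    """Return the resolves the dependent is in but the explicit dependency is not."""
    return set(dependent_resolves) - set(dependency_resolves)


# B is parametrized over X and Y, but the requirement only lives in X:
# the non-empty result is the gap that leads to the complaint.
print(missing_resolves({"X", "Y"}, {"X"}))

# Once the requirement is also placed in Y, nothing is missing.
print(missing_resolves({"X", "Y"}, {"X", "Y"}))
```

An empty result corresponds to the case where the explicit dependency is fine; anything else names the resolve(s) the requirement still has to be added to.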
Trying to get out of [3], resorted to parametrizing, but then the ambiguous ownership warning kicks in and the dependency inference fails and
B
doesn’t get included in the dependencies unless it is itself explicitly listed in
A
’s dependencies as
B@resolve=Y,dependencies=YDepsAlias
-- and even then, it still complains about having a dependency on
3rdparty/python:RequirementsOfX#SomeRequirement
.
hm. it should not be necessary to
parametrize
the
dependencies
list currently: explicit dependencies should always be filled in based on missing parameters. if you don’t
parametrize
dependencies, then you shouldn’t have ambiguity in this case (because something in a separate resolve cannot be ambiguous)
Tried something just to see - removed the parametrization of
B
’s 3rdparty deps and changed the listed dep of
A
to merely
B@resolve=Y
(hoping this is what you meant by still needing to explicitly list resolves), but no luck, same result as [4]
when you say “same result” you mean that you saw ambiguity? what was the actual ambiguity warning?
While probably sub-optimal, I’m not aware of a way to override a requirement when you give the generator a source file, or to feed multiple files with the duplicate requirements masking priors similar to a dictionary update, etc. which would make such an organization scheme easier to create/manage.
the overrides parameter for
python_requirements
can be used to put some of the requirements in multiple resolves, but yea: using separate requirements files seems like a good place to start.
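A rough BUILD-file sketch of what that suggestion might look like, reusing the `RequirementsOfX` and `SomeRequirement` names from earlier in the thread. The file name `requirements_x.txt` is hypothetical, and whether `parametrize` is accepted inside `overrides` should be confirmed against the docs for the Pants version in use; this is an assumption, not something verified here.

```python
# 3rdparty/python/BUILD (sketch, not tested)
python_requirements(
    name="RequirementsOfX",
    source="requirements_x.txt",  # hypothetical file name
    resolve="X",  # default: every requirement in the file belongs to resolve X
    overrides={
        # Assumption: overrides accepts parametrize, so this one requirement
        # is generated once per resolve and can be depended on from both.
        "SomeRequirement": {"resolve": parametrize("X", "Y")},
    },
)
```

With something like this in place, the explicit dependency in `B` would have a matching target in each of `B`'s resolves, which is the condition described earlier in the thread.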
one other thing that’s very likely to be useful while you’re doing this is to set
[python-infer].unowned_dependency_behavior = "warning"
… it will warn you when one of the dependencies of a target cannot be located, and if it is present in another resolve, it will point that out. we’ll likely enable it by default in
2.14
So if our req:resolve 1:1 makes sense to you and we should not need to parametrize third party reqs in the deps that must be explicitly listed since they are not inferrable, what is the solution here? Is it to list both requirements in
B
dependencies w/o parametrizing and then Pants would limit the actual dependency to the one based on which resolve is in play at build time? I'm confused.
I'll get back to you on your other questions when I can test again…might be a couple days. Maybe I'll spend some more effort reorganizing reqs into resolves in a hierarchical manner; that should fix the problem here, at least when the listed deps in B are factored into all of the resolves it parametrizes.
So maybe just to have an explicit example in mind. Say we have
{R1, R2, R3, R4}
pip requirements in
R_X.txt
and
python_requirements(name="X", source="R_X.txt", resolve="py-X")
located in
./3rdparty
. The requirements generally allow the latest version to be used and have lower bounds to the latest major version.
py-X
is playing the role of
python-default
in practice--pretty much everyone will work with it and use it. Some app that uses a lot of first-party code depends on a lower major version of
R2
. Our previous strategy was to copy
R_X.txt -> R_Y.txt
and change the version of
R2
, as well as modify versions, add, remove, etc. as needed to support this downgraded
R2
. Then we had
python_requirements(name="Y", source="R_Y.txt", resolve="py-Y")
. I think this setup is fundamentally flawed (unless you can point out the solution). While it works fine when you don't need to list any 3rdparty requirements as explicit dependencies, things break down when you do. Going back to the original example,
B
lists
R1
as a requirement explicitly. Addressing it specifically as
3rdparty:X#R1
, parametrizing it, etc. all lead to some sort of failure when resolve
Y
is in play. If it was able to be inferred by Pants to begin with (say if it was explicitly imported), then I think everything just works, because Pants would know
Y
is in play and see if any requirement that fed into
Y
is indeed a
python_requirement
target defined somewhere. That is the behavior I was hoping would happen when the requirement is explicitly listed as a dependency, but I think that is not ever going to be the case. So I have two solutions in mind:
1. Remove the explicit listing of 3rdparty requirements in a source's dependencies, and instead add an explicit import somewhere (even if it's unused in that particular module) so that Pants will use its infer / inverse mapping logic from the resolve
Y
being in play.
2. Restructure the requirements/resolves:
a. Redefine the
X
based requirements as
python_requirements(name="X", source="R_X.txt", resolve=parametrize("py-X", "py-Y"), overrides={"R2": {"resolve": "py-X"}})
b. Redefine the
Y
based requirements in "delta" format:
python_requirements(name="Y", source="R_X_dY.txt", resolve="py-Y")
where
R_X_dY.txt
contains only the downgraded pip requirement for R2 (as well as anything else that needed to be overridden in the original
R_X
)
c. Now
B
can safely list explicitly
3rdparty:X#R1
whether
py-X
or
py-Y
is in play (the resolves it parametrizes).
And if you're curious why I am putting so much effort into this: we have a lot of apps that are based on
Flask~=2.0
and another app that is a job scheduler based on
apache-airflow~=2.0
. Unfortunately, Airflow currently requires `Flask<2.0`, and the nature of the job scheduler is such that it depends on lots of other first-party library code (because the jobs people want to schedule on it do, and they are varied). That put us in this problematic situation. But I think Option 2 is actually cleaner than the original design, so I hope that will work when I try it out.
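For reference, Option 2 written out as BUILD files might look roughly like the sketch below. It is untested: `R1` and `R2` are the placeholder requirement names from the example, the target type and name used for `B` are assumptions, and it is not confirmed here that `overrides` can narrow a parametrized `resolve` back to a single value.

```python
# 3rdparty/BUILD (sketch, not tested)

# Shared requirements: every entry is generated for both resolves, except R2,
# which is kept in py-X only so that py-Y can supply its own version.
python_requirements(
    name="X",
    source="R_X.txt",
    resolve=parametrize("py-X", "py-Y"),
    overrides={"R2": {"resolve": "py-X"}},
)

# "Delta" requirements: only what py-Y needs to differ on (the downgraded R2).
python_requirements(
    name="Y",
    source="R_X_dY.txt",
    resolve="py-Y",
)
```

```python
# BUILD next to B's sources (sketch; python_sources and name="B" are assumptions)
python_sources(
    name="B",
    resolve=parametrize("py-X", "py-Y"),
    # R1 exists in both resolves above, so this explicit dependency should have
    # a match whichever resolve is in play.
    dependencies=["3rdparty:X#R1"],
)
```

If the override in the first target is rejected, a fallback would be to drop `R2` from `R_X.txt` and keep each version in its own small per-resolve file.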