is there an easy way to determine the dependencies...
# general
b
is there an easy way to determine the dependencies from one python_distribution to another python_distribution? Currently
pants dependencies --transitive
shows the file dependencies from the source, but not the dependencies between the dists
g
What do you mean with "dependencies between the dists", specifically?
python_distribution
is primarily used as a packaging/publishing tool (e.g. to wheels) while most actual work happens only with files.
c
pants paths --from=target --to=target
maybe not what was asked for….? 😬
b
so, I've got a python_distribution A' that provides direct source code we'll call A. The code is standalone and can be used as a wheel in several different places. Now say I have a server which uses this code in its own library which we'll call B. The python_distribution created from this we'll call B'. Now when I do a build of B, pants interrogates the source targets and will resolve the .py code from A. When B' is built, this is not reflected in the files being imported into the dependent package, but rather is set as a dependency from B' to A'. So when the wheel for B' is built, it magically has the requires property set to the A' wheel with the appropriate version (exact, compatible, etc).
c
ah, yes. So you’d like to list which in-repo distributions a certain file/dist depends on?
b
yes.
c
good question… 🤔
I don’t know if we have any goal that presents that slice.. huh, interesting. Feels like it would be good to be able to do.
b
I'm in a weird situation where I need to produce an RPM with absolutely guaranteed wheel versions. The development needs to be loosely versioned so that we aren't killing our devs with dependency hell. But the actual package produced must be constrained. I can cheat and do a publish, then download after setting the first party dependency to exact, but that's weird when I have the wheels right there.
and I don't want to package all of the wheel files as some of them aren't relevant to the services we are putting together.
Then there is the whole, how do I generate a constraints.txt file from a .lock. I'm leaning towards creating a pex file and just copying the results from the .deps, but not 100% sold this is the way to go.
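(Editor's sketch of the lock-to-constraints step, not something confirmed in the thread. It assumes the .lock is a Pex-style JSON lockfile whose pins live under `locked_resolves` / `locked_requirements` with `project_name` and `version` keys, and that Pants prepends `//` header lines; check both against your own lockfile. The function name `constraints_from_pex_lock` is hypothetical.)

```python
import json


def constraints_from_pex_lock(lock_text: str) -> list[str]:
    """Return sorted `name==version` pins found in a Pex-style JSON lockfile."""
    # Assumption: header lines starting with `//` are not part of the JSON body.
    body = "\n".join(
        line
        for line in lock_text.splitlines()
        if not line.lstrip().startswith("//")
    )
    lock = json.loads(body)
    pins = set()
    # Assumption: each resolve lists its pins under "locked_requirements".
    for resolve in lock.get("locked_resolves", []):
        for req in resolve.get("locked_requirements", []):
            pins.add(f"{req['project_name']}=={req['version']}")
    return sorted(pins)
```

Writing the returned lines to constraints.txt, one per line, would give pip an exact pin for every locked third-party requirement.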
c
still 🤔 …. (I’ll run off and do some testing of ideas…)
b
if there is a way to have the .pex file follow dependencies of the first party wheels, then my problem is solved. But when I put in the
:dist
target into the dependencies, it only adds the immediate dist and not any of its first party dependencies. It'll create a directory that contains the files, but that's not a wheel and not something that flask supports
all of this is because we have to support an older OS, httpd using mod_wsgi, and flask
c
yea, there’s always all of these legacy setups to consider, isn’t there..
b
🤷 seems like my lot in life
and to make it worse, it's all old data sciences, ml, and ai libraries which are barely packaged in a sane fashion (if at all)
c
ouch, yea that lot.
b
so can you think of a way that we can add the dependencies dynamically to a pex build? If that could be done, that would be ideal. I can put the targets in one at a time, but that'll be quite the maintenance nightmare as this project is well over 20 wheel files, each at differing versions
specifically the python_distributions?
c
how are you building the RPM?
so, I have no good answer to your exact question, unfortunately.. but I think you’d be interested to use this, when available: https://github.com/pantsbuild/pants/pull/19308
b
As of now, I'm not. Marshaling things into the directory so I can generate a spec file is what I'm waiting to do.
I'm really not worried about the rpm part, but more about getting the correct wheels downloaded and properly constrained
but that packaging plugin looks boss 🙂
c
weren’t the wheels also from the monorepo?
b
yes, they are
but each server only needs a subset of the total number of wheels
c
ah, so it’s a deployment knot to solve.. and one way of doing it is with bundling wheels up into server specific bundles. ?
b
yes.
actually, it's the only way I'm allowed to do this.
c
🙂
b
it was done this way around 10 years ago, and our customer insists that we don't change things
very risk averse
c
what inputs do you have that determine which wheels go into each bundle, if we’ll call them that?
b
so far pants has done a really good job of building the server wheel with the correct dependencies. But since we are also using the wheels as utilities elsewhere they need to be published. And since they are published, our 'compatible' first party wheel requirements don't allow us to simply download the server wheel and its dependencies. So when building the wheel, I have to know the full set of dependencies, and the exact artifacts produced in the monorepo to reproduce the build.
err, building the rpm, not the wheel
I've toyed with adding a command line flag to set the first party dependencies to exact and that may be the way to go. But since there isn't a third party flag, I have to do a separate step to marshal up the third party dependencies.
I really want to do this in one step rather than having to make so many gyrations.
c
(got distracted by a production issue.. back now, reading up… 😉 )
ok, I still have a fractured image of what you’re dealing with, but to break it down into hopefully some useful pieces.. let's first address
I have to know […] the exact artifacts produced in the monorepo
pants list --filter-target-type=python_distribution :: > dists.spec
pants --spec-files=dists.spec peek --exclude-defaults > dists.json
jq -r '.[]|"\(.provides.name)--\(.provides.version)"' dists.json
will give you something like:
hello-dist--0.0.1
mypyc_fib--2.3.4
native--2.3.4
dummy-plugin--0.0.0
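(Editor's sketch: the same extraction as the jq filter, done in Python so the result can be written out as exact pins. It relies only on the `provides.name` / `provides.version` fields that the jq filter above reads from the peek JSON; the function name `exact_pins` is hypothetical.)

```python
import json


def exact_pins(peek_json: str) -> list[str]:
    """Turn `pants peek` JSON for python_distribution targets into exact pins."""
    pins = []
    for target in json.loads(peek_json):
        provides = target.get("provides")
        if not provides:
            # Skip targets that do not publish an artifact.
            continue
        pins.append(f"{provides['name']}=={provides['version']}")
    return sorted(pins)
```

For example, `exact_pins(open("dists.json").read())` would yield lines such as `hello-dist==0.0.1`, which could be appended to a constraints file for the first party wheels.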
but if the server wheel gets built properly, isn’t it easiest to extract the required details from that (and I agree, pants should be able to tell you this.. heh)?
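(Editor's sketch of "extract the required details from the wheel": a wheel is a zip archive whose `*.dist-info/METADATA` file lists `Requires-Dist` entries. The helper names and the idea of filtering against the dist names from the peek output are assumptions for illustration, not a Pants feature.)

```python
import re
import zipfile
from email.parser import Parser


def wheel_requires(wheel_path) -> list[str]:
    """Return the raw Requires-Dist entries from a built wheel."""
    with zipfile.ZipFile(wheel_path) as whl:
        metadata_name = next(
            name for name in whl.namelist() if name.endswith(".dist-info/METADATA")
        )
        metadata = Parser().parsestr(whl.read(metadata_name).decode("utf-8"))
    return metadata.get_all("Requires-Dist") or []


def canonical(name: str) -> str:
    """Normalize a project name so `My_Dist` and `my-dist` compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def first_party_requires(requires, first_party_names) -> list[str]:
    """Keep only the requirements whose project name is an in-repo dist."""
    known = {canonical(name) for name in first_party_names}
    kept = []
    for req in requires:
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
        if match and canonical(match.group(0)) in known:
            kept.append(req)
    return kept
```

Repeating this for each first party wheel that turns up would walk the dist-to-dist dependencies transitively.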
b
here's a spitball idea: using your method, combined with env vars in a BUILD file you can do something like:
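# DEPS is a comma-separated list of target addresses
dist_deps_env_val = env("DEPS")
dist_deps = dist_deps_env_val.split(",")

pex_binary(
    dependencies=dist_deps,
)

# quick check that env() and split() behave as expected
myvar = env("MYTEST_DEPS")
print(myvar.split(","))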
where pex dependencies are assigned from your dist_deps
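(Editor's sketch: a plain `split(",")` fails when the variable is unset and returns `[""]` for an empty string, so a small parsing helper may be worth having. The name `parse_deps` is hypothetical, and whether a `def` like this is acceptable inside your BUILD files is an assumption to verify.)

```python
def parse_deps(value):
    """Split a comma-separated address list, tolerating unset or empty input."""
    if not value:
        # Unset (None) or empty variable: no extra dependencies.
        return []
    # Strip whitespace and drop empty items such as a trailing comma leaves.
    return [item.strip() for item in value.split(",") if item.strip()]
```

With this, `dist_deps = parse_deps(env("DEPS"))` would give an empty dependency list when DEPS is not set instead of raising an error.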
c
right.. looks rather janky, but if you get that to work, perhaps file a feature request for the missing information in the meantime? 😄