# development
h
I've been wondering about the following: Today if `python_library` X depends on a `python_distribution` Y then X effectively depends on all of the libraries Y depends on, transitively, as sources, and nothing special happens with Y itself. You could replace it with its direct deps with no change in behavior. But what the author of that dep probably intended is for X to depend on the wheel built from Y. An example of where this is useful is when you have a `python_distribution` that builds native code via a custom `setup.py` (this is now possible! See https://github.com/pantsbuild/pants/pull/12250). AFAICT today we handle this as a special case, only for running tests, via `runtime_package_dependencies`. But it seems like we should do this generically?
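For concreteness, here's roughly what that test-only special case looks like in BUILD files today (target names and version are made up):

```python
# A python_distribution that builds native code via a custom setup.py.
python_distribution(
    name="native-dist",
    dependencies=[":lib"],
    provides=python_artifact(name="example.native", version="0.0.1"),
)

# Today's special case: the tests run against the *built wheel* from
# native-dist, rather than against its transitive sources.
python_tests(
    name="tests",
    runtime_package_dependencies=[":native-dist"],
)
```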
@hundreds-father-404 I can't remember what the thought process was behind `runtime_package_dependencies` vs just `dependencies`
Well, there are two issues here: One is whether traversing deps should (at least by default) not traverse through binary deps. That problem goes away with `runtime_package_dependencies`. But in that case it seems like any target could have `runtime_package_dependencies` and depending on it means depending on the artifact it creates? So at the very least what we do today for tests we should do uniformly?
h
Implementation-wise, it's much easier to do via a separate field. NB that it is a semantic error to include the sources of a `runtime_package_dep`, and you should only include the built artifact. We would have to add lots of special casing to our generic dependencies code to filter like that. UX-wise, I also think it is worth the clarity to users that we are going to do something special here: build an artifact.
> But in that case it seems like any target could have `runtime_package_dependencies` and depending on it means depending on the artifact it creates?
What does it mean for a Pex to depend on a built PEX? Or some Python library code to depend on a built PEX?
a
Does that make `python_distribution`s a black hole for dep inference?
Or just something that needs some kind of special-casing?
e
I had weakly proposed dependencies that spelled out products. Bazel had this but I think it's now deprecated or removed. i.e.:
`dependencies=['a/target:address@whl']`
Where `@`'s would be registered by rules and dependencies on `@`'s would get the corresponding registered rule run against them with the product being a file(set).
Lots of handwaving there - but the idea is a general mechanism instead of an ad-hoc mechanism like we have now for python tests.
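To make the handwaving slightly more concrete, the registration side might look something like this — every name here is invented, and none of this API exists:

```python
# Entirely hypothetical plugin registration: a backend registers the "@whl"
# product suffix and points it at the rule that packages a
# python_distribution into a wheel. A dep like 'a/target:address@whl' would
# then run that rule and contribute the resulting fileset instead of the
# target's sources.
def rules():
    return [
        register_product_suffix("whl", produced_by=package_python_distribution),
    ]
```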
w
my understanding of bazel is that you would affect the “type” of your dep on something via putting it in either the `deps` or `data` list. @average-vr-56795 would know better about whether the syntax still exists.
but yea, the `@` syntax got co-opted into “variants”, which eventually became multiple Params, and the syntax was dropped (although we still reserve the character)
the target API could still implement parsing to flavor deps like that… one of the surviving potential use cases would be multiple types of codegen at the destination.
(things like “which resolve am i using” and “which JDK do i want” are properties that would be set for an entire consuming target, rather than on a dep-by-dep basis, so they don’t really need `BUILD` syntax)
a
I don't think bazel has ever offered that kind of syntax exactly - I know Buck does and calls them flavours. Bazel has a couple of approaches to this - the main one is the separate attribute thing (e.g. `deps` vs `runtime_deps`); another is that you can name explicit files in some contexts (e.g. in a pkg_zip rule you can reference `:foo` or `foo_deploy.jar` and the latter will be a fat jar of transitive deps) but that only really applies when you're depending on a literal file copy, rather than wanting metadata like a classpath. A key difference between Bazel and Pants here is that in Bazel each rule pushes providers up the graph (so the dependency decides in what ways it can be consumed, and the depending rule picks one of them), whereas Pants pulls rule classes up the graph (the depender can decide to do interesting transforms itself)
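To illustrate that second Bazel approach (a sketch from memory, so treat the load path as approximate):

```python
# Bazel BUILD sketch: java_binary implicitly offers a foo_deploy.jar output,
# a fat jar of transitive deps. Naming that file in another rule's srcs
# consumes it as a literal file, with no classpath-style metadata attached.
load("@rules_pkg//:pkg.bzl", "pkg_zip")

java_binary(
    name = "foo",
    srcs = ["Foo.java"],
    main_class = "com.example.Foo",
)

pkg_zip(
    name = "bundle",
    srcs = [":foo_deploy.jar"],  # the fat jar, not the :foo rule itself
)
```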
e
Yes - exactly, `_deploy.jar` - that's what I'm referring to @average-vr-56795. @witty-crayon-22786 the `@` is a distraction - I intend none of the cross meaning with the old `@` - just a disambiguator since I lazily assumed `.` may already be allowed in address names.
a
(FWIW "flavours" got kind of terribly overloaded in Buck - they ended up becoming both "I want this particular kind of output" and also "I want to reconfigure this target" - you would use flavours both to differentiate "I want a .a" vs "I want a .so" but also "I want an x86_64" vs "I want an arm64" - and while those are similar concepts, the combinatorial explosion of them in one "address space" got very confusing (and also inefficient)
w
re: push vs pull in the rule graph… the target API ended up being shaped somewhat similarly to bazel providers and/or the “multiple different output files” of a rule: you request a particular `Field` type for a target, and if it is not declared literally on the target, it can be computed for the target
so from a type perspective, there is already a facility to produce either X or Y for a target… maybe just not syntax to pull on one or the other…?
a
I quite like the idea of modelling a `python_distribution` as something which in some way pretends to be a `python_library` without any dependencies as far as things depending on it are concerned, but retains the dependency information for other contexts where that's needed... I'm not sure what exactly that would look like model-wise these days
w
but also, i think that `Fields` are still conceptually treated as inputs… even when they might be a `PythonSources` field generated from a `ProtobufSources` field (example from HEAD)
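Roughly what that looks like from plugin code today — API names recalled from memory, so treat them as approximate:

```python
from pants.backend.python.target_types import PythonSources
from pants.engine.rules import Get
from pants.engine.target import HydratedSources, HydrateSourcesRequest, Sources, Target


# Helper meant to be awaited from inside an @rule: asking for PythonSources
# succeeds whether the target declares them literally or declares
# ProtobufSources that codegen can turn into Python.
async def python_sources_of(tgt: Target) -> HydratedSources:
    return await Get(
        HydratedSources,
        HydrateSourcesRequest(
            tgt.get(Sources),
            for_sources_types=(PythonSources,),
            enable_codegen=True,
        ),
    )
```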
h
The same as depending on a `files` target containing the built artifact
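i.e. conceptually something like this, with the wheel filename invented:

```python
# A files() target carrying the prebuilt artifact as an opaque file,
# with no Python-specific metadata attached.
files(
    name="prebuilt-wheel",
    sources=["example.native-0.0.1-cp39-abi3-manylinux1_x86_64.whl"],
)
```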
a
Except you still want dep inference, right? Or is that a separate type that would get consumed a separate way?
w
@happy-kitchen-89482: in order for `python_library` consuming `python_distribution` to mean different things in different contexts, we either need new syntax or new attributes. we don’t have the `data` vs `deps` split, so we only have one list. if we had two attributes you’d put the `python_distribution` in your `data` list if you wanted loose files, or in `deps` if you wanted the built artifact.
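As a sketch of that hypothetical two-attribute world (no `data` field exists in Pants today):

```python
# Hypothetical: two different fields express the two meanings of depending
# on a python_distribution.
python_library(
    name="uses-wheel",
    dependencies=[":native-dist"],  # consume the built wheel
)

python_library(
    name="uses-sources",
    data=[":native-dist"],  # hypothetical field: consume the loose files
)
```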
h
I guess we have the dep inference issue today with the special case for tests
I don't think it needs to mean different things in different contexts
I think it can always mean one thing: I depend on the package produced by this package-producing thing.
What other reason could there be to depend on a package-producing thing?
Dep inference is an interesting question (and one we already fail to answer, I think)
a
I think the other reason for not wanting to treat it completely opaquely is things like classpath conflict checking
h
for the motivating use-case (building native code in a custom `setup.py`) we don't have to worry about dep inference creating conflicts
a
So the wheels we're talking about are pure native code, and if they vendor any Python they do some kind of shading-like thing?
h
Or the user accepts that there may be a syspath ambiguity
I mean, it may not be a real problem, they are the exact same .py files
they just exist twice, once in the wheel and once loose, as sources
and are loaded from whichever has priority (presumably the wheel)
in fact we would have to ensure that the wheel takes priority
to avoid weird importlib introspection edge cases
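For example, a quick way to check which copy wins (the module name here is invented):

```python
# Whichever sys.path entry comes first wins, so the wheel's site-packages
# entry must precede the loose-source root for the wheel copy to load.
import importlib.util

spec = importlib.util.find_spec("example_native")
print(spec.origin)  # the file that import would actually load
```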
a
I'm imagining e.g. some packages where there's a pure Python version, but if you're happy to pick up a native dep you can get a more optimised version, and I'm not sure how having two copies of that would go down. But I guess always picking the locally built wheel probably works fine. But I guess it could be cleaner to be able to explicitly notice the conflict. I guess this ends up looking like the `provides` attribute on v1 JVM rules? Tagging the explicit metadata so you can handle the conflict smoothly