Is this error expected? ```Exception: Invalid Get....
# development
h
Is this error expected?
Copy code
Exception: Invalid Get. Because the second argument to `Get(BuiltPackage, PythonDistributionFieldSet, PythonDistributionFieldSet(address=Address(testprojects/src/python/hello/native:dist), provides=<class 'pants.backend.python.target_types.PythonProvidesField'>(alias='provides', address=testprojects/src/python/hello/native:dist, value=<pants.backend.python.macros.python_artifact.PythonArtifact object at 0x107324b38>, default=None)))` is annotated with `@union`, the third argument should be a member of that union. Did you intend to register `UnionRule(PythonDistributionFieldSet, PythonDistributionFieldSet)`? If not, you may be using the wrong type (PythonDistributionFieldSet) for the third argument.
PythonDistributionFieldSet is not an @union, but its superclass (PackageFieldSet) is.
I would have expected to be able to request for a specific member of a union, ignoring the union entirely, no?
h
Yes, unfortunately. Limiation of Python itself: when you subclass a union base like
PackageFieldSet
, you inherit the
@union
stuff. So the engine naively thinks that your subclass is a union base too I don't know how to fix this. In Python, how do you have a superclass have a property but not any subclasses?
This has been a big pain point, it understandably confused @ancient-vegetable-10556 last week
a
I have a vague idea, but it’s really really ugly
Copy code
def union(cls, name: str):
   if cls.__name__ != str:
      return cls
   
   # do actual decoration here
   return cls
^--- this is not actually correct, as
union()
would need to return a decorator function, but the idea is that you give
@union
enough information to figure out if the decorator should bail for subclasses.
👍 1
It definitely makes the
@union
decorator less eggplant though
🍆 1
errrm elegant, rather
🤣 1
h
that's totally fine imo, the decorator is encapsulated. More important is that it works properly than that the implementation is elegant
a
Copy code
@union("PackageFieldSet")
class PackageFieldSet(FieldSet, etc etc etc):
   etc
h
Yeah that's acceptable imo. Unions are not typically defined by plugin authors, more by us so that we can give plugin hooks for people to write plugins
a
@hundreds-father-404 Write a ticket, and I’ll take a stab at it. There’s probably a better solution in there somewhere.
ticket doesn’t need to be detailed, just a reminder of why I’m doing this, and an example of where I could use it
h
not exactly the same thing, but related enough: https://github.com/pantsbuild/pants/issues/10618
(another issue with unions, but separate thing: https://github.com/pantsbuild/pants/issues/11430)
c
Copy code
def union(cls, name: str):
   if cls.__name__ != str:
      return cls
   
   # do actual decoration here
   return
Would this actually work? The decorator is on the base class only, child classes would still inherit the “actual decoration” from the base…
a
Yeah — you can bail out of the decorator before you do the actual decoration setp
so while the decorator is still invoked, you can make it a no-op for the subclasses
👍 1
it’s just making sure that you have enough info to bail out in an informed manner
c
Ah, ok. Will have to look at the code to see where this lives, then. ;)
(On the phone…)
a
I’m having a go at implementing this now, as it’s frustrating enough and my brain is in the right place 🙂
🚀 1
🙌 1
w
also, definitely consider … completely overhauling unions instead!
1
there are many awkward bits.
a
Yeah, this is at least a useful first step
(should it work)
It doesn’t, but I see something that I was previously missing
h
Have sometimes wondered if we could replace unions with subclassing (in many cases the union members do in fact subclass the union class)
a
Yeah, that would certainly have made the thing I was doing that involved unions a bit easier
Interestingly here, I think there may be an actual bug in how we’re using the union decorator
w
Have sometimes wondered if we could replace unions with subclassing…
@enough-analyst-54434 has explored that. but i think that a really important portion to preserve is optional registration. so you wouldn’t actually be able to use the inheritance hierarchy to decide which instances were registered.
(example: in my unit test of one FieldSet subclass, i shouldn’t get a bunch of other FieldSets pulled in)
h
Fair point.
w
https://github.com/pantsbuild/pants/issues/11354 discusses some subclassing benefits… but not from the point of view of registration
h
But this is essentially subtyping, no? If so the word
union
is a little confusing
w
in most cases, yes: the member is a subtype.
but yes: you almost always have a particular interface for the
@union
member (to construct it/call it/etc), so requiring that it is a subtype would be reasonable. it’s just somewhat orthogonal to the registration problem
h
What's the workaround here, meanwhile?
w
fixing the
@union
decorator, probably. @ancient-vegetable-10556: are you still looking at that?
h
Well, I can temporarily work around by requesting the union instead of the subclass, knowing in practice that I will only get instances of the subclass back
a
@witty-crayon-22786 Yeah, I spent some time digging, and it looks like there’s some weird interplay between how
@union
works and how we resolve rules in the graph
c
I find this topic intriguing, will play around with it a bit, see what I learn… (next week)
❤️ 2
a
@curved-television-6568 I’d love to have a chat about it. It seems like there’s some attempt to do weird stuff with the type system that doesn’t quite work
👍 1
c
I’ll reach out to you some day next week, then :)
h
Yeah I don't think registration should be via subclassing. This would break allowing you to write intermediate util classes, as those would now be union members unintentionally I agree with Stu that you should continue to be explicit in registering via UnionRule(Base, Member), rather than some implicit registration based on subclassing But yes, we should still fix the weirdness around unions
w
@ancient-vegetable-10556, @curved-television-6568: apart from the metaclass magic that obscures things, the implementation of unions is surprisingly simple: it boils down to converting a
Get(X, Y, ..)
(where
Y
is a union) into a series of
Get(X, $member, ..)
for the union members… which happens here: https://github.com/pantsbuild/pants/blob/731c855630a9eed6c68fdf6129541dc6a69ec3e1/src/python/pants/engine/internals/scheduler.py#L658-L666
👍 1
a
I think the obscurification is the problem here
w
yea. the
@union
decorator used to literally just poke a
_union
field onto a class (which would still suffer from this bug, but maybe more clearly)… now it’s generalized
a
ha 😄
h
We can still require that the classes you register into the unions must be subtypes of the union class
Just not that all such subtypes are automatically in the union
👍 1
c
I think we can get a neat interface to unions, but it would most likely still make use of meta classes, and as such, the implementation would still be rather obscure for the uninitiated.
To me, it took a few read-throughs before grasping the context. Partly due to the separation of the union type (the base class) and its members (registered (sub)classes).
Is it desired that all union members be subclasses of the union base class, or would it be nice to have less strict requirements here..?
h
Is it desired that all union members be subclasses of the union base class
I don't think it's desired. We'd lose major flexibility, like registering something someone else wrote (pants core) to use a union you defined in your plugin, because you can't change the pants core type to subclass your union
c
Edit: Ah, it was me that misread your reply. I think you misread my question. I was pro more flexibility not less. Based on (misconception?)
We can still require that the classes you register into the unions must be subtypes of the union class
h
sorry was on my phone, updated my response to make more clear that I'm agreeing with your desire 🙂
👍 1
w
i can imagine that usecase, but i’m not sure that we’ve ever encountered it in the wild
in practice, a union member must have “some” interface (either its constructor, a static method, something), else it can’t actually be used. as it stands, that interface is assumed / ducktyped
and if you required the interface, you could make this typesafe: https://github.com/pantsbuild/pants/blob/f8e21285971cfe55481907e24a0837344417ba16/src/python/pants/engine/unions.py#L82-L104 … which would be handy.
i guess that this effectively boils down to a discussion of subtyping vs structural typing…? maybe mypy has support for structural typing?
h
they do via
Protocol
!
w
would we be making a complex thing even more complex if we required a
Protocol
? hmmm!
h
This re interfaces. The union essentially expresses subtyping, there must be some interface, and it might as well be expressed by the union type.
👍 1
w
if
@union
implied
Protocol
, or actually automatically extended it for you, maybe it wouldn’t actually increase complexity that much…
or, hell… if we renamed
@union
to (strawname)
@protocol
or
@protocol_base
or something
a downside though would be that subclassing a @union would be even less a good idea (pretty sure you’re not supposed to subclass protocols).
1
a
you’re not supposed to subclass protocols, but you still get the interface from a typechecking perspective
e
Clearly a mypy plugin could be written to deal with auto-implying / generating, however it works out. So there's always that option. Hard code, but in one spot.
Hard as in difficult, but only for us, not users.
a
(or you can implement the protocol and have a single base implementation that matches it; only slightly ugly)
c
I couldn’t find anything in PEP-544 discouraging subclassing protocols. However, Protocols are only for abstract use, i.e. you can not create instances of them. As such, it should work well with a common union base class Protocol, that you use as base for your own union types. Working on a POC of this… see if it works out, in which case I’ll post a PR upon which we can discuss if it’s any good or not.
e
IIUC the issue here is a @union'd type is the type used by the rule providing a plugin point. So that rule consumes the @union'd type and knows nothing else. Say that unioned type is `@union Cheese`and the rule providing the plugin point expects to be able to call
cheese_instance.get_milkfat_percentage
. Then all registered union members must have a method
get_milkfat_percentage
. That method name and signature is not generalizable in any base protocol. The only way it would be is if we swtiched to having a single catch-all plugin protocol, say
<http://Plugin.do|Plugin.do>(*args, *kwargs)
. That seems likely unworkable and also against one of the plugin API selling points of type-checkability.
c
You would have the
get_milkfat_percentage
in your derived union protocol.. I think this will work. example coming in the PR…
e
So you're just replacing
from typing import Protocol; class MyUnionBase(Protocol): ...
with
from pants.unions import Union; class MyUnionBase(Union):...
? Yeah - I'll check out the PR.
👍 1
c
Have a draft PR setup, but had to go abruptly, will finish up with more details tomorrow..
👍 1
w
fwiw, just found a case where it is 1) useful to not require subclassing, 2) i’d like a concrete type to describe the union (so that its members can be used as fields of some other type, 3) no methods are used. … and yea, Protocol works nicely for that case.
👍 1
c
OK, it’s a start at least. To get the discussion going 🙂 https://github.com/pantsbuild/pants/pull/12577