https://pantsbuild.org/ logo
#development
Title
# development
h

happy-kitchen-89482

08/11/2021, 6:37 PM
Is this error expected?
Copy code
Exception: Invalid Get. Because the second argument to `Get(BuiltPackage, PythonDistributionFieldSet, PythonDistributionFieldSet(address=Address(testprojects/src/python/hello/native:dist), provides=<class 'pants.backend.python.target_types.PythonProvidesField'>(alias='provides', address=testprojects/src/python/hello/native:dist, value=<pants.backend.python.macros.python_artifact.PythonArtifact object at 0x107324b38>, default=None)))` is annotated with `@union`, the third argument should be a member of that union. Did you intend to register `UnionRule(PythonDistributionFieldSet, PythonDistributionFieldSet)`? If not, you may be using the wrong type (PythonDistributionFieldSet) for the third argument.
PythonDistributionFieldSet is not an @union, but its superclass (PackageFieldSet) is.
I would have expected to be able to request for a specific member of a union, ignoring the union entirely, no?
h

hundreds-father-404

08/11/2021, 6:38 PM
Yes, unfortunately. Limiation of Python itself: when you subclass a union base like
PackageFieldSet
, you inherit the
@union
stuff. So the engine naively thinks that your subclass is a union base too I don't know how to fix this. In Python, how do you have a superclass have a property but not any subclasses?
This has been a big pain point, it understandably confused @ancient-vegetable-10556 last week
a

ancient-vegetable-10556

08/11/2021, 6:40 PM
I have a vague idea, but it’s really really ugly
Copy code
def union(cls, name: str):
   if cls.__name__ != str:
      return cls
   
   # do actual decoration here
   return cls
^--- this is not actually correct, as
union()
would need to return a decorator function, but the idea is that you give
@union
enough information to figure out if the decorator should bail for subclasses.
👍 1
It definitely makes the
@union
decorator less eggplant though
🍆 1
errrm elegant, rather
🤣 1
h

hundreds-father-404

08/11/2021, 6:45 PM
that's totally fine imo, the decorator is encapsulated. More important is that it works properly than that the implementation is elegant
a

ancient-vegetable-10556

08/11/2021, 6:45 PM
Copy code
@union("PackageFieldSet")
class PackageFieldSet(FieldSet, etc etc etc):
   etc
h

hundreds-father-404

08/11/2021, 6:46 PM
Yeah that's acceptable imo. Unions are not typically defined by plugin authors, more by us so that we can give plugin hooks for people to write plugins
a

ancient-vegetable-10556

08/11/2021, 6:46 PM
@hundreds-father-404 Write a ticket, and I’ll take a stab at it. There’s probably a better solution in there somewhere.
ticket doesn’t need to be detailed, just a reminder of why I’m doing this, and an example of where I could use it
h

hundreds-father-404

08/11/2021, 6:47 PM
not exactly the same thing, but related enough: https://github.com/pantsbuild/pants/issues/10618
(another issue with unions, but separate thing: https://github.com/pantsbuild/pants/issues/11430)
c

curved-television-6568

08/11/2021, 6:52 PM
Copy code
def union(cls, name: str):
   if cls.__name__ != str:
      return cls
   
   # do actual decoration here
   return
Would this actually work? The decorator is on the base class only, child classes would still inherit the “actual decoration” from the base…
a

ancient-vegetable-10556

08/11/2021, 6:53 PM
Yeah — you can bail out of the decorator before you do the actual decoration setp
so while the decorator is still invoked, you can make it a no-op for the subclasses
👍 1
it’s just making sure that you have enough info to bail out in an informed manner
c

curved-television-6568

08/11/2021, 6:54 PM
Ah, ok. Will have to look at the code to see where this lives, then. ;)
(On the phone…)
a

ancient-vegetable-10556

08/11/2021, 6:55 PM
I’m having a go at implementing this now, as it’s frustrating enough and my brain is in the right place 🙂
🚀 1
🙌 1
w

witty-crayon-22786

08/11/2021, 6:59 PM
also, definitely consider … completely overhauling unions instead!
1
there are many awkward bits.
a

ancient-vegetable-10556

08/11/2021, 6:59 PM
Yeah, this is at least a useful first step
(should it work)
It doesn’t, but I see something that I was previously missing
h

happy-kitchen-89482

08/11/2021, 8:30 PM
Have sometimes wondered if we could replace unions with subclassing (in many cases the union members do in fact subclass the union class)
a

ancient-vegetable-10556

08/11/2021, 8:32 PM
Yeah, that would certainly have made the thing I was doing that involved unions a bit easier
Interestingly here, I think there may be an actual bug in how we’re using the union decorator
w

witty-crayon-22786

08/11/2021, 8:34 PM
Have sometimes wondered if we could replace unions with subclassing…
@enough-analyst-54434 has explored that. but i think that a really important portion to preserve is optional registration. so you wouldn’t actually be able to use the inheritance hierarchy to decide which instances were registered.
(example: in my unit test of one FieldSet subclass, i shouldn’t get a bunch of other FieldSets pulled in)
h

happy-kitchen-89482

08/11/2021, 8:36 PM
Fair point.
w

witty-crayon-22786

08/11/2021, 8:36 PM
https://github.com/pantsbuild/pants/issues/11354 discusses some subclassing benefits… but not from the point of view of registration
h

happy-kitchen-89482

08/11/2021, 8:36 PM
But this is essentially subtyping, no? If so the word
union
is a little confusing
w

witty-crayon-22786

08/11/2021, 8:37 PM
in most cases, yes: the member is a subtype.
but yes: you almost always have a particular interface for the
@union
member (to construct it/call it/etc), so requiring that it is a subtype would be reasonable. it’s just somewhat orthogonal to the registration problem
h

happy-kitchen-89482

08/11/2021, 8:49 PM
What's the workaround here, meanwhile?
w

witty-crayon-22786

08/11/2021, 8:59 PM
fixing the
@union
decorator, probably. @ancient-vegetable-10556: are you still looking at that?
h

happy-kitchen-89482

08/11/2021, 9:05 PM
Well, I can temporarily work around by requesting the union instead of the subclass, knowing in practice that I will only get instances of the subclass back
a

ancient-vegetable-10556

08/11/2021, 9:35 PM
@witty-crayon-22786 Yeah, I spent some time digging, and it looks like there’s some weird interplay between how
@union
works and how we resolve rules in the graph
c

curved-television-6568

08/11/2021, 10:09 PM
I find this topic intriguing, will play around with it a bit, see what I learn… (next week)
❤️ 2
a

ancient-vegetable-10556

08/11/2021, 10:09 PM
@curved-television-6568 I’d love to have a chat about it. It seems like there’s some attempt to do weird stuff with the type system that doesn’t quite work
👍 1
c

curved-television-6568

08/11/2021, 10:10 PM
I’ll reach out to you some day next week, then :)
h

hundreds-father-404

08/11/2021, 10:11 PM
Yeah I don't think registration should be via subclassing. This would break allowing you to write intermediate util classes, as those would now be union members unintentionally I agree with Stu that you should continue to be explicit in registering via UnionRule(Base, Member), rather than some implicit registration based on subclassing But yes, we should still fix the weirdness around unions
w

witty-crayon-22786

08/11/2021, 10:27 PM
@ancient-vegetable-10556, @curved-television-6568: apart from the metaclass magic that obscures things, the implementation of unions is surprisingly simple: it boils down to converting a
Get(X, Y, ..)
(where
Y
is a union) into a series of
Get(X, $member, ..)
for the union members… which happens here: https://github.com/pantsbuild/pants/blob/731c855630a9eed6c68fdf6129541dc6a69ec3e1/src/python/pants/engine/internals/scheduler.py#L658-L666
👍 1
a

ancient-vegetable-10556

08/11/2021, 10:28 PM
I think the obscurification is the problem here
w

witty-crayon-22786

08/11/2021, 10:29 PM
yea. the
@union
decorator used to literally just poke a
_union
field onto a class (which would still suffer from this bug, but maybe more clearly)… now it’s generalized
a

ancient-vegetable-10556

08/11/2021, 10:29 PM
ha 😄
h

happy-kitchen-89482

08/12/2021, 1:31 AM
We can still require that the classes you register into the unions must be subtypes of the union class
Just not that all such subtypes are automatically in the union
👍 1
c

curved-television-6568

08/12/2021, 6:44 AM
I think we can get a neat interface to unions, but it would most likely still make use of meta classes, and as such, the implementation would still be rather obscure for the uninitiated.
To me, it took a few read-throughs before grasping the context. Partly due to the separation of the union type (the base class) and its members (registered (sub)classes).
Is it desired that all union members be subclasses of the union base class, or would it be nice to have less strict requirements here..?
h

hundreds-father-404

08/12/2021, 7:57 AM
Is it desired that all union members be subclasses of the union base class
I don't think it's desired. We'd lose major flexibility, like registering something someone else wrote (pants core) to use a union you defined in your plugin, because you can't change the pants core type to subclass your union
c

curved-television-6568

08/12/2021, 8:13 AM
Edit: Ah, it was me that misread your reply. I think you misread my question. I was pro more flexibility not less. Based on (misconception?)
We can still require that the classes you register into the unions must be subtypes of the union class
h

hundreds-father-404

08/12/2021, 8:15 AM
sorry was on my phone, updated my response to make more clear that I'm agreeing with your desire 🙂
👍 1
w

witty-crayon-22786

08/12/2021, 4:39 PM
i can imagine that usecase, but i’m not sure that we’ve ever encountered it in the wild
in practice, a union member must have “some” interface (either its constructor, a static method, something), else it can’t actually be used. as it stands, that interface is assumed / ducktyped
and if you required the interface, you could make this typesafe: https://github.com/pantsbuild/pants/blob/f8e21285971cfe55481907e24a0837344417ba16/src/python/pants/engine/unions.py#L82-L104 … which would be handy.
i guess that this effectively boils down to a discussion of subtyping vs structural typing…? maybe mypy has support for structural typing?
h

hundreds-father-404

08/12/2021, 4:58 PM
they do via
Protocol
!
w

witty-crayon-22786

08/12/2021, 5:01 PM
would we be making a complex thing even more complex if we required a
Protocol
? hmmm!
h

happy-kitchen-89482

08/13/2021, 4:54 AM
This re interfaces. The union essentially expresses subtyping, there must be some interface, and it might as well be expressed by the union type.
👍 1
w

witty-crayon-22786

08/13/2021, 8:06 PM
if
@union
implied
Protocol
, or actually automatically extended it for you, maybe it wouldn’t actually increase complexity that much…
or, hell… if we renamed
@union
to (strawname)
@protocol
or
@protocol_base
or something
a downside though would be that subclassing a @union would be even less a good idea (pretty sure you’re not supposed to subclass protocols).
1
a

ancient-vegetable-10556

08/13/2021, 8:24 PM
you’re not supposed to subclass protocols, but you still get the interface from a typechecking perspective
e

enough-analyst-54434

08/13/2021, 8:25 PM
Clearly a mypy plugin could be written to deal with auto-implying / generating, however it works out. So there's always that option. Hard code, but in one spot.
Hard as in difficult, but only for us, not users.
a

ancient-vegetable-10556

08/13/2021, 8:25 PM
(or you can implement the protocol and have a single base implementation that matches it; only slightly ugly)
c

curved-television-6568

08/16/2021, 1:12 PM
I couldn’t find anything in PEP-544 discouraging subclassing protocols. However, Protocols are only for abstract use, i.e. you can not create instances of them. As such, it should work well with a common union base class Protocol, that you use as base for your own union types. Working on a POC of this… see if it works out, in which case I’ll post a PR upon which we can discuss if it’s any good or not.
e

enough-analyst-54434

08/16/2021, 1:48 PM
IIUC the issue here is a @union'd type is the type used by the rule providing a plugin point. So that rule consumes the @union'd type and knows nothing else. Say that unioned type is `@union Cheese`and the rule providing the plugin point expects to be able to call
cheese_instance.get_milkfat_percentage
. Then all registered union members must have a method
get_milkfat_percentage
. That method name and signature is not generalizable in any base protocol. The only way it would be is if we swtiched to having a single catch-all plugin protocol, say
<http://Plugin.do|Plugin.do>(*args, *kwargs)
. That seems likely unworkable and also against one of the plugin API selling points of type-checkability.
c

curved-television-6568

08/16/2021, 1:55 PM
You would have the
get_milkfat_percentage
in your derived union protocol.. I think this will work. example coming in the PR…
e

enough-analyst-54434

08/16/2021, 2:03 PM
So you're just replacing
from typing import Protocol; class MyUnionBase(Protocol): ...
with
from pants.unions import Union; class MyUnionBase(Union):...
? Yeah - I'll check out the PR.
👍 1
c

curved-television-6568

08/16/2021, 10:13 PM
Have a draft PR setup, but had to go abruptly, will finish up with more details tomorrow..
👍 1
w

witty-crayon-22786

08/16/2021, 10:44 PM
fwiw, just found a case where it is 1) useful to not require subclassing, 2) i’d like a concrete type to describe the union (so that its members can be used as fields of some other type, 3) no methods are used. … and yea, Protocol works nicely for that case.
👍 1
c

curved-television-6568

08/17/2021, 1:41 PM
OK, it’s a start at least. To get the discussion going 🙂 https://github.com/pantsbuild/pants/pull/12577