witty-crayon-22786
03/17/2020, 5:27 PMwitty-crayon-22786
03/17/2020, 5:27 PMPythonBinaryFields
needs to be a RootRule, 2. that it would need to be a concrete type even if it was going to come in from outside the enginewitty-crayon-22786
03/17/2020, 5:28 PMSelectFields
type, afaict.witty-crayon-22786
03/17/2020, 5:28 PMwitty-crayon-22786
03/17/2020, 5:32 PM@rule
def hydrate_multiple_fields(select_fields: SelectFields) -> SelectedFields:
hydrated_async_fields = await MultiGet(
Get[HydratedField](AsyncField, field)
for field in select_fields
if isinstance(field, AsyncField)
)
fields
for field in select_fields:
if isinstance(field, AsyncField):
hydrated = hydrated_async_fields.pop_first()
elif isisntance(field, SyncField):
hydrated = field.hydrate()
else:
raise Exception("hmmm?")
fields.append(hydrated)
return SelectedFields(fields)
witty-crayon-22786
03/17/2020, 5:33 PMhundreds-father-404
03/17/2020, 5:33 PManything you can do to hydrate a set of fields (even if some of them are not AsyncField) you can due generically in a rule that consumes a SelectFields type, afaict.I’m sure we could find some way to get a
SelectFields
type working, if we tried hard enough, but I don’t understand the motivation to have this
def rule(fields: SelectFields(PythonBinarySources, Compatibility)) -> CreatedBinary
instead of this:
class PythonBinaryFields(FieldsCombination):
required_fields = (EntryPoint, PythonBinarySources)
def rule(fields: PythonBinaryFields) -> CreatedBinary
Is it for less boilerplate to not need to create a new class?
--
Also, with SelectFields
, how do you express optional_fields
and opt_out_field
? (See https://github.com/pantsbuild/pants/pull/9316#issuecomment-600154783)witty-crayon-22786
03/17/2020, 5:35 PMwitty-crayon-22786
03/17/2020, 5:36 PMwitty-crayon-22786
03/17/2020, 5:37 PMhundreds-father-404
03/17/2020, 5:40 PMyes. afaict it is boilerplateOkay, so the reason for
SelectFields
is less boilerplate and it sounds like a conceptual argument that we should not have a distinct combination type? What again is the reason that conceptually it is wrong to have a new type registered? (Not boilerplate, I mean conceptually)
Are there other motivations for SelectFields
?witty-crayon-22786
03/17/2020, 5:41 PMWhat again is the reason that conceptually it is wrong to have a new type registered?if you used a concrete subclass of SelectFields instead, and you had two of them with completely identical contents (fields X, Y, Z for address A), you would not want to hydrate them twice. they are identical
witty-crayon-22786
03/17/2020, 5:42 PMwitty-crayon-22786
03/17/2020, 5:42 PMwitty-crayon-22786
03/17/2020, 5:43 PMSelectFields
is shorthand for the caller to avoid needing to hydrate the fields one by one themselveswitty-crayon-22786
03/17/2020, 5:44 PMhundreds-father-404
03/17/2020, 5:53 PMafaict, SelectFields is shorthand for the caller to avoid needing to hydrate the fields one by one themselvesTaking a step back, I don’t think hydration of `AsyncField`s is the right thing to optimize for. 95% of fields are
PrimitiveFields
and simply accessed by calling compatibility.value
. The only fields that need hydration are Sources
, Dependencies
, and Bundles
. (Keep in mind, we’ve been writing targets for 5 years so while we will likely add a couple more `AsyncField`s the next few years, we do have a good intuition that the vast majority of fields are PrimitiveFields
.)
Further, per https://github.com/pantsbuild/pants/pull/9306#issuecomment-599694654, we don’t even expect any rules to directly need to hydrate the Sources
field. They instead will call await Get[SourceFiles](AllSourceFilesRequest)
or await Get[SourceFiles](SpecifiedSourceFilesRequest)
Which speaks to another point, I don’t think we want the rule’s signature to give us the already hydrated values. We only want the collection of Field
instances, for a couple reasons:
1. It’s essential that we are dealing with the typed Field
for MyPy to work. There would be no way with `SelectField`s for MyPy to know that SelectFields(Compatbility, PythonSources)
gives us back Tuple[Tuple[str, ...], SourcesResult]
. We could have MyPy infer that it’s Tuple[Compatibility, PythonSources]
if we didn’t do any hydration within SelectFields
, but then we end up needing to do all hydration in the rule body, as before.
2. We don’t know how that typed Field
is going to be consumed. For example, we need to pass the Field
instance to await Get[SourceFiles](SpecifiedSourceFilesRequest((sources_field, origin))
witty-crayon-22786
03/17/2020, 5:58 PMwitty-crayon-22786
03/17/2020, 6:00 PMhundreds-father-404
03/17/2020, 6:05 PMIt feels pretty unclear at this point how and how frequently “compound field hydration” will be used, so maybe it isn’t a good time for that abstraction?Gotta run to a meeting, but keep in mind that we aren’t formalizing any abstraction. The PR very intentionally doesn’t yet add a superclass because I agree that we’re not yet ready to commit to a particular implementation. In contrast, afaict,
SelectFields
would be a new abstraction we’re commiting to.
Maybe we do at the end of the day use SelectFields
. This PR doesn’t at all block that - we would then have a diff to see with the actual code “which do we prefer?”witty-crayon-22786
03/17/2020, 6:07 PMwitty-crayon-22786
03/17/2020, 6:07 PMwitty-crayon-22786
03/17/2020, 6:07 PMhundreds-father-404
03/17/2020, 6:08 PMi’m currently leaning toward waiting until we’ve repeated ourselves a few times, and then introducing a feature like this that clearly removes steps/boilerplate/complexityAgreed. This is the approach I’m trying to take, hence this PR that does the simplest thing possible (using a plain and simple dataclass). We can clean up easily later, once we’re ready to optimize
witty-crayon-22786
03/17/2020, 6:08 PMwitty-crayon-22786
03/17/2020, 6:08 PMhundreds-father-404
03/17/2020, 6:10 PMthe simplest thing possible is to directly request the fields you need, one by oneNegative. There are currently no rules to say “Get me a Compatibility field”. That would involve defining an entirely new mechanism and a bunch of rules to say
Target -> Compatibility
.
The simplest possible implementation is that you have a plain and simple Target
and you use pure Python to say tgt.get(Compatibility)
. As a general principle, avoiding the engine is generally simpler than using itwitty-crayon-22786
03/17/2020, 6:11 PMwitty-crayon-22786
03/17/2020, 6:12 PMhundreds-father-404
03/17/2020, 6:14 PMdetermine_source_files.py
as an abstraction over hydrating Sources
already. So that leaves us Bundles
, which we’re ignoring, and `Dependencies, which is unclear what hydration should look like and sometimes won't even be necessary because its sanitizied_raw_value
is already a Tuple[Address, …]
, which is useful
adding an abstraction that allows more than one to be invoked at once seems to be what this PR is aboutSee https://pantsbuild.slack.com/archives/C0D7TNJHL/p1584467587332900?thread_ts=1584466038.329400&cid=C0D7TNJHL again. This PR is not in any way about adding sugar to hydrating `AsyncField`s. It is solely about “I have a bunch of Targets and I want to filter out the ones that can be used by this downstream rule, based on which
Fields
it has defined.”witty-crayon-22786
03/17/2020, 6:16 PMwitty-crayon-22786
03/17/2020, 8:03 PMhundreds-father-404
03/17/2020, 8:08 PMwitty-crayon-22786
03/17/2020, 8:08 PMhappy-kitchen-89482
03/17/2020, 8:09 PMwitty-crayon-22786
03/17/2020, 8:10 PMhundreds-father-404
03/17/2020, 8:10 PMIt seems like we could land this as-is so that we can make other progress on top of it todayThat would be great. I’d love to land this PR so that I can put up the followup to make
HydratedStruct -> Target
a rule and to add the targets()
end-point to register.py
witty-crayon-22786
03/17/2020, 8:10 PMhundreds-father-404
03/17/2020, 8:12 PMdoing that and then removing that boilerplate seems clearerI’m not sure what the first
that
refers to, but the plan is to put up that followup PR right away
We aren’t committing to anything with these rapid-fire Target API PRs. I’m trying to break it down into digestible chunks, which means that much of it is not the final API and is very subject to changewitty-crayon-22786
03/17/2020, 8:13 PMthat
is: "a loop over the fields i want" vs "concrete class that describes my loop"hundreds-father-404
03/17/2020, 8:14 PMTarget
, loop over all of its registered `Field`s to see if the Target
has those 2 fields and then grab those 2 fields to be consumed by the rule. Right?
The precise mechanism can change still. We only need some mechanism to be able to build off of it.witty-crayon-22786
03/17/2020, 8:15 PMGiven ayes. my feeling is that we should do that directly in consumers for now, loop over all of its registered `Field`s to see if theTarget
has those 2 fields and then grab those 2 fields to be consumed by the rule. Right?Target
witty-crayon-22786
03/17/2020, 8:15 PMhundreds-father-404
03/17/2020, 8:28 PMyes. my feeling is that we should do that directly in consumers for nowSure, we could change the PR to directly use
tgt.get(EntryPoint)
, rather than using this PythonBinaryFields
intermediate type.
But, I think we’d agree that we want the input to this rule to have fine-grained caching, that is to only depend on the typed `Field`s that are specifically needed for that rule to run. Even though this PR doesn’t change the rule’s input, the next followup does to have precisely that fine grained caching by having the rule request PythonBinaryFields
.
Agreed that we might end up dropping PythonBinaryFields
, but in the meantime, by having it, we make progress on two goals:
1. Given a target, check if it has X fields then consume those fields.
2. Have the input to the rule be only the combination of fields needed to run for better caching.
If we remove PythonBinaryFields
and directly call tgt.get(EntryPoint)
in the rule body, then we would at least be solving goal #1 and it would be some progress to land. With the current PR, we would be solving both #1 and #2, which is even more progress (even though we might change the mechanism for how we do that)witty-crayon-22786
03/17/2020, 8:30 PMwitty-crayon-22786
03/17/2020, 8:30 PMget
for all fieldshundreds-father-404
03/17/2020, 8:34 PM@rule
async def convert_python_binary_target(adaptor: PythonBinaryAdaptor) -> PythonBinaryFields:
wrapped_tgt = await Get[WrappedTarget](Address, adaptor.address)
return PythonBinaryFields.create(wrapped_tgt.target)
Then, have create_python_binary
take the param PythonBinaryFields
rather than PythonBinaryTargetAdaptor
.
That change is what I mean by goal #2 of fine grained caching. Currently, with create_python_binary
using PythonBinaryTargetAdaptor
as its param, that means that any changes to the target adaptor will invalidate the rule’s cache, even if it’s irrelevant.
We could alternatively achieve this with your proposal of changing the rule signature to
def create_python_binary(fields: SelectFields(EntryPoint, PythonBinarySources)
as you suggest. Maybe we end up tomorrow refactoring this to use SelectFields
. Even if we throw away PythonBinarySources
tomorrow, this gets us closer to achieving goal #2 and helps to guide the discussionwitty-crayon-22786
03/17/2020, 8:37 PMhundreds-father-404
03/17/2020, 8:38 PMPythonBinaryFields
vs. SelectFields
, we need to stop requesting PythonBinaryAdaptor
in the rule signature)witty-crayon-22786
03/17/2020, 8:38 PMhundreds-father-404
03/17/2020, 8:39 PMwhich is fine, but why can’tBecause the rule signature is what matters. The motivation is to getjust inline those two lines?create_python_binary
create_python_binary
to stop requesting PythonBinaryAdaptor
in its signature to instead request in its signature only the Typed Fields it needs. The rule needs to be able to work with targets like MyCustomPythonBinary
hundreds-father-404
03/17/2020, 8:40 PMthe abstraction in general might be premature if the alternative is to inline those linesI disagree because of the above points about needing to be extensible / work with custom target types, and to have fine grained caching. We must have some mechanism for the rule’s signature to say “I only need these 2 fields”. That cannot be inlined in the downstream rule. We’re describing two different approaches to that mechanism, but either way, we need a mechanism to do this.
witty-crayon-22786
03/17/2020, 8:42 PMWe must have some mechanism for the rule’s signature to say “I only need these 2 fields”.this might be a root of the misunderstanding: ... it does not need to be part of the signature of the rule if it is declared in the Gets for the rule (ie by inlining that code).
witty-crayon-22786
03/17/2020, 8:44 PMwitty-crayon-22786
03/17/2020, 8:44 PMhundreds-father-404
03/17/2020, 8:45 PMthis might be a root of the misunderstanding: ... it does not need to be part of the signature of the rule if it is declared in the Gets for the rule (ie by inlining that code).Sure, fwit, the plan is for `@goal_rule`s to request
Targets
. This is about what makes sense for downstream rules.
I think it’s much simpler for create_python_binary
to say “I need these 2 fields, get it for me through whatever engine magic” than for it to say “get me a generic Target and I’ll try to do something meaningful with it. If you happened to give me a `PythonTests`…throw an exception? no-op?”hundreds-father-404
03/17/2020, 8:46 PMa goal could request unhydrated targets, and then only hydrate the ones it cares about, field at a time.Agreed, this is precisely the plan
witty-crayon-22786
03/17/2020, 8:46 PMAgreed, this is precisely the planthen let's land that, exactly, and follow up with improved abstraction, i think
witty-crayon-22786
03/17/2020, 8:58 PMI think it’s much simpler forcreate_python_binary
to say “I need these 2 fields, get it for me through whatever engine
magic” than for it to say “get me a generic Target and I’ll try to do
something meaningful with it. If you happened to give me a `PythonTests`…throw an exception? no-op?”i'm not suggesting raising exceptions. i'm suggesting having create_python_binary filter to the targets that have the field.
witty-crayon-22786
03/17/2020, 8:59 PMhundreds-father-404
03/17/2020, 9:01 PMhundreds-father-404
03/17/2020, 9:01 PMwitty-crayon-22786
03/17/2020, 9:06 PMwitty-crayon-22786
03/17/2020, 9:06 PMhundreds-father-404
03/17/2020, 9:07 PMwitty-crayon-22786
03/17/2020, 9:07 PMhundreds-father-404
03/17/2020, 9:08 PMwitty-crayon-22786
03/17/2020, 9:08 PMwitty-crayon-22786
03/17/2020, 9:09 PMwitty-crayon-22786
03/17/2020, 9:09 PMwitty-crayon-22786
03/17/2020, 9:10 PMhundreds-father-404
03/17/2020, 9:11 PMhundreds-father-404
03/17/2020, 9:45 PMhundreds-father-404
03/18/2020, 2:50 AMTargets
. Finishes changing binary
and run
to exclusively use the Target API!
As expected, it causes some good changes from #9316. One of them is that I was wrong about PythonBinaryFields
being a RootRule
. Instead, it’s a union member
Also @witty-crayon-22786 tomorrow at 8 am works. My friend is going straight home because of a snow storm at her home in Flagstaff tonight that they want to avoid.