<@U06A03HV1> <https://pantsbuild.slack.com/archive...
# development
a
@witty-crayon-22786 https://pantsbuild.slack.com/archives/C0D7TNJHL/p1678315016165419?thread_ts=1678304144.161169&amp;cid=C0D7TNJHL I think we may want to split out the “inherent” environment name and “supplied” environment names as a concept, as I’m seeing a few classes of problems that arise from them being conceptually the same
w
hm. “inherent” is only a thing in RuleRunner
to avoid having to rewrite all RuleRunner tests to be environment aware
a
By “inherent”, I mean the one that’s attached to the rule subgraph
w
yea, that’s something else.
a
It didn’t have a name I could quickly draw upon
is “embedded” a good word?
w
ok, so then ignoring
RuleRunner
, you’re thinking of “supplied by a `Get`” vs “supplied by a `Query`” ?
a
So at the moment, there’s the
EnvironmentName
that would get automatically passed in as a param to the rule if one were not supplied explicitly, and there’s the
EnvironmentName
that gets supplied explicitly in a
Get
w
the former is “supplied by a `Query`”, yea.
a
Cool, I’ll call that the “Embedded” environment for brevity
w
i.e., when invoking the engine with
product_request
a
There are cases where the
EnvironmentName
needs to be passed in explicitly (Josh’s example;
BuiltPackage
, etc), but because we don’t separate the concepts of an explicitly-supplied environment and the Embedded environment, you can call the rule without supplying an
EnvironmentName
, it’ll work, and it’ll be wrong
w
well, backing up a step: that’s actually because of the RuleRunner’s default arguments.
and it’s to avoid needing to change ~all tests
a
(and it’s wrong in subtle ways, as evidenced by the
BuiltPackage
bugs that we experienced)
w
i don’t think that it’s related to the
BuiltPackage
bugs… but i could be wrong.
a
That’s why I’m saying there are classes of bugs that I feel have a similar root cause
(Latent environment is the word I’m looking for)
w
maybe. RuleRunner having a default environment present is different from production usage, where usually “the environment that is in scope” is the one that you want to be using 98% of the time. the 2% boundary cases (i’m building for this environment, but this dependency needs to be built for that environment) are the tricky ones. a nascent idea is to use param “masking” to prevent the EnvironmentName from being passed through an API. but you still have to choose where to do that: it’s opt out.
if it were opt-in rather than opt-out, we would have to change ~all
@rule
definitions everywhere to indicate that they want to pass the
EnvironmentName
through
a
I actually think the answer is somewhat simpler, which is that it’s kept opt-out, but we mark `EnvironmentName`/`EnvironmentTarget` as “playing with fire”, and add in a
SuppliedEnvironmentName
type, which we use for things like the
BuiltPackage
boundary
So that you can’t trigger the rule with the latent
EnvironmentName
param. You need to explicitly construct one
(but that doesn’t automatically pass the param down the subgraph…
like I said, half-baked)
w
hm… maybe. that’s akin to changing the API of that
@union
to specify that
SuppliedEnvironmentName
will be in scope: https://github.com/pantsbuild/pants/blob/f5710a2856d73d2c68c06298d87da11b49c5a8bb/src/python/pants/core/goals/package.py#L38 or removing the
EnvironmentName
from the
@union(in_scope_types=..)
, and requiring that
PackageFieldSet
implementations have and use an
environment
field.
a
The want is that
SuppliedEnvironmentName
would be needed to invoke the first rule, but that’s what provides the in-scope
EnvironmentName
from then on
basically, some sort of way for rules to say that they’re environment-aware, but that they can’t be called with a latent environment. (But the environment that they’re called with becomes the latent environment for rules that it calls)
w
that is exactly what the
@union(in_scope_types=..)
is supposed to be for.
a
But it doesn’t seem to work that way in this case?
You can declare
BuiltPackage
implementers as out of scope, but then every rule that you call has to explicitly supply an environment
w
it does: we’ve just explicitly said “an EnvironmentName will be present here”. if you wanted some other type to be present (or remove the guarantee), you could remove it from that list.
yes.
a
Right, I want it to be “An EnvironmentName will be present here and it must be one that was asked for by a `Get`”
w
mm. yea,
in_scope_types
could probably be tweaked to enforce that.
…or a different list of types could be added, probably.
see https://github.com/pantsbuild/pants/pull/16717 for the ticket that implemented it
a
I think it would also be good to be able to steer people away from using `EnvironmentName`/`EnvironmentTarget` in their rule code where possible, but I need to think more about use cases
c
the
PackageFieldSet
has the
in_scope_types
defined, if we remove that, wouldn’t that require all rules dealing with it to also explicitly provide the
EnvironmentName
? https://github.com/pantsbuild/pants/blob/f5710a2856d73d2c68c06298d87da11b49c5a8bb/src/python/pants/core/goals/package.py#L38-L40
no need then for a new type…
w
yes. that is what i was suggesting above, but Chris is right that what we want is the caller to have to supply it… just explicitly, rather than implicitly
c
isn’t it the caller already that has to supply it?
a
no
w
it just has to be in scope… either explicitly or implicitly.
a
Bascially, at the moment, you either can get it automatically (which leads to bugs), or the caller supplies it (which leads to every single
Get
having to supply
EnvironmentName
, which is messy)
w
you either can get it automatically (which leads to bugs)
to be clear: it’s still the case that in 98% of positions, you want it implicitly/automatically. i think that it will only be some `@union`s where that is not the case.
a
and when I say “it leads to bugs”, I mean “it can lead to bugs in cases where the given rule actually needs to be at the boundary
c
So, if we don’t have the
in_scope_types
it would be:
Copy code
built_package = await Get(BuiltPackage, {fs: PackageFieldSet, env: EnvironmentName})"
right?
w
no: if you don’t have the in_scope_types, it could never be present in the callee***.
callee*
c
oh… so even if you provide it explicitly it’s not used
w
they would need to compute one themselves
um. i’m not sure actually.
a
No, it’ll be used, at minimum if it’s explicitly a parameter in the rule you call
but if it is in scope, then it’ll get attached to the rule, even if it’s not used as a parameter
(reasoning about environments is a mindfuck; sorry)
c
isn’t the example above what we’d want to work, though?
afk:dinner
isn’t the example above what we’d want to work, though?
yes. i think that another way to have implemented
@union(in_scope_types=..)
would have been to make it an argument to
Get
instead.
a
But then we’d be passing `EnvironmentName`s around everywhere, and that’s not great
w
i.e. imaging a syntax like
built_package = await Get(BuiltPackage, {fs: PackageFieldSet}, implicitly=[EnvironmentName])
well, sortof. it would move the “this
@union
usage provides the `EnvironmentName`” declaration from
@union
onto the `Get`s that used the
@union
.
a
the problem with that is that it doesn’t cover the
BuiltPackage
case where a rule wants you to explicitly specify the environment
unless a rule can mark a parameter as must be explicit
w
@ancient-vegetable-10556: as Andreas said, that is already expressed by
await Get(BuiltPackage, {fs: PackageFieldSet, env: EnvironmentName})
it’s not the callee that declares the interface here: it’s the caller
or, that’s one option, i suppose.
i guess that
@union(in_scope_types=..)
is more like the callee declaring the interface, vs the caller.
👍 1
c
I see what @ancient-vegetable-10556 means though, that my example is what how it would be used, but the callee wants some syntax to declare how it’s supposed to being called, so we can catch callers not doing the right thing.
but, is there some boundary across unions that isn’t present between regular rules and their `Get`s?
say I have the rules A -> B -> C. If E is present going into A, and then later needed as param to C, that would work and B wouldn’t have to provide it, right. But this doesn’t seem to be true when it comes to unions (unless they use the in scope types thing)
a
That is currently what’s going on
(if in scope types is used)
w
correct. `@union`s express a particular limited API
the PR that added
in_scope_types
resolved https://github.com/pantsbuild/pants/issues/12934
c
I agree with the comment about keeping `Get`s uniform and unaware of
@union
specifics.. need to ponder a bit longer on what the unions bring for pecularities that gives them a “particular limited API”
w
one way to think about it is that after rule graph solving, a
@rule
ends up with an expanded signature that includes all of the dependency parameters that it needs transitively… on the other hand, use of a
@union
effectively has a statically declared signature (this output type for exactly these input types)
c
ohoh… right. thanks that makes a lot of sense now!
since the union effectively is a rule graph branching mechanism decided in runtime
w
yea
c
so… just throwing weird ideas around, what if the input params could be satisfied post-fact, i.e. when needed rather than up front before running a rule?
w
what do you mean by “satisfied”?
c
what is in scope, I guess.
realize that maybe what I just said would be equivalent to have everything in scope everywhere all the time, which maybe is undesirable..
w
the reason that we need to pre-compute the transitive dependencies of a
@rule
is that they become part of the “key”/“identity” for the `@rule`: that’s how memoization works.
c
devil in details 😛
w
for example: we need to know that a
@rule
consumes the environment, so that the environment can be included in its identity, and two copies are created for two different environments
c
ah, and in there lay the answer to the unions “limited API” I guess..