# development
h
Hey @witty-crayon-22786, I've been thinking a lot about environment matchers and want to make sure I understand the motivating use cases:
I thought of two situations where you want "fallbacks", aka "fuzzy matching", beyond what we already have with `__local__`:
1. Remote execution gets disabled via the global option. You may want to use an equivalent Docker image locally, or be fine using local.
2. When using Docker locally, Jacob Floyd has explained he wants to use local if possible (i.e. on Linux).
Am I missing anything? That covers the user stories I came up with at the start of the project, and your motivation in the appendix:
In order to transparently support remote execution where possible (while falling back to other environments if necessary)
w
yea, that sounds right. essentially, allowing tests/packages to be lenient about where they run/are-built.
👍 1
allowing / preferring
👍 1
h
Given the above that we only have two known cases for environments wanting fallbacks (remote execution disabled, and docker preferring local if possible), design for how to do this:
the key insight for me was that ~95% of use cases w/ environments is consuming them by setting `environment`. Much less common is defining the environment. So, optimize for that 95% use case being simple.
Rather than inventing a new DSL for the `environment` field, like `platform:linux_x86_64` and `image:centos6`, add this to the environment target definitions:
docker_environment(prefer_local_environment=True)
remote_environment(fallback="centos6")
remote_environment(fallback="__local__")  # the default
remote_environment(fallback=None)  # i.e. error
Then consuming targets still set `environment="centos6"` or `environment="remote"`, while getting the benefits of fallbacks.
This gives less flexibility for consumers to choose different fallback behaviors, but a) I doubt that is a common requirement, and b) it is still possible by creating 2+ `docker_environment` or `remote_environment` targets with different fallback behaviors. (They can use Python syntax in BUILD files or defaults for DRY.)
A small but neat benefit of this approach rather than the `environment` field DSL is that it allows you to express "prefer RE, then fallback to local if a compatible platform, else Docker" (thanks to `prefer_local_environment`).
cc @ancient-vegetable-10556 @happy-kitchen-89482 I think fortunately this design would be very easy to implement
👍 1
ohhh and I think this design is very future proof with speculation! You can enable speculation by changing the `remote_environment` targets, maybe w/ a global default via options -- no need to update call sites
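A minimal Python sketch of the `remote_environment(fallback=...)` semantics proposed above, to make the three cases concrete. The `RemoteEnv` class, the `resolve` function, and the `remote_execution_enabled` flag are hypothetical names for illustration only; they are not the Pants implementation, and only the `fallback` values mirror the proposal.

```python
from dataclasses import dataclass
from typing import Optional

LOCAL = "__local__"


@dataclass(frozen=True)
class RemoteEnv:
    """Hypothetical stand-in for a remote_environment target."""

    name: str
    # Name of the environment to use when remote execution is unavailable.
    # None means "error instead of falling back".
    fallback: Optional[str] = LOCAL


def resolve(env: RemoteEnv, remote_execution_enabled: bool) -> str:
    """Return the name of the environment a consumer would actually run in."""
    if remote_execution_enabled:
        return env.name
    if env.fallback is None:
        raise ValueError(
            f"remote execution is disabled and {env.name!r} defines no fallback"
        )
    return env.fallback


remote = RemoteEnv(name="remote", fallback="centos6")
print(resolve(remote, remote_execution_enabled=True))   # remote
print(resolve(remote, remote_execution_enabled=False))  # centos6
```

Consumers keep writing `environment="remote"` either way; only the target definition decides what happens when remote execution is off.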
w
thanks!
> A small but neat benefit of this approach rather than the `environment` field DSL is that it allows you to express “prefer RE, then fallback to local if a compatible platform, else Docker” (thanks to `prefer_local_environment`)
that would be equivalent to defining three environments, and then matching them in order, i think?
basically: i think that this would result in a linked list of environments, where if there was order to their definitions, it would directly be a list (in the declared order)
h
> I think fortunately this design would be very easy to implement
Woot, took a whole 15 minutes to implement remote environment 🚀 https://github.com/pantsbuild/pants/pull/16955
Docker should be easy to add before logging off tonight, if we like the design
w
but speculation actually maybe makes it look more like a tree than a linked list? so maybe that’s an advantage?
> but speculation actually maybe makes it look more like a tree than a linked list? so maybe that’s an advantage?
well… no, maybe not. you don’t need a tree… rather, two lists of environments to speculate with.
h
> that would be equivalent to defining three environments, and then matching them in order, i think?
You need a syntax to express "only Docker if you can't use local". So I don't think ordering that list as RE -> Local -> Docker would work, because local always is a match
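To illustrate the point about ordering, here is a small sketch of plain first-match resolution over an ordered list. Everything in it (`first_match`, `build_candidates`, the `centos6_docker` name, the platform strings) is a hypothetical illustration, not Pants code. When local unconditionally matches, the Docker entry after it can never be selected; it only becomes reachable once local carries a condition of its own.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, Callable[[], bool]]


def first_match(candidates: List[Candidate]) -> str:
    """Walk the list in order and return the first usable environment."""
    for name, is_usable in candidates:
        if is_usable():
            return name
    raise ValueError("no environment matched")


def build_candidates(
    remote_enabled: bool, local_platform: str, local_always_matches: bool
) -> List[Candidate]:
    """Build the ordered list RE -> Local -> Docker."""
    if local_always_matches:
        local_ok: Callable[[], bool] = lambda: True
    else:
        # Assumed condition: local is only acceptable on Linux x86_64.
        local_ok = lambda: local_platform == "linux_x86_64"
    return [
        ("remote", lambda: remote_enabled),
        ("__local__", local_ok),
        ("centos6_docker", lambda: True),
    ]


# Local always matches, so Docker is unreachable even on macOS.
print(first_match(build_candidates(False, "macos_arm64", True)))   # __local__
# With a condition on local, Docker is picked when local is not compatible.
print(first_match(build_candidates(False, "macos_arm64", False)))  # centos6_docker
```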
w
> You need a syntax to express “only Docker if you can’t use local”. So I don’t think ordering that list as RE -> Local -> Docker would work, because local always is a match
right: the syntax i had suggested in the Appendix was something like `platform: linux`
yea, i think that this will work, in the sense that you still end up with a list that you match in order. it seems like it might allow for more different lists to be encoded though (which is probably good?), because each list head that you start walking from might send you somewhere different
h
@witty-crayon-22786 what's the advantage of having each consumer use this new DSL, vs the approach in https://github.com/pantsbuild/pants/pull/16955?
> i think that this will work
what does "this" mean here?
w
> what does “this” mean here?
your design
h
ah, cool 😄
> because each list head that you start walking from might send you somewhere different
I was thinking only adding `fallback_environment` to `remote_environment`. So you could go RE -> RE -> RE -> Docker -> Local, theoretically, but that seems weird.
For Docker, it doesn't seem necessary to fallback to another Docker env? And falling back to an RE env seems weird, that you'd prefer RE. Hence why I was thinking Keep It Simple and have `prefer_local_environment: bool`
w
> @witty-crayon-22786 what’s the advantage of having each consumer use this new DSL, vs the approach in https://github.com/pantsbuild/pants/pull/16955?
there may not be one. i think that the impact of your design is that the environment `name` becomes even more central, because it essentially becomes a bit like a strategy name: your `centos6` environment might actually be defined as “remote centos6 if available, then local iff centos6, then centos6 docker image”
💯 1
> For Docker, it doesn’t seem necessary to fallback to another Docker env? And falling back to an RE env seems weird, that you’d prefer RE. Hence why I was thinking Keep It Simple and have `prefer_local_environment: bool`
where does that field go?
(i do think that having to pre-decide which fallbacks are “legal/reasonable” ahead of time is vaguely a risk… but at the extreme, all environments might end up with some sort of fallback eventually if we find out about enough use-cases. so not a blocker)
👍 1
h
cool, I'm thinking the same thing 🙂
> where does that field go?
ah, `docker_environment(prefer_local_environment: bool)`, which means that if the local Platform == the `docker_environment(platform)` field, then use `__local__`
> then local iff centos6
This part won't be as powerful because I don't think we have enough information from a `docker_environment` target to match things like OS version. We only have the `DockerPlatformField` + `Platform.create_for_localhost()` to go off of.
We could add more fields to `docker_environment` to make the matching more precise, if we found there was a need. Straw design, `prefer_local_if_is_os=("centos", "6.*")`
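A rough sketch of the platform-only check described above: use `__local__` when the Docker target's platform equals the local platform, otherwise use the Docker environment. The `DockerEnv` class, the `choose` function, the image name, and the platform strings are hypothetical stand-ins for illustration; the real comparison would go through `DockerPlatformField` and `Platform.create_for_localhost()`, whose details are not shown in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DockerEnv:
    """Hypothetical stand-in for a docker_environment target."""

    name: str
    image: str
    platform: str
    prefer_local_environment: bool = False  # defaults to False, as proposed


def choose(env: DockerEnv, local_platform: str) -> str:
    """Pick __local__ only when opted in and the platforms are identical."""
    if env.prefer_local_environment and env.platform == local_platform:
        return "__local__"
    return env.name


centos6 = DockerEnv(
    name="centos6",
    image="centos:6",  # illustrative image name
    platform="linux_x86_64",
    prefer_local_environment=True,
)
print(choose(centos6, local_platform="linux_x86_64"))  # __local__
print(choose(centos6, local_platform="macos_arm64"))   # centos6
```

Note that this matches on platform alone, so any Linux x86_64 host counts as "compatible" regardless of distribution or OS version, which is the limitation called out above.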
w
> We could add more fields to `docker_environment` to make the matching more precise, if we found there was a need. Straw design, `prefer_local_if_is_os=("centos", "6.*")`
that gets into the matching syntax… but matching inside the targets and giving them names might still be better from a user interface perspective of `environment=…` being simple.
👍 1
h
Agreed. I'm thinking we start with only matching on Platform for `prefer_local_environment` for Docker, which meets the use case Jacob gave us. It defaults to False, and our `help` will warn it solely matches by platform.
I'm happy to name that field something conservative to mention the platform, so we can reserve more powerful matching in the future
w
one thing with `prefer_local` is that it’s a bit odd to have the edge defined in the other direction: `remote_environment` defines an outbound edge, this is an inbound edge
if your platform matching syntax ended up being powerful enough, then you could put it on `local_environment`, and have that fall back to docker if need be?
i.e., start by matching “is centos6” against the local environment, then fall back to docker if not
h
That's interesting. I was about to say I think we want to consider what makes sense for consuming targets to set in BUILD files, but I realized that's pretty irrelevant. Consumers only need to know about the `name` from `[environments].preview`. It doesn't really matter how the name `centos6_or_local_linux` is defined, that's abstracted from consumers. What matters more is gently encouraging the environment authors to use good names
> one thing with `prefer_local` is that it’s a bit odd to have the edge defined in the other direction: `remote_environment` defines an outbound edge, this is an inbound edge
weird in what way? my head story is: using Docker is a niche feature, whereas local environment is the baseline. Have the niche config live on the Docker target
I'm also not following how it's different than `remote_environment(fallback_environment)`? It's the same general idea of "Maybe use this env, but fallback to X if possible or if required"
w
> It’s the same general idea of “Maybe use this env, but fallback to X if possible or if required”
it’s the inverse: it’s “use that environment, but fall back to this one if not possible”
👍 1
docker might be niche, but having the edges flowing in different directions would seem to make things more challenging to explain
h
how so?
w
@hundreds-father-404: well, you got confused by what i meant by inbound/outbound edges… and i got confused by the design initially (probably at least in part due to this)
it’s just inconsistent i think.
more generally: i think that for the `local_environment` to grow more useful over time, the matching syntax you described (`prefer_local_if_is_os`, etc) would also need to be on the `local_environment`: i.e.: use this local environment if centos6, else that local environment
h
> i think that for the `local_environment` to grow more useful over time, the matching syntax you described (`prefer_local_if_is_os`, etc) would also need to be on the `local_environment`
agreed. I think we could probably avoid that `prefer_local_if_is_os` as scope creep until a user needs it. but we should make sure the design can accommodate it
w
er: to be clear: i was saying that that is a motivation to put the definition of the match on the `local_environment`, not on the `docker_environment`
h
oh btw, semi related update: I was planning on no longer erroring if a `local_environment` isn't defined for your current platform. That means just use the subsystem options as a default. Otherwise, if you define 1 local env for e.g. M1s, we'd be forcing you to define for everything else, which may not be necessary
> er: to be clear: i was saying that that is a motivation to put the definition of the match on the `local_environment`, not on the `docker_environment`
ah, yeah, we'd need to replace `compatible_platforms` on `local_environment` with something more powerful
w
which relates a bit to the fact that the `__local__` environment is not currently necessarily the beginning of a chain: we start by matching one via the platform
So if you had two local Linuxes and wanted to prefer one to the other, we'd need a different resolution for `__local__`
h
so, with the current simplistic Platform matching, this would look like?
local_environment(
   compatible_platforms=["linux_x86_64"],
   fallback_if_not_valid="centos6",
)
python_tests(environment="linux_or_centos6_docker")
> So if you had two local Linuxes and wanted to prefer one to the other, we'd need a different resolution for `__local__`
What does that mean?
w
(or maybe we get rid of `__local__`, and it's always an explicit name?)
h
> (or maybe we get rid of `__local__`, and it's always an explicit name?)
I think we need `__local__` to handle the user story "macOS vs Linux Python interpreter config" https://docs.google.com/document/d/1vXRHWK7ZjAIp2BxWLwRYm1QOKDeXx02ONQWvXDloxkg/edit#heading=h.3mi2qi3bz335
w
> So if you had two local Linuxes and wanted to prefer one to the other, we’d need a different resolution for local
> What does that mean?
if you define a `local_environment` that matches centos6, and another `local_environment` for if not centos6, then currently things would explode
> I think we need `__local__` to handle the user story "macOS vs Linux Python interpreter config"
that could be handled by chaining from an explicit first node
h
Okay I'm having a hard time following this tbh. It might make more sense to me tomorrow morning, but would you mind maybe sketching it out with arrows, like `local Linux x86 -> Docker centos6`?
w
strawsyntax but:
local_environment(name="default", platform="macOS", fallback_to="linux", ..)
local_environment(name="linux", platform="linux", ..)
…and then your default environment would be `default`
👍 1
> It might make more sense to me tomorrow morning, but would you mind maybe sketching it out with arrows, like `local Linux x86 -> Docker centos6`?
local_environment(name="default", platform="linux", more_specifically="centos6", fallback_to="any_linux")
local_environment(name="any_linux", platform="linux")
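The "chaining from an explicit first node" idea in the strawsyntax above can be sketched as a walk over a linked list of environments. This is an illustration under assumptions, not Pants code: `LocalEnv`, `resolve_chain`, and the plain string platform comparison are hypothetical, and the extra `more_specifically` matcher is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LocalEnv:
    """Hypothetical stand-in for a local_environment target."""

    name: str
    platform: str
    fallback_to: Optional[str] = None


def resolve_chain(envs: Dict[str, LocalEnv], start: str, current_platform: str) -> str:
    """Follow fallback_to links from `start` until an environment matches."""
    seen = set()
    name: Optional[str] = start
    while name is not None:
        if name in seen:
            raise ValueError(f"cycle in fallback chain at {name!r}")
        seen.add(name)
        env = envs[name]
        if env.platform == current_platform:
            return env.name
        name = env.fallback_to
    raise ValueError(
        f"no environment reachable from {start!r} matches {current_platform!r}"
    )


envs = {
    "default": LocalEnv(name="default", platform="macOS", fallback_to="linux"),
    "linux": LocalEnv(name="linux", platform="linux"),
}
print(resolve_chain(envs, "default", current_platform="macOS"))  # default
print(resolve_chain(envs, "default", current_platform="linux"))  # linux
```

Because each starting name owns its own chain, two different starting environments can encode two different fallback orders over the same set of targets.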
h
I think that design would violate this, which imo is a requirement. Chris had great intuition that adopting environments should be easy, not require changing a ton of things at once: https://pantsbuild.slack.com/archives/C0D7TNJHL/p1663800423438789?thread_ts=1663790578.352079&cid=C0D7TNJHL
w
between adopting environments being easy, and not being able to model valid use cases, i think we need to bias toward being able to model valid use cases. if we have another design for achieving https://pantsbuild.slack.com/archives/C0D7TNJHL/p1663801318408959?thread_ts=1663790578.352079&cid=C0D7TNJHL then ok