Hey @witty-crayon-22786, I've been thinking a lot about env... (Pants #development)

hundreds-father-404

09/21/2022, 8:02 PM

Hey @witty-crayon-22786, I've been thinking a lot about environment matchers and want to make sure I understand the motivating use cases:

hundreds-father-404

09/21/2022, 8:03 PM

I thought of two situations where you want "fallbacks" aka "fuzzy matching", beyond what we already have with `__local__`:
1. Remote execution gets disabled via the global option. You may want to use an equivalent Docker image locally, or be fine using local.
2. When using Docker locally, Jacob Floyd has explained he wants to use local if possible (i.e. on Linux).

Am I missing anything? That covers the user stories I came up with at the start of the project, and your motivation in the appendix:

In order to transparently support remote execution where possible (while falling back to other environments if necessary)

witty-crayon-22786

09/21/2022, 8:05 PM

yea, that sounds right. essentially, allowing tests/packages to be lenient about where they run/are-built.

👍 1

witty-crayon-22786

09/21/2022, 8:06 PM

allowing / preferring

👍 1

hundreds-father-404

09/21/2022, 9:24 PM

Given the above, that we only have two known cases for environments wanting fallbacks (remote execution disabled, and Docker preferring local if possible), here is a design for how to do this:

hundreds-father-404

09/21/2022, 9:24 PM

the key insight for me was that ~95% of use cases w/ environments is consuming them by setting `environment`. Much less common is defining the environment. So, optimize for that 95% use case being simple.

Rather than inventing a new DSL for the `environment` field, like `platform:linux_x86_64` and `image:centos6`, add this to the environment target definitions:

docker_environment(prefer_local_environment=True)
remote_environment(fallback="centos6")
remote_environment(fallback="__local__")  # the default
remote_environment(fallback=None)  # i.e. error

Then consuming targets still set `environment="centos6"` or `environment="remote"`, while getting the benefits of fallbacks
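
A minimal sketch of how the proposed `fallback` field could resolve, assuming remote execution is toggled by a single boolean. `RemoteEnv`, `resolve_remote`, and the `remote_execution_enabled` flag are hypothetical names for illustration, not the Pants implementation:

```python
from dataclasses import dataclass
from typing import Optional

LOCAL = "__local__"


@dataclass(frozen=True)
class RemoteEnv:
    """Hypothetical stand-in for a `remote_environment` target."""

    name: str
    # Name of the environment to use when remote execution is disabled.
    # `None` means "error instead of falling back".
    fallback: Optional[str] = LOCAL


def resolve_remote(env: RemoteEnv, remote_execution_enabled: bool) -> str:
    """Return the name of the environment that should actually be used."""
    if remote_execution_enabled:
        return env.name
    if env.fallback is None:
        raise ValueError(
            f"Remote execution is disabled and `{env.name}` defines no fallback."
        )
    return env.fallback
```

Consumers would keep writing `environment="remote"`; only the target definition decides what happens when remote execution is off.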

hundreds-father-404

09/21/2022, 9:25 PM

This gives less flexibility for consumers to choose different fallback behaviors, but a) I doubt that is a common requirement, and b) it is still possible by creating 2+ `docker_environment` or `remote_environment` targets with different fallback behaviors. (They can use Python syntax in BUILD files or defaults for DRY)

hundreds-father-404

09/21/2022, 9:25 PM

A small but neat benefit of this approach rather than the `environment` field DSL is that it allows you to express "prefer RE, then fallback to local if a compatible platform, else Docker" (thanks to `prefer_local_environment`)

hundreds-father-404

09/21/2022, 9:25 PM

cc @ancient-vegetable-10556 @happy-kitchen-89482 I think fortunately this design would be very easy to implement

👍 1

hundreds-father-404

09/21/2022, 9:32 PM

ohhh and I think this design is very future proof with speculation! You can enable speculation by changing the `remote_environment` targets, maybe w/ a global default via options -- no need to update call sites

witty-crayon-22786

09/21/2022, 9:56 PM

thanks!

A small but neat benefit of this approach rather than the `environment` field DSL is that it allows you to express “prefer RE, then fallback to local if a compatible platform, else Docker” (thanks to `prefer_local_environment`)

that would be equivalent to defining three environments, and then matching them in order, i think?

witty-crayon-22786

09/21/2022, 9:57 PM

basically: i think that this would result in a linked list of environments, where if there was order to their definitions, it would directly be a list (in the declared order)

hundreds-father-404

09/21/2022, 9:57 PM

I think fortunately this design would be very easy to implement

Woot, took a whole 15 minutes to implement remote environment 🚀 https://github.com/pantsbuild/pants/pull/16955

Docker should be easy to add before logging off tonight, if we like the design

witty-crayon-22786

09/21/2022, 9:58 PM

but speculation actually maybe makes it look more like a tree than a linked list? so maybe that’s an advantage?

witty-crayon-22786

09/21/2022, 9:59 PM

but speculation actually maybe makes it look more like a tree than a linked list? so maybe that’s an advantage?

well… no, maybe not. you don’t need a tree… rather, two lists of environments to speculate with.

hundreds-father-404

09/21/2022, 9:59 PM

that would be equivalent to defining three environments, and then matching them in order, i think?

You need a syntax to express "only Docker if you can't use local". So I don't think ordering that list as RE -> Local -> Docker would work, because local always is a match
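
To illustrate the point about ordering, here is a small sketch of first-match-wins over an ordered list. The `Candidate` type and its `matches` predicates are made up for illustration; the only thing taken from the discussion is that local is always a match:

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical environment candidate with a match predicate."""

    name: str
    matches: Callable[[], bool]


def first_match(candidates: List[Candidate]) -> str:
    """Walk the list in declared order and return the first match."""
    for candidate in candidates:
        if candidate.matches():
            return candidate.name
    raise LookupError("no environment matched")


remote_execution_enabled = False  # assumed: RE was disabled globally

ordered = [
    Candidate("remote", lambda: remote_execution_enabled),
    Candidate("local", lambda: True),   # local always matches...
    Candidate("docker", lambda: True),  # ...so this entry is unreachable
]

chosen = first_match(ordered)
```

With a plain ordered list, Docker is only reachable if the local entry carries a condition such as a platform check, which is what `prefer_local_environment` or a matcher like `platform: linux` would supply.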

witty-crayon-22786

09/21/2022, 10:00 PM

You need a syntax to express “only Docker if you can’t use local”. So I don’t think ordering that list as RE -> Local -> Docker would work, because local always is a match

right: the syntax i had suggested in the Appendix was something like `platform: linux`

witty-crayon-22786

09/21/2022, 10:03 PM

yea, i think that this will work, in the sense that you still end up with a list that you match in order. it seems like it might allow for more different lists to be encoded though (which is probably good?), because each list head that you start walking from might send you somewhere different

hundreds-father-404

09/21/2022, 10:04 PM

@witty-crayon-22786 what's the advantage of having each consumer use this new DSL, vs the approach in https://github.com/pantsbuild/pants/pull/16955?

hundreds-father-404

09/21/2022, 10:05 PM

i think that this will work

what does "this" mean here?

witty-crayon-22786

09/21/2022, 10:05 PM

what does “this” mean here?

your design

hundreds-father-404

09/21/2022, 10:07 PM

ah, cool 😄

because each list head that you start walking from might send you somewhere different

I was thinking only adding `fallback_environment` to `remote_environment`. So you could go RE -> RE -> RE -> Docker -> Local, theoretically, but that seems weird.

For Docker, it doesn't seem necessary to fallback to another Docker env? And falling back to an RE env seems weird, that you'd prefer RE. Hence why I was thinking Keep It Simple and have `prefer_local_environment: bool`

witty-crayon-22786

09/21/2022, 10:08 PM

@witty-crayon-22786 what’s the advantage of having each consumer use this new DSL, vs the approach in https://github.com/pantsbuild/pants/pull/16955?

there may not be one. i think that the impact of your design is that the environment `name` becomes even more central, because it essentially becomes a bit like a strategy name: your `centos6` environment might actually be defined as “remote centos6 if available, then local iff centos6, then centos6 docker image”

💯 1

witty-crayon-22786

09/21/2022, 10:09 PM

For Docker, it doesn’t seem necessary to fallback to another Docker env? And falling back to an RE env seems weird, that you’d prefer RE. Hence why I was thinking Keep It Simple and have `prefer_local_environment: bool`

where does that field go?

witty-crayon-22786

09/21/2022, 10:10 PM

(i do think that having to pre-decide which fallbacks are “legal/reasonable” ahead of time is vaguely a risk… but at the extreme, all environments might end up with some sort of fallback eventually if we find out about enough use-cases. so not a blocker)

👍 1

hundreds-father-404

09/21/2022, 10:12 PM

cool, I'm thinking the same thing 🙂

where does that field go?

ah, `docker_environment(prefer_local_environment: bool)`, which means that if the local Platform == `docker_environment(platform)` field, then use `__local__`

then local iff centos6

This part won't be as powerful because I don't think we have enough information from a `docker_environment` target to match things like OS version. We only have the `DockerPlatformField` and `Platform.create_for_localhost()` to go off of.

We could add more fields to `docker_environment` to make the matching more precise, if we found there was a need. Straw design, `prefer_local_if_is_os=("centos", "6.*")`
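
A rough sketch of the two matching levels discussed here: platform-only matching for `prefer_local_environment`, and the straw `prefer_local_if_is_os` idea. The function names and the string-based platform values are assumptions for illustration; real code would presumably compare the `docker_environment` platform field against `Platform.create_for_localhost()`:

```python
import re
from typing import Optional, Tuple

LOCAL = "__local__"


def choose_env(
    docker_env_name: str,
    docker_platform: str,
    local_platform: str,
    prefer_local_environment: bool = False,
) -> str:
    """Use local instead of Docker only when the platforms are identical."""
    if prefer_local_environment and docker_platform == local_platform:
        return LOCAL
    return docker_env_name


def local_os_matches(
    local_os: Tuple[str, str],
    prefer_local_if_is_os: Optional[Tuple[str, str]],
) -> bool:
    """Straw design: match (os_name, version), where version is a regex."""
    if prefer_local_if_is_os is None:
        return False
    want_name, version_pattern = prefer_local_if_is_os
    have_name, have_version = local_os
    if have_name != want_name:
        return False
    return re.fullmatch(version_pattern, have_version) is not None
```

Platform-only matching cannot tell CentOS 6 from any other Linux, which is why the OS-level matcher is sketched as a separate, optional step.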

witty-crayon-22786

09/21/2022, 10:14 PM

We could add more fields to `docker_environment` to make the matching more precise, if we found there was a need. Straw design, `prefer_local_if_is_os=("centos", "6.*")`

that gets into the matching syntax… but matching inside the targets and giving them names might still be better from a user interface perspective of `environment=…` being simple.

👍 1

hundreds-father-404

09/21/2022, 10:15 PM

Agreed. I'm thinking we start with only matching on Platform for `prefer_local_environment` for Docker, which meets the use case Jacob gave us. It defaults to False, and our `help` will warn it solely matches by platform.

I'm happy to name that field something conservative to mention the platform, so we can reserve more powerful matching in the future

witty-crayon-22786

09/21/2022, 10:16 PM

one thing with `prefer_local` is that it’s a bit odd to have the edge defined in the other direction: `remote_environment` defines an outbound edge, this is an inbound edge

witty-crayon-22786

09/21/2022, 10:17 PM

if your platform matching syntax ended up being powerful enough, then you could put it on `local_environment`, and have that fall back to docker if need be?

witty-crayon-22786

09/21/2022, 10:18 PM

i.e., start by matching “is centos6” against the local environment, then fall back to docker if not

hundreds-father-404

09/21/2022, 10:21 PM

That's interesting.

I was about to say I think we want to consider what makes sense for consuming targets to set in BUILD files, but I realized that's pretty irrelevant. Consumers only need to know about the `name` from `[environments].preview`. It doesn't really matter how the name `centos6_or_local_linux` is defined, that's abstracted from consumers.

What matters more is gently encouraging the environment authors to use good names

hundreds-father-404

09/21/2022, 10:25 PM

one thing with `prefer_local` is that it’s a bit odd to have the edge defined in the other direction: `remote_environment` defines an outbound edge, this is an inbound edge

weird in what way? my head story is

using Docker is a niche feature, whereas local environment is the baseline. Have the niche config live on the Docker target

I'm not following also how it's different than `remote_environment(fallback_environment)`? It's the same general idea of "Maybe use this env, but fallback to X if possible or if required"

witty-crayon-22786

09/21/2022, 10:26 PM

It’s the same general idea of “Maybe use this env, but fallback to X if possible or if required”

it’s the inverse: it’s “use that environment, but fall back to this one if not possible”

👍 1

witty-crayon-22786

09/21/2022, 10:27 PM

docker might be niche, but having the edges flowing in different directions would seem to make things more challenging to explain

hundreds-father-404

09/21/2022, 10:31 PM

how so?

witty-crayon-22786

09/21/2022, 10:43 PM

@hundreds-father-404: well, you got confused by what i meant by inbound/outbound edges… and i got confused by the design initially (probably at least in part due to this)

witty-crayon-22786

09/21/2022, 10:43 PM

it’s just inconsistent i think.

witty-crayon-22786

09/21/2022, 10:45 PM

more generally: i think that for the `local_environment` to grow more useful over time, the matching syntax you described (`prefer_local_if_is_os`, etc) would also need to be on the `local_environment`: i.e.: use this local environment if centos6, else that local environment

hundreds-father-404

09/21/2022, 10:46 PM

i think that for the local_environment to grow more useful over time, the matching syntax you described (prefer_local_if_is_os, etc) would also need to be on the local_environment

agreed. I think we could probably avoid that `prefer_local_if_is_os` as scope creep until a user needs it. but we should make sure the design can accommodate it

witty-crayon-22786

09/21/2022, 10:46 PM

er: to be clear: i was saying that that is a motivation to put the definition of the match on the `local_environment`, not on the `docker_environment`

hundreds-father-404

09/21/2022, 10:47 PM

oh btw, semi related update: I was planning on no longer erroring if a `local_environment` isn't defined for your current platform. That means just use the subsystem options as a default.

Otherwise, if you define 1 local env for e.g. M1s, we'd be forcing you to define for everything else, which may not be necessary

hundreds-father-404

09/21/2022, 10:48 PM

er: to be clear: i was saying that that is a motivation to put the definition of the match on the local_environment, not on the docker_environment

ah, yeah, we'd need to replace `compatible_platforms` on `local_environment` with something more powerful

witty-crayon-22786

09/21/2022, 10:53 PM

which relates a bit to the fact that the `__local__` environment is not currently necessarily the beginning of a chain: we start by matching one via the platform

witty-crayon-22786

09/21/2022, 10:56 PM

So if you had two local Linuxes and wanted to prefer one to the other, we'd need a different resolution for `__local__`

hundreds-father-404

09/21/2022, 10:56 PM

so, with the current simplistic Platform matching, this would look like?

local_environment(
    compatible_platforms=["linux_x86_64"],
    fallback_if_not_valid="centos6",
)

python_tests(environment="linux_or_centos6_docker")

hundreds-father-404

09/21/2022, 10:57 PM

So if you had two local Linuxes and wanted to prefer one to the other, we'd need a different resolution for `__local__`

What does that mean?

witty-crayon-22786

09/21/2022, 10:57 PM

(or maybe we get rid of `__local__`, and it's always an explicit name?)

hundreds-father-404

09/21/2022, 10:58 PM

(or maybe we get rid of `__local__`, and it's always an explicit name?)

I think we need `__local__` to handle the user story "macOS vs Linux Python interpreter config": https://docs.google.com/document/d/1vXRHWK7ZjAIp2BxWLwRYm1QOKDeXx02ONQWvXDloxkg/edit#heading=h.3mi2qi3bz335

witty-crayon-22786

09/21/2022, 10:58 PM

> So if you had two local Linuxes and wanted to prefer one to the other, we’d need a different resolution for `__local__`

What does that mean?

if you define a `local_environment` that matches centos6, and another `local_environment` for if not centos6, then currently things would explode

witty-crayon-22786

09/21/2022, 10:58 PM

I think we need `__local__` to handle the user story "macOS vs Linux Python interpreter config"

that could be handled by chaining from an explicit first node

hundreds-father-404

09/21/2022, 10:59 PM

Okay I'm having a hard time following this tbh. It might make more sense to me tomorrow morning, but would you mind maybe sketching it out with arrows, like `local Linux x86 -> Docker centos6`?

witty-crayon-22786

09/21/2022, 11:01 PM

strawsyntax but:

local_environment(name="default", platform="macOS", fallback_to="linux", ..)
local_environment(name="linux", platform="linux", ..)

…and then your default environment would be `default`

👍 1

witty-crayon-22786

09/21/2022, 11:01 PM

It might make more sense to me tomorrow morning, but would you mind maybe sketching it out with arrows, like `local Linux x86 -> Docker centos6`?

local_environment(name="default", platform="linux", more_specifically="centos6", fallback_to="any_linux")
local_environment(name="any_linux", platform="linux")
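
The chaining idea can be sketched as a walk over named nodes, starting from an explicit first node and following `fallback_to` until one matches. Everything here (`LocalEnv`, `resolve_chain`, the `is_match` predicate, and the cycle check) is a hypothetical illustration at the same strawsyntax level, not an existing API:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class LocalEnv:
    """Hypothetical `local_environment` with an optional outbound edge."""

    name: str
    is_match: Callable[[str], bool]  # receives the current platform string
    fallback_to: Optional[str] = None


def resolve_chain(envs: Dict[str, LocalEnv], start: str, platform: str) -> str:
    """Follow `fallback_to` edges from `start` until an environment matches."""
    seen = set()
    current: Optional[str] = start
    while current is not None:
        if current in seen:
            raise ValueError(f"fallback cycle detected at `{current}`")
        seen.add(current)
        env = envs[current]
        if env.is_match(platform):
            return env.name
        current = env.fallback_to
    raise LookupError(f"no environment reachable from `{start}` matches {platform}")


envs = {
    "default": LocalEnv("default", lambda p: p.startswith("macos"), fallback_to="linux"),
    "linux": LocalEnv("linux", lambda p: p.startswith("linux")),
}
```

Here `resolve_chain(envs, "default", ...)` plays the role of the default environment: each starting node can lead down a different list, as noted earlier in the thread.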

hundreds-father-404

09/21/2022, 11:02 PM

I think that design would violate this, which imo is a requirement. Chris had great intuition that adopting environments should be easy, not require changing a ton of things at once https://pantsbuild.slack.com/archives/C0D7TNJHL/p1663800423438789?thread_ts=1663790578.352079&cid=C0D7TNJHL

witty-crayon-22786

09/21/2022, 11:04 PM

between adopting environments being easy, and not being able to model valid use cases, i think we need to bias toward being able to model valid use cases. if we have another design for achieving https://pantsbuild.slack.com/archives/C0D7TNJHL/p1663801318408959?thread_ts=1663790578.352079&cid=C0D7TNJHL then ok
