I’m looking at `RunRequest` and related infrastructure at th… (Pants #development)

ancient-vegetable-10556

02/01/2022, 6:49 PM

I’m looking at `RunRequest` and related infrastructure at the moment. Is there a particular blocker around refactoring it to use a `Process` object rather than specifying raw args? Currently constructing a valid JVM process is complicated, so I’d love to make use of the `JvmProcess` infrastructure I used previously. (pinging @witty-crayon-22786 in particular since he’s seen the `JvmProcess` work thus far)

witty-crayon-22786

02/01/2022, 6:56 PM

not … that i’m aware of…?

witty-crayon-22786

02/01/2022, 6:57 PM

but note that rather than a `Process`, it will likely need to be an `InteractiveProcess`

👍 1

witty-crayon-22786

02/01/2022, 6:57 PM

which has slightly different constraints.

ancient-vegetable-10556

02/01/2022, 6:57 PM

You can construct an `InteractiveProcess` from a `Process`, most of the time

ancient-vegetable-10556

02/01/2022, 6:57 PM

that’s certainly what we’re doing over in `junit.py`

witty-crayon-22786

02/01/2022, 6:58 PM

sure. just suggesting that it should probably be the `@rule` authors constructing the `InteractiveProcess`, to push considering those constraints to them

ancient-vegetable-10556

02/01/2022, 6:59 PM

OK, that makes sense

ancient-vegetable-10556

02/01/2022, 9:52 PM

@witty-crayon-22786 How much do we care about `run` executing things in a chroot workspace? Currently our JVM bootstrapper needs to `ln` the `JAVA_HOME` directory, which is done inline with running the relevant executable, but the chroot approach would probably involve some sort of two-step that we don’t currently have a good model for

witty-crayon-22786

02/01/2022, 9:58 PM

mm.

ancient-vegetable-10556

02/01/2022, 9:59 PM

I can think about the two-step (currently I’m working on using `RunRequest`, but `Process` doesn’t have a good model for this either)

witty-crayon-22786

02/01/2022, 9:59 PM

um, the `ln` was mostly supposed to support providing a stable location for the JDK to be referenced from within a command. if there is another way to accomplish that, then a chroot might not be necessary.

witty-crayon-22786

02/01/2022, 10:00 PM

but… i’m not sure that you can actually avoid a chroot…? we have to materialize all of the other inputs to the process too. so there will be a temporary workdir

ancient-vegetable-10556

02/01/2022, 10:01 PM

OK. I’m thinking that we might need to add a “preparation args” to `RunRequest` so that we can run the preparation code

ancient-vegetable-10556

02/01/2022, 10:01 PM

so we run that and output the results into a digest of some description

witty-crayon-22786

02/01/2022, 10:01 PM

i’m not sure that the preparation code is actually any different from the other code being run. the big difference with `run` is mostly that the CWD may not be equal to your temporary directory

witty-crayon-22786

02/01/2022, 10:02 PM

i.e.: all the temporary stuff is in $dir1, and my cwd is $dir2
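The `$dir1` / `$dir2` split described above can be illustrated with a minimal sketch. It uses only the Python standard library and is not Pants code: the function name `run_from_other_cwd` and the script name `print_cwd.py` are made up for illustration. Inputs are materialized in one temporary directory while the process runs with a different working directory, so the input has to be referenced by absolute path.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_from_other_cwd() -> tuple:
    """Materialize an input in dir1, run it with cwd=dir2, and return (reported cwd, dir2)."""
    with tempfile.TemporaryDirectory() as dir1, tempfile.TemporaryDirectory() as dir2:
        # dir1 plays the role of the temporary directory holding the process inputs.
        script = Path(dir1) / "print_cwd.py"
        script.write_text("import os\nprint(os.getcwd())\n")

        # dir2 plays the role of the user's working directory. The script is passed
        # by absolute path, because a relative path would resolve against dir2.
        result = subprocess.run(
            [sys.executable, str(script)],
            cwd=dir2,
            capture_output=True,
            text=True,
            check=True,
        )
        # Resolve symlinks so the comparison also holds where the temp root is a symlink.
        return os.path.realpath(result.stdout.strip()), os.path.realpath(dir2)


if __name__ == "__main__":
    reported, expected = run_from_other_cwd()
    print(reported == expected)
```

The child process sees `dir2` as its working directory even though everything it needs lives in `dir1`, which is why relative paths in the command line stop working.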

ancient-vegetable-10556

02/01/2022, 10:02 PM

Where does the `java_home` come from then?

ancient-vegetable-10556

02/01/2022, 10:02 PM

Because we have to link it from something

witty-crayon-22786

02/01/2022, 10:03 PM

where the symlink would go is the same: it’s still inside the tempdir. the challenge is probably that we don’t have a relative path to the tempdir, and thus to the location of the symlink

ancient-vegetable-10556

02/01/2022, 10:04 PM

that’s assuming we run before chrooting, right?

witty-crayon-22786

02/01/2022, 10:04 PM

so… basically, it affects all use of relative paths in process startup, i think.

☝️ 1

witty-crayon-22786

02/01/2022, 10:04 PM

i think that java_home might just be the first instance of this.

ancient-vegetable-10556

02/01/2022, 10:05 PM

I think we can get a relative location for the tempdir, it’s just doing the preparation before the chroot gets invoked that is the complication

witty-crayon-22786

02/01/2022, 10:09 PM

yea. it’s interesting that none of python, go, or docker needed this

witty-crayon-22786

02/01/2022, 10:09 PM

(but maybe not surprising… because we end up using self-contained binaries for them)

witty-crayon-22786

02/01/2022, 10:09 PM

@ancient-vegetable-10556: could we craft a shell/python script to be the actual executable for `run`

witty-crayon-22786

02/01/2022, 10:10 PM

and then that script could prepare the arguments…?

witty-crayon-22786

02/01/2022, 10:11 PM

i suppose that the issue is that if a `Process` has been written using relative paths, absolutizing them is challenging without its support

ancient-vegetable-10556

02/01/2022, 11:37 PM

The executable for run would be a shell script, but we’d still need to yoink the JVM into the chroot hierarchy before entering the chroot

ancient-vegetable-10556

02/01/2022, 11:37 PM

unless I’m seriously misunderstanding something

witty-crayon-22786

02/01/2022, 11:38 PM

https://pantsbuild.slack.com/archives/C0D7TNJHL/p1643753057028229?thread_ts=1643741357.281769&cid=C0D7TNJHL

witty-crayon-22786

02/01/2022, 11:39 PM

yes, the location of JAVA_HOME is affected. but so are all other relative paths, unfortunately. i think that basically everything needs to be absolutized, but how isn’t clear.

ancient-vegetable-10556

02/01/2022, 11:40 PM

It’s a relative path to something that lives somewhere completely outside of the hierarchy, in this case, right?

witty-crayon-22786

02/01/2022, 11:40 PM

…ah. that’s why: https://github.com/pantsbuild/pants/blob/31c72ce6133f525ad8853ecd7f142fbc6fb76ea9/src/python/pants/core/goals/run.py#L52-L55

witty-crayon-22786

02/01/2022, 11:42 PM

so basically, to work as a `RunRequest`, a `Process` needs all relative paths absolutized by the template variable… `java_home/bin` becomes `{chroot}/java_home/bin`, etc.
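A minimal sketch of that templating idea, assuming a two-step flow: relative paths are first prefixed with the `{chroot}` placeholder, and the placeholder is substituted once the temporary directory is known. The helper names `template_path` and `render_args` are hypothetical and are not the Pants implementation.

```python
import posixpath
from typing import Iterable, List

CHROOT = "{chroot}"  # placeholder string, as used in the discussion above


def template_path(path: str) -> str:
    """Prefix a relative path with the placeholder; leave absolute paths untouched."""
    if posixpath.isabs(path):
        return path
    return f"{CHROOT}/{path}"


def render_args(args: Iterable[str], chroot_dir: str) -> List[str]:
    """Substitute the placeholder once the temporary directory is known."""
    return [arg.replace(CHROOT, chroot_dir) for arg in args]


if __name__ == "__main__":
    # "Main" is a plain argument, not a path, so it is deliberately left untemplated.
    args = [template_path("java_home/bin/java"), "-cp", template_path("classes"), "Main"]
    print(args)
    print(render_args(args, "/tmp/example-sandbox"))
```

Deciding which arguments are paths is left to the caller here, which matches the point made above that absolutizing a process's relative paths is hard without its cooperation.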

ancient-vegetable-10556

02/01/2022, 11:44 PM

Right, but at the moment, we get the java home by asking Coursier where that java home is

witty-crayon-22786

02/01/2022, 11:44 PM

@ancient-vegetable-10556: that doesn’t need to change. all that needs to change is that the location that we symlink it to needs to be made absolute via the `{chroot}` template variable

witty-crayon-22786

02/01/2022, 11:44 PM

(afaict)

witty-crayon-22786

02/01/2022, 11:45 PM

can take a look at `src/python/pants/backend/python/goals/run_pex_binary.py` for comparison… it basically prepends `{chroot}` to everything relevant

ancient-vegetable-10556

02/01/2022, 11:45 PM

Right, but we have to run that before we enter the chroot, right? Because the jdk isn’t inside the chrooted directory structure. Or are we linking in the relevant external binaries some other way here?

witty-crayon-22786

02/01/2022, 11:46 PM

@ancient-vegetable-10556: no

witty-crayon-22786

02/01/2022, 11:46 PM

`coursier java-home` can run anywhere, and will emit an absolute path to a JDK

ancient-vegetable-10556

02/01/2022, 11:46 PM

Does that jdk need to exist on the system already?

witty-crayon-22786

02/01/2022, 11:47 PM

no: the call will fetch it if it needs to, into a cache directory

ancient-vegetable-10556

02/01/2022, 11:47 PM

great

witty-crayon-22786

02/01/2022, 11:48 PM

i started symlinking it from the absolute location to a location inside the sandbox in order to avoid needing templating in most JDK commands… they could just expect a symlink to exist in the sandbox already
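The symlink approach can be sketched roughly as follows, assuming a POSIX system. The helper `link_jdk_into_sandbox`, the link name `java_home`, and the fake JDK layout are illustrative assumptions. In the real flow the absolute JDK path would come from `coursier java-home`.

```python
import os
import tempfile


def link_jdk_into_sandbox(jdk_home: str, sandbox: str, link_name: str = "java_home") -> str:
    """Symlink an absolute JDK location to a stable relative name inside the sandbox."""
    link = os.path.join(sandbox, link_name)
    os.symlink(jdk_home, link, target_is_directory=True)
    return link


def demo() -> bool:
    with tempfile.TemporaryDirectory() as cache, tempfile.TemporaryDirectory() as sandbox:
        # Fake JDK layout standing in for the absolute path a JDK resolver would report.
        jdk_home = os.path.join(cache, "jdk")
        os.makedirs(os.path.join(jdk_home, "bin"))
        open(os.path.join(jdk_home, "bin", "java"), "w").close()

        link = link_jdk_into_sandbox(jdk_home, sandbox)
        # Commands inside the sandbox can now refer to java_home/bin/java.
        return os.path.isfile(os.path.join(link, "bin", "java"))


if __name__ == "__main__":
    print(demo())
```

The JDK itself stays outside the sandbox. Only the link lives inside it, so commands can use a fixed relative name as long as their working directory is the sandbox.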

ancient-vegetable-10556

02/02/2022, 12:25 AM

that all works in theory. It looks like `RunRequest` will make a thing that can run, but it downloads a JDK every time, so when I come back to this, I’ll need to figure out why that’s the case.

witty-crayon-22786

02/02/2022, 1:03 AM

that is likely because the named_caches aren’t symlinked properly, OR the env vars aren’t set to use them

ancient-vegetable-10556

02/02/2022, 4:00 PM

oh yeah, that’s definitely the case, I did the bare minimum to get it to run without rewriting `RunRequest` before I go back and make it efficient

ancient-vegetable-10556

02/02/2022, 5:40 PM

@witty-crayon-22786 So I’m getting this error as I start to wire up the jdk caches:

ValueError: InteractiveProcess requested setup of append-only caches and also requested to run in the workspace. These options are incompatible since setting up append-only caches would modify the workspace.

Would it make sense to pre-populate a JDK into the right place and dump that into the workspace? Or should we be trying to make the cache mechanism work more generally?
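For readers unfamiliar with the check behind that error, here is a rough sketch of the kind of validation that would produce it. `SketchInteractiveProcess` is a hypothetical stand-in, not the real Pants class: only the field names `run_in_workspace` and `append_only_caches` and the message text are taken from this thread, and the cache name and path in the demo are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class SketchInteractiveProcess:
    """Hypothetical stand-in used only to illustrate the incompatibility check."""

    argv: Tuple[str, ...]
    run_in_workspace: bool = False
    append_only_caches: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Setting up caches means creating symlinks in the process's directory,
        # which would modify the workspace if the process runs there.
        if self.run_in_workspace and self.append_only_caches:
            raise ValueError(
                "InteractiveProcess requested setup of append-only caches and also "
                "requested to run in the workspace. These options are incompatible "
                "since setting up append-only caches would modify the workspace."
            )


if __name__ == "__main__":
    try:
        SketchInteractiveProcess(
            argv=("java", "Main"),
            run_in_workspace=True,
            append_only_caches={"example_cache": ".cache/example"},  # made-up values
        )
    except ValueError as e:
        print(f"rejected: {e}")
```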

witty-crayon-22786

02/02/2022, 5:43 PM

…hm. other folks (Benjy, Tom, Eric) worked with this more recently, and there are a few dimensions. but i do think that we might need to adjust the assumption there…

witty-crayon-22786

02/02/2022, 5:43 PM

the InteractiveProcess isn’t aware of the temporary directory that the run goal is creating.

witty-crayon-22786

02/02/2022, 5:43 PM

(i think…?)

ancient-vegetable-10556

02/02/2022, 5:44 PM

I’m not sure what “aware of” means in this situation? on my machine at least, I’m able to `cd` over to the temporary directory and do things there

witty-crayon-22786

02/02/2022, 5:44 PM

https://github.com/pantsbuild/pants/blob/331352e4e788cd967be0c22074a450696b0cb6f5/src/python/pants/core/goals/run.py#L136-L160

witty-crayon-22786

02/02/2022, 5:45 PM

basically, the run goal is creating a temporary directory somewhere, and then running the InteractiveProcess somewhere else

ancient-vegetable-10556

02/02/2022, 5:45 PM

I am, in fact, working in that part of code

witty-crayon-22786

02/02/2022, 5:45 PM

right. so what i mean by “aware of” is just that the run goal is creating a temporary directory, but the InteractiveProcess machinery doesn’t know about it.

witty-crayon-22786

02/02/2022, 5:46 PM

the cache symlinks (and any other setup) would need to be created in that temporary directory: not in the workspace

witty-crayon-22786

02/02/2022, 5:47 PM

it sort of seems like all of the `{chroot}` templating and temporary directory creation could move inside of InteractiveProcess running?

witty-crayon-22786

02/02/2022, 5:48 PM

because the current location is a bit of a hack

witty-crayon-22786

02/02/2022, 5:48 PM

@ancient-vegetable-10556: does that make sense?

ancient-vegetable-10556

02/02/2022, 5:49 PM

Potentially. I’ve actually avoided the `{chroot}` templating at all, by running

cd `dirname $0`

in the root `InteractiveProcess`

witty-crayon-22786

02/02/2022, 5:49 PM

mm. that will break usage i think.

witty-crayon-22786

02/02/2022, 5:50 PM

the idea behind running in the workspace is that something like:

./pants run $target -- a/relative/arg for/my/process

…should work

witty-crayon-22786

02/02/2022, 5:50 PM

people will expect that the process is running in the current directory.

ancient-vegetable-10556

02/02/2022, 5:50 PM

makes sense

ancient-vegetable-10556

02/02/2022, 5:51 PM

I’m really trying to get this working piecemeal because the `run` machinery is delicate and I don’t yet understand it 🙂

witty-crayon-22786

02/02/2022, 5:52 PM

yea. it took a while to page in, but it’s becoming clearer to me i think.

witty-crayon-22786

02/02/2022, 5:54 PM

1. the `{chroot}` templating is necessary if relative paths will be used, afaict
2. the `{chroot}` templating and temporary directory creation moving into `InteractiveProcess` (maybe as prework?) would likely clarify all of this a lot

ancient-vegetable-10556

02/02/2022, 5:54 PM

fwiw, it’s not entirely clear to me what’s actually chrooted here (I think I can see the full filesystem inside the process?)

witty-crayon-22786

02/02/2022, 5:55 PM

“chroot” is a misnomer here. it’s just the temporary directory containing all of the inputs

witty-crayon-22786

02/02/2022, 5:55 PM

even moreso because the actual cwd of the process is something else when `run_in_workspace=True`

ancient-vegetable-10556

02/02/2022, 5:55 PM

ok, that makes sense

witty-crayon-22786

02/02/2022, 5:56 PM

InteractiveProcess is implemented here: https://github.com/pantsbuild/pants/blob/331352e4e788cd967be0c22074a450696b0cb6f5/src/rust/engine/src/intrinsics.rs#L549-L736

ancient-vegetable-10556

02/02/2022, 5:56 PM

(do we actually generally do chrooting in other cases?)

witty-crayon-22786

02/02/2022, 5:56 PM

every process we run runs in a temporary directory with a limited env… but we’re not using any os-level features to prevent breaking out.

witty-crayon-22786

02/02/2022, 10:36 PM

@ancient-vegetable-10556: regarding the “setup the sandbox” portion of interactive process (linked above): if you look at it, it bears a lot of similarity to https://github.com/pantsbuild/pants/blob/331352e4e788cd967be0c22074a450696b0cb6f5/src/rust/engine/process_execution/src/local.rs#L605-L626, which is what is used to prepare the sandbox for a non-interactive process

witty-crayon-22786

02/02/2022, 10:37 PM

if we wanted to support `immutable_inputs`, `append_only_caches`, etc on `InteractiveProcess`, then adjusting the interactive process runner code to use `prepare_workdir` would go a long way

ancient-vegetable-10556

02/03/2022, 5:00 PM

OK. Maybe I can find some time to pair with you on this later on? I’ll adjust my focus to getting `run` to work more generally (Java first, then Scala)

witty-crayon-22786

02/03/2022, 6:38 PM

works for me
