<@UBV8BCQPJ>, <@U6YPB4SJX>: interested in your inp...
# development
w
@red-balloon-89377, @average-vr-56795: interested in your input on https://docs.google.com/document/d/1Hn73YlhTPROlULTMa_3A-Fdv7hAPiHFII8rvXri5l7E/edit#bookmark=id.swk5himqweo4 if you have time to give it tomorrow
@hundreds-breakfast-49010: i'll review 30mins to an hour, sorry ... meetings.
including enough context there for them or for @early-needle-54791 to come up to speed on it would be helpful though
h
no worries šŸ™‚
yeah I'll add some clarifying context
w
i'll look tonight.
you should have edit access now, so can reload and switch to Edit mode if you want
i added a biiiig comment expanding on the "environment variable" approach to portability... i think it might work, but let me know what you think.
h
I'm looking at everyone's comments now
@witty-crayon-22786 I think your idea about a new "exclusivity" param to
ExecuteProcessRequest
is reasonable
although I'm not sure we gain anything out of allowing a process to run locally but in a tmpdir, as opposed to running locally in the workspace
as far as running user binaries in the workspace - would this involve materializing the pex to
dist
, and then running it from there, just as if we had run
python binary
and then invoked
$ ./dist/<binary_name>
?
w
although I'm not sure we gain anything out of allowing a process to run locally but in a tmpdir, as opposed to running locally in the workspace
almost all things you will execute will have "n or more" files that they need somewhere.
pexes are self contained... others won't necessarily be. JVM bundles, for example.
as far as running user binaries in the workspace - would this involve materializing the pex to
dist
, and then running it from there, just as if we had run
python binary
and then invoked
$ ./dist/<binary_name>
?
um, for some designs, yea. but not for the "exclusivity" approach.
the idea behind the tmp dir is that you don't actually put the process inputs from the Snapshot in the workspace.
h
right now the pex we make lives inside the
digest
on a
Pex
object, that's the change I got merged at the beginning of the week
so no matter what, if we want to run that pex we have to materialize that
digest
somewhere
the current code for ExecuteProcessRequest is underlyingly materializing the contents of a digest into a tmpdir, and then running the arguments listed in the
argv
parameter, and this is managed by the v2 engine in ways I haven't dug into deeply
so if we want to have the semantics that
pants run
runs a file in the workspace, we need to either materialize the pex there (or in general whatever the executable unit is for any language), or we need to materialize it into a tmpdir and then change the working directory, right?
python
subprocess
gives us the tools to do this, so if we wanted to we could bypass the engine
@red-balloon-89377ā€™s concern was that this will block and it will be awkward to pass control of the tty from rust to python
but I think those are solveable problems, and blocking is desirable in the case of a user program
ah, your suggestion was that we materialize the build artifacts to a tmpdir, but then give the full path to that tmpdir in the argv, so that the process's current directory is still the workspace
python subprocess definitely lets us set the cwd arbitrarily: https://docs.python.org/3.6/library/subprocess.html?highlight=subprocess#module-subprocess
w
right
i would say that "the ability to grab the console" is less important that "the ability to run in the buildroot" right now
h
it sounds like, if we decided to run this via the v2 engine with the ExecuteProcessRequest abstraction, we'd have to implement whatever python subprocess does to change the current working directory ourselves
w
~all process execution APIs let you specify the cwd
so, in effect, adding support for the "$HERE" approach to the v2 engine would mean 1) defining/using variable in a few places, 2) setting the cwd to something other than the tmpdir when you invoke the process, 3) adding the option to EPR
h
yeah that makes sense to me
do you know offhand where in the v2 engine the actual process invocation happens?
I assume this uses some rust std lib subprocess API that in principle lets us set the cwd, and we're just not doing that yet
w
... and sorry, i missed the second half of:
so if we want to have the semantics that
pants run
runs a file in the workspace, we need to either materialize the pex there (or in general whatever the executable unit is for any language), or we need to materialize it into a tmpdir and then change the working directory, right?
that's correct.
intentionally, yes. this part of what we mean by "hermetic". it's supposed to be challenging by default to depend on things that you haven't declared deps on.
src/rust/engine/process_execution/src/local.rs
h
sure, and that's maybe another argument for using a type other than
ExecuteProcessRequest
for this
because just adding a
cwd
option to EPR would make this available to any user of EPR, in principle
the other hting I"m not sure about is how to implement: "Cause any concurrent @rule execution to be serialized - or synchronize on exclusive Console access (including anything using Logger)."
if we just have a
yield Get(ExecuteProcessResult, ExecuteProcessRequest(exclusivity=CONSOLE_AND_WORKSPACE))
(or how to avoid caching this)
w
uh
so the suggestion wasn't to add
cwd
. it was to add an option like
exclusivity
about caching: ... yes. that does actually related to John's point on the Workspace ticket.
h
so if exclusivity is the hook that affects the cwd in
<http://local.rs|local.rs>
, that's probably not a problem then, since it doesn't give rule-writers the ability to change the cwd arbitrarily
w
right
h
the comment about the 'cacheable' bit?
w
i think that you could punt on touching caching in the first PR?
yea.
h
yeah in the workspace PR I punted on touching caching
in general, if we have a type like
Workspace
that is only available in a
@console_rule
, then we can guarantee that any functionality that type provides for you isn't cached
(I know that we don't acutally enforce this yet)
it makes sense to me that if we have some functionality we don't want to ever cache - and running a user's code is this sort of functionality - then we can always implement it as a method called on a type that's only available in
@console_rule
and that would imply abandoning the modified-EPR strategy and going with the
LocalRunner
strategy I suggested in the doc
this is maybe a broader conversation about exactly what guarantees we do and do not want rules to be able to encode for us
w
it might. but i'm hesitant about giving up the ability to effectively "toggle" running something in the foreground. i think that's super powerful.
for the debug case.
for example: how would you refactor our currently
pytest
rules to allow for debugging in a repl?
with foreground/exclusivity: bam.
but yea. all conversations about caching and sideeffects should be very considered, heh šŸ˜ƒ
h
I'm also under a bit of deadline pressure for the v2 version of this rule šŸ™‚
if we went with foreground/exclusivity on EPR, then we'd probably need to add the rule graph check @enough-analyst-54434 was talking bout
I dunno how hard that is to implement
w
understood. as mentioned above, you could punt on considering caching in the first iteration.
h
legit, I'll probably punt and make an issue to come back to that question hten
@happy-kitchen-89482 brought up the idea that maybe the right behavior is for pants to just exit on run, which I think we could do with
os.execvp
although there are maybe complications around Console-handing if we go that route
w
@happy-kitchen-89482: See the discussion around
foreground
here and in the doc
For a few different kinds of processes (mostly tests), you want to be able to toggle running in the foreground