Only outstanding TODO on the Python `run` PR (Pants #development)


bitter-ability-32190

06/17/2022, 4:18 PM

Only outstanding TODO on the Python `run` PR is how to switch between sandbox modes. The friendliest solution IMO is a new `run` flag, however this flag is only relevant when `run`ning a "scripting" language ("compiled" languages wouldn't benefit). So pros/cons. The alternative is a field on `python_source` and `python_test` targets, which is a bit funkier and has no obvious CLI support (we could add a "default" CLI option to `python`, but now we're talking even more machinery). Thoughts y'all?

👀 1

hundreds-father-404

06/17/2022, 4:19 PM

cc @sparse-lifeguard-95737 (we were just DMing about the importance of the sandbox mode option for them)

✅ 1

bitter-ability-32190

06/17/2022, 4:20 PM

I'm pretty sure the default value will (eventually) be the "run in repo" value, as it follows the element of least surprise. But again, sad to have a flag that is irrelevant for a `docker_image` or `go_binary`. 😞

bitter-ability-32190

06/17/2022, 4:22 PM

Another thing to consider is we might want a field as well? Some scripts might not work well in-repo? The likelihood of this should be rare though, IMO and can usually be made to work in-repo with a bit of tweaking

sparse-lifeguard-95737

06/17/2022, 4:24 PM

As a user it would make sense to me if “run” on a pex target was always sandboxed and “run” on a python file was always in-repo

👆 1

sparse-lifeguard-95737

06/17/2022, 4:25 PM

But I have put much less thought into it than you all 😄 I just want to stop explaining why the Django management CLI doesn't Just Work in pants to my wider eng team lol

🙌 1

✅ 2

bitter-ability-32190

06/17/2022, 4:26 PM

FWIW `run` on a PEX target will be sandboxed, but that really won't matter anymore in the long run, as we'll be running the built PEX. We'll still set `CWD` to repo root as well

bitter-ability-32190

06/17/2022, 4:27 PM

FWIW A naive person's opinion matters significantly. There are few people who understand the nuance, and so Pants should do the obvious thing, as that's what the majority will experience 😉

➕ 1

💯 1

witty-crayon-22786

06/17/2022, 4:39 PM

I’m pretty sure the default value will (eventually) be the “run in repo” value, as it follows the element of least surprise.

because it is not sandboxed, i don’t think that it will be a good default.

bitter-ability-32190

06/17/2022, 4:41 PM

@witty-crayon-22786 you'll have to be more specific, because to me, I'd prefer very strongly we don't have this "gotcha" for our users to stumble on

witty-crayon-22786

06/17/2022, 4:41 PM

when we need to make tradeoffs between correctness and ergonomics, i think that we need to continue to bias toward correctness.

witty-crayon-22786

06/17/2022, 4:41 PM

@bitter-ability-32190: the gotcha when it comes to correctness is “hm, why is `--loop`/`--changed` broken?”

bitter-ability-32190

06/17/2022, 4:42 PM

Can you ELI5 why `--loop`/`--changed` would be broken?

witty-crayon-22786

06/17/2022, 4:42 PM

because the entire pythonpath entry is added, rather than only the detected files/dependencies

witty-crayon-22786

06/17/2022, 4:43 PM

and only the detected files will cause the run to be invalidated

bitter-ability-32190

06/17/2022, 4:44 PM

ELI3? 🙂 I still don't follow. Also how often are people using `run` standalone vs `run` with those flags? I'm guessing at least an order of magnitude in difference

witty-crayon-22786

06/17/2022, 4:46 PM

the “run without sandbox” flag adds `PYTHONPATH=some/dir:…`… inside of `some/dir`, inference has detected that you depend on `some/dir/a.py`, but not on `some/dir/b.py`. when `some/dir/b.py` changes, `--loop`/`--changed` will not detect your script, because the file wasn’t a dep.

witty-crayon-22786

06/17/2022, 4:47 PM

basically: sandboxing works at the file level… but running without the sandbox works at the directory level.
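A minimal sketch of the file-level vs. directory-level mismatch described above. This is not Pants code: `restart_needed` and `importable_unsandboxed` are hypothetical helpers, and `some/dir/script.py` is an assumed name for the script being run. The point is only that a file can be importable through a `PYTHONPATH` entry without being in the inferred dependency set that drives invalidation.

```python
from pathlib import PurePosixPath


def restart_needed(changed_file: str, inferred_deps: set[str]) -> bool:
    """File-level view: only a change to an inferred dependency invalidates the run."""
    return changed_file in inferred_deps


def importable_unsandboxed(changed_file: str, pythonpath_entries: set[str]) -> bool:
    """Directory-level view: anything under a PYTHONPATH entry is reachable at runtime."""
    parents = PurePosixPath(changed_file).parents
    return any(PurePosixPath(entry) in parents for entry in pythonpath_entries)


# Assumption: inference found script.py and a.py, but missed b.py.
inferred_deps = {"some/dir/script.py", "some/dir/a.py"}
pythonpath_entries = {"some/dir"}

for changed in ("some/dir/a.py", "some/dir/b.py"):
    print(
        changed,
        "restart:", restart_needed(changed, inferred_deps),
        "importable:", importable_unsandboxed(changed, pythonpath_entries),
    )
```

For `some/dir/b.py` the two views disagree: the file is reachable when running in-repo, yet a change to it would not count as a change to the script.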

bitter-ability-32190

06/17/2022, 4:48 PM

If I don't depend on `b.py` and I change it, why would I expect the run to be restarted?

bitter-ability-32190

06/17/2022, 4:49 PM

Also, it really seems we're making this right for the 2% chance people are using those flags, and wrong for the 98% chance they aren't 😅

witty-crayon-22786

06/17/2022, 4:50 PM

the reason dependency inference is safe is because sandboxing is enforcing that if we get dependency inference wrong and fail to detect that you need something, the run fails

witty-crayon-22786

06/17/2022, 4:51 PM

so, yes: inference would need to fail to detect `b.py` as a dep as well. perhaps easier to imagine as a resource which needed to be explicitly declared

witty-crayon-22786

06/17/2022, 4:52 PM

and wrong for the 98% chance they aren’t

`--changed` is one of the two recommended modes of CI, so it’s more common than that.

bitter-ability-32190

06/17/2022, 4:53 PM

Ah excellent, yes, let's talk about resources! This is a huge win for us for our scripts to not HAVE to declare resources for those! Why declare the dependency for dependency's sake? Why would we want the user to have their `run` fail complaining about a missing file when we could make it succeed?

bitter-ability-32190

06/17/2022, 4:53 PM

`--changed` is one of the two recommended modes of CI, so it’s more common than that.

`--changed` with goals like `fmt` and `lint`? sure. Goals like `run`? Seems sketchy.

👍 1

hundreds-father-404

06/17/2022, 4:56 PM

I don't know what the right tradeoff here is. But `run` is a very weird goal that already violates several of our hermeticity expectations. For example, if you use a filesystem API like `open()`, you don't need to declare `file`/`resource` targets for those deps. We already do "bad" things

➕ 1

bitter-ability-32190

06/17/2022, 4:56 PM

A golden example is Django's `migrate.py` script. It isn't used in prod, it isn't a part of a test. Let's allow users to fall into the Pit of Success.

witty-crayon-22786

06/17/2022, 4:56 PM

`--changed-.. --filter=… list | xargs -L1 ./pants run` is easier to imagine: “run changed deployment scripts”

hundreds-father-404

06/17/2022, 4:56 PM

`--changed-.. --filter=… list | xargs -L1 ./pants run` is easier to imagine: “run changed deployment scripts”

We error if you don't give exactly 1 argument to `run`

➕ 1

witty-crayon-22786

06/17/2022, 4:57 PM

that’s what the -L1 does

hundreds-father-404

06/17/2022, 4:57 PM

and ignore the rest? When this came up last week w/ a user, I recommended that they save the output of `--changed-since list` and then loop over it calling `./pants run` multiple times
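The "save the list, then loop" approach could be sketched roughly as follows. The helper names (`parse_addresses`, `run_commands`, `changed_targets`, `run_each`) are made up for illustration, and `HEAD` as the default Git ref is an assumption; only `./pants --changed-since=<ref> list` and `./pants run <target>` come from the discussion.

```python
import subprocess


def parse_addresses(list_output: str) -> list[str]:
    """Split `./pants list` output into one target address per non-empty line."""
    return [line.strip() for line in list_output.splitlines() if line.strip()]


def run_commands(addresses: list[str]) -> list[list[str]]:
    """Build one `./pants run` invocation per address, since `run` takes exactly one."""
    return [["./pants", "run", address] for address in addresses]


def changed_targets(since: str = "HEAD") -> list[str]:
    """Ask Pants which targets changed since a Git ref (assumed default: HEAD)."""
    result = subprocess.run(
        ["./pants", f"--changed-since={since}", "list"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_addresses(result.stdout)


def run_each(addresses: list[str]) -> None:
    """Run the targets sequentially, stopping at the first failure."""
    for command in run_commands(addresses):
        subprocess.run(command, check=True)
```

Calling `run_each(changed_targets())` from the repo root would behave much like the `xargs -L1` pipeline: one `./pants run` per changed target, one at a time.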

bitter-ability-32190

06/17/2022, 4:58 PM

Also I fail to see how `--changed` with `list` is part of a conversation about `run`'s sandbox behavior 😐

witty-crayon-22786

06/17/2022, 4:58 PM

anyway. this whole thing was off topic for this thread i think. i just think that there should be more discussion before making an unsandboxed mode the default.

bitter-ability-32190

06/17/2022, 4:59 PM

True, OP was about "should we make this a `run` flag"

witty-crayon-22786

06/17/2022, 4:59 PM

and ignore the rest?

no, it runs them sequentially, one at a time.

Also I fail to see how `--changed` with `list` is part of a conversation about `run`’s sandbox behavior

because you won’t detect that it needs to `run`… it’s the same dependencies list.

bitter-ability-32190

06/17/2022, 5:01 PM

because you won’t detect that it needs to `run`

I'm really not following this 😅 You're just providing an arg to run and we run it.

hundreds-father-404

06/17/2022, 5:03 PM

"undeclared dependencies" will mean that

`--changed-dependees` and `--loop` do not work properly. Even if the script executes, our understanding of your repo is incomplete, and we might not show something has been changed properly. My point is that this is already the case with the `run` goal. It's already an impure goal. Unclear to me what to decide here, though

bitter-ability-32190

06/17/2022, 5:07 PM

Ah I see. Yeah I don't think we should be enforcing `run` purity for `run`'s sake. If you want metadata purity, `test` your code. "We don't run your code like normal because we might not have gotten all your dependencies right, and in the event you really rely on those dependencies being right, but aren't validating it through tests, we want to ensure you know it by failing your `run`" doesn't quite strike me as something ergonomic.

bitter-ability-32190

06/17/2022, 5:09 PM

And it still isn't true if you don't use source roots and use namespace packages (like I do) because the CWD ends up in `sys.path` anyways so we still "find" the in-repo file 🤷‍♂️

hundreds-father-404

06/17/2022, 5:11 PM

If you want metadata purity, test your code.

Or, your unowned-dependencies mechanism! I generally agree that `run` is the wrong medium to be enforcing deps are ~~correct~~ exhaustive. We already don't do that. We should document that though as a warning

✅ 1

bitter-ability-32190

06/17/2022, 5:12 PM

We should document that though as a warning

Yeah. Both options have thorns which need thorough documentation. IMO one option has fewer thorns 😛

--- Back on topic though. `run` flag or field? 🙂

bitter-ability-32190

06/20/2022, 12:47 AM

(And back off topic, just hit another datapoint for sandbox-by-default) After adding `--debug-adapter` support, it will only work if we run the user's code in-repo. More frustratingly, nothing will crash and no errors or warnings will appear (unless we implement them) after connecting the client. It just silently runs the code in the sandbox to completion (because of the path mismatch).

sparse-lifeguard-95737

06/21/2022, 2:25 AM

Checking in on this - since `2.13.0a0` was released without reverting the `run_in_sandbox` field on the `pex_binary` target, does that mean it will definitely be present in the 2.13.0 release? or there’s still a chance it’ll end up reverted?

bitter-ability-32190

06/21/2022, 10:18 AM

I think we should probably decide that. @happy-kitchen-89482? Or we could rename it to experimental so we can drop it without deprecating?

bitter-ability-32190

06/21/2022, 10:20 AM

Dan, you could do what I did in the meantime and add an in-repo plugin with this support

happy-kitchen-89482

06/21/2022, 2:57 PM

I should mention that running unsandboxed should not be very commonly necessary. Using this to escape having to declare deps properly could be misuse. The legitimate use is the Django makemigrations case - Django uses `__path__` magic to determine a write location based on the location of a loaded module. But this is not super common.
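To illustrate why a write location derived from a loaded module lands in the wrong place under sandboxing, here is a rough sketch. It is not Django's actual code: `migrations_dir` and the `demo_app` package are hypothetical, and a plain `shutil.copytree` stands in for a sandbox that copies sources out of the repo.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path


def migrations_dir(package_name: str) -> Path:
    """Derive a write location from where the package was loaded (hypothetical helper)."""
    package = importlib.import_module(package_name)
    return Path(list(package.__path__)[0]) / "migrations"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    repo = root / "repo"
    sandbox = root / "sandbox"

    # A tiny package living "in the repo".
    (repo / "demo_app").mkdir(parents=True)
    (repo / "demo_app" / "__init__.py").write_text("")

    # Stand-in for sandboxing: sources are copied elsewhere and imported from there.
    shutil.copytree(repo, sandbox)

    sys.path.insert(0, str(sandbox))
    importlib.invalidate_caches()
    try:
        target = migrations_dir("demo_app").resolve()
    finally:
        sys.path.remove(str(sandbox))
        sys.modules.pop("demo_app", None)

    writes_into_sandbox = sandbox in target.parents
    writes_into_repo = repo in target.parents
    print("writes into sandbox:", writes_into_sandbox)
    print("writes into repo:", writes_into_repo)
```

Because the module was imported from the copy, anything written relative to its location ends up in the throwaway directory rather than the repo, which is the behavior the in-repo mode is meant to avoid.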

sparse-lifeguard-95737

06/21/2022, 2:57 PM

django is the reason why I want it

happy-kitchen-89482

06/21/2022, 2:58 PM

Right, and so `run_in_sandbox` solves that case for you

happy-kitchen-89482

06/21/2022, 2:58 PM

If we remove it, then we have to offer the alternative of always running unsandboxed

happy-kitchen-89482

06/21/2022, 2:58 PM

At first blush that seems maybe dangerous to me

happy-kitchen-89482

06/21/2022, 2:59 PM

OTOH I see the argument that we might want `./pants run path/to/file.py` to be equivalent to `python path/to/file.py`, where things ~just work

bitter-ability-32190

06/21/2022, 2:59 PM

But this is not super common.

It's very common for a lot of our scripts in-house 🙂 Even in Pants, `generate-docs` would now be writing in-repo. As is `generate_github_workflows` and `contributors.py` 😉

bitter-ability-32190

06/21/2022, 3:01 PM

OTOH I see the argument that we might want `./pants run path/to/file.py` to be equivalent to `python path/to/file.py`, where things ~just work

So that's my proposal (in a nutshell) 😉 Remember Winston? He would be wasting time trying to figure out why running the script through Pants isn't working like when he runs it through his venv. We got off topic here, let me open a new thread

sparse-lifeguard-95737

06/21/2022, 3:02 PM

to be clear, I don’t really care either way about `run_in_sandbox` on `pex_binary` vs. un-sandboxed `run` directly on a python script - I can make either one work for my purposes. I’ve had people breathing down my neck about django management CLI + pants not working as expected, and was going to point at the new `run_in_sandbox` field and say “look it’s going to work soon!” when I saw chatter here about possibly reverting the addition of that field in 2.13. I don’t mind if it’s reverted, I just want to know if it’s going to be reverted so I don’t bait-and-switch my team

➕ 1

bitter-ability-32190

06/21/2022, 3:05 PM

opened new thread: https://pantsbuild.slack.com/archives/C0D7TNJHL/p1655823896988599

bitter-ability-32190

06/21/2022, 3:06 PM

FWIW Dan, you can do the "right thing" today with an in-repo plugin if that floats your boat. I can assist. We have this in our repo at work 💪

bitter-ability-32190

06/21/2022, 3:06 PM

Then when Pants catches up upstream you can upgrade-and-remove plugin and it should "just work" (TM)

sparse-lifeguard-95737

06/21/2022, 3:07 PM

I might take you up on that depending on how the other thread goes 🙂

👍 1

happy-kitchen-89482

06/21/2022, 3:21 PM

OK, so to clarify - there will definitely be a way to achieve this. We won't remove `run_in_sandbox` until we have some alternative.

bitter-ability-32190

06/22/2022, 1:11 AM

Since this thread already got hijacked to talk about the default value: https://github.com/pantsbuild/pants/pull/15849/ is out of "draft" and has the default set to "in-repo".
