I've been looking at cases where ctrl-c still does...
# development
h
I've been looking at cases where ctrl-c still doesn't promptly exit pants, and one of the cases where I've seen that is with pantsd tests
specifically pantsd tests run with the
--debug
flag to foreground them; e.g.
./pants test --debug tests/python/pants_test/pantsd/pantsd_integration_test.py
it's very easy for a ctrl-c to quit this test but leave the test instance of pantsd running in the background and still trying to output things to the terminal
this might be fixable by changing how our own pantsd test harness works; but we also might expect pants users to set up tests that test daemonized processes in a similar way to what we're doing
a
i have been saying we need to work on the pantsd test harness for a while and i tried very hard and did not get anywhere over a year ago and i'd love to try again
h
we could definitely do work on the pantsd test harness; but also this seems like a case that pants should be able to handle
a
yes agreed
h
I'm actually seeing now that even in non-pantsd tests,
--debug
seems to cause ctrl-c to not actually quit tests processes, which I think might be because when the main instance of pantsd that is running tests exits, it doesn't necessarily kill the tests
a
ah, that's interesting
that's kind of weird. i thought you had to do something special to make sure child processes don't also die when the parent dies?
h
I'm honestly not sure
a
i get the problem now though thanks
w
the nailgun heartbeating stuff that i added a while back is supposed to be a step in the direction of the server canceling runs when the client goes away
nails
now gives a guarantee that the client input only stays open as long as the client is connected and sending heartbeats. but we don’t do anything with that information in pants’ nailgun server yet.
@hundreds-breakfast-49010: if you’re interested in taking that on, i can add more details to https://github.com/pantsbuild/pants/issues/10057
a
i also might have time this weekend
h
yeah ultimately what I'm trying to do right now is get https://github.com/pantsbuild/pants/issues/10057 into a state where we can consider it closed
w
cool beans. i’ll add the details, and you two can decide who works on what when
h
so if that requires implementing the heartbeat stuff, then it makes sense to work on it
anyway what I'm seeing locally is that pantsd seems to be successfully killed, but then there's still a few
python3.6
processes around
/usr/bin/python3.6 ./test_runner.pex --no-header src/python/pants/engine/internals/scheduler_integration_test.py
and
/usr/bin/python3.6 -m pants --no-pantsrc --pants-workdir=/tmp/.tmpjyLQzy/.pants.d/tmp/tmp6g359434.pants.d --print-exception-stacktrace=True --pantsd --pants-config-files=[] --backend-packages=['pants.backend.python', 'internal_plugins.rules_for_testing'] list-and-die-for-testing examples/src/python/example/hello/greet
(this might vary with the specific test being run)
so it does seem like a process spawned via the
InteractiveProcess
mechanism is sticking around on the system even though
pantsd
itself has quit
w
would try to repro that in a smaller case. that’s not what i was thinking of above.
h
yeah I think this specific ctrl-c-related issue might be unrelated
w
ie, would see if running a “sleep forever” process using
./pants run
, and then ctrl+c’ing it kills the sleep forever.
👍 1
"Children don't automatically die when the parent process is killed"
a
wow
thank you
that is very surprising for some reason
h
I'm not sure if this is a claim about how python popen works or if this is a general unix processs thing I was unaware of
a
yeah exactly
h
but yeah I also thought that by default processes were tied to their parent, and you had to do something special to make that not be the case
a
i do know that if you provide the PGRP instead of the PID that it kills the process tree
and i think the PGRP is just negative of the PID
that's weird i am about to post a twitter diff but i will see if that's true in the general case
h
so a solution to this problem seems like having the pants signal handlers send a signal to any subprocesses spawned by
InteractiveProcess
(and we'd need to keep track of this somewhere)
I don't know if this is the best solution
a
well to be clear this issue is happening like when pantsd dies right? could we have something happen at pantsd teardown like once it's 100% complete and about to exit, sending SIGKILL to its PGRP?
that would make it exit with an error code but idk if anything is looking at its return code
h
hm, maybe
yeah thsi only happens when pantsd is running
./pants --no-pantsd test --debug src/python/pants/engine/internals/engine_test.py
for instance doesn't leave any test process running
a
it would mean we wouldn't have to keep track of anything else to fix the issue is all. seems like there are a lot of places where we have subprocesses (like watchman or the lmdb gc service)
h
and I'm not sure why that's the case
a
me neither
h
if
--debug
is passed, then the subprocess is executed via the rust
Command
mechanism in the engine
oh wait, actually, this might be just the currently-running pytest process capturing the ctrl-c itself
which is what's supposed to happen
👍 1
because I get a different stack trace then
👌 1
a
ok the main difference with pantsd vs non-pantsd is that ^C is recieved by the
RemotePantsRunner
so i would look to see if we're forwarding the signal correctly there
good analysis
h
if I run more than one test at a time, I think other processes might still stay alive
yeah
if I do
./pants --no-pantsd test --debug src/python/pants/engine/internals::
then if I hit ctrl-c, whatever process si currently running will die, but then the next one takes over
until I've killed every test that runs
a
is that the behavior we want?
h
and now that I think about it I think this was one of the use cases @polite-garden-50641 identified as buggy in a ticket, let me see if I can dig that up
@aloof-angle-91616 I'm not sure. I think we imagined that we'd use
--debug
to debug one test at a time. but nothing right now stops anyone from passing it with more than one test target
curious if you have any thoughts on what the right behavior here should be @witty-crayon-22786
or @hundreds-father-404
h
--debug
is interesting. With
repl
and
run
, we know we’ll only have one interactive process Until recently, it was the same with
--debug
, but we recently loosened the constraints so that you could run as many tests as you want
w
@hundreds-breakfast-49010: you changed this just recently in https://github.com/pantsbuild/pants/pull/10758, right?
h
right, that commit was about making pantsd match the behavior of no-pantsd. the right thing was happening in the no-pantsd case - at least if only one target was being run interactively
but if we have more than one target being run interactively, like inthe
--debug
case, I'm not sure what the best thing to do is
w
both the parent and the child should be killed, probably
whether that requires killing the pgroup, or forwarding the signal, unclear.
h
so to make that work, I think we would need to have pants go back to capturing the ctrl-c,never letting it get to the interactive process, and then having pants kill the entire process group
h
Counterpoint is that with
repl
, I like the non-pantsd behavior that it results in a
KeyboardInterrupt
being printed from the Repl, which is what you’d get without Pants too. Pants is a light wrapper (The pantsd behavior is different than non-pantsd for
repl
)
h
and I think these were the considerations we had in mind a while ago when we decided to add the feature that
InteractiveProcess
doesn't let pants itself handle ctrl-c (which was months ago)
h
I’m not sure what makes sense here. Generally, I bias towards “behave how it would behave without Pants” But that’s not a great guide with `--debug`; usually you’d use one single process for all your tests
h
but at that time, we weren't running pantsd by default
and the bug I fixedin https://github.com/pantsbuild/pants/pull/10758 was basically pantsd doing something different from pants with no daemon
does anyone run the
pytest
binary directly to run tests in a python repo?
I mean typically I would expect that people have some kind of harness around running that directly, whether that's pants or a shell script or Makefile or something
h
does anyone run the pytest binary directly to run tests in a python repo?
Yes, for sure. This is the default flow. Or a light shell script that runs it, yeah But that script is only running Pytest in one subprocess call, usually
a
when we use
--debug
, are we piping out in our rust code to stdout and stderr? or are we letting it inherit the parent file descriptors? i'm not sure if that's related to why ^C gets sent to the child instead of pants
i will look ^_^
h
oh hm
this code lives in
<http://interface.rs|interface.rs>
the rust std lib
Command
api has some faculties for doing things to the std* file descriptors that I don't think we're using?
a
i bet it's stdin
a
thanks
h
"Inherit stdin/stdout/stderr for
spawn
or
status
, but create pipes for
output"
so we call
.spawn
later on in the file
but anyway we did originally want Ctrl-C to be sent to an interactive process (and only an interactive process -that's one of the things that makes interactive processes different)
I wonder if maybe the right solution is actually to require that
--debug
only be used with one target
the point of
--debug
is so that you can run a test interactively, so you can debug it
a
well but i would want to be able to
./pants test --debug ::
though maybe in a small repo
h
actually right now if you do that, does each test run serially?
a
will check
h
it looks like it might be?
a
blah bootstrapping. will check the code
h
in a small repo you might actually want the behavior where each test runs serially, and you do have to hit ctrl-c several times to quit each test individually
but that gets old real quick in a repo the size ofpants'
a
Copy code
for debug_request in debug_requests:
            if debug_request.process is None:
                continue
            debug_result = interactive_runner.run(debug_request.process)
            if debug_result.exit_code != 0:
                exit_code = debug_result.exit_code
so yes they do run serially for
--debug
it looks like we could add an option to the test subsystem like
--debug --fail-fast
which would avoid the multiple ^C (and default it to False maybe)
h
or maybe that should even be the default behavior. but either way, that's saying that the way pants handles a ctrl-c with interactive processes is right, and it's specifically the behavior of the
test
goal that should change to accomodate it
I think that's reasonable though
a
yes i think so
it's nice that we don't have to specially handle ^C here in our rust code and can just let it forward
h
so actually, doing the right thing in this loop, might require the
debug_result
to have a way to distinguish an actual test failure from a ctrl-c
and I'm not sure if there's a way to do that at the
pytest
level
(and of course we shouldn't write code that assumes
pytest
- other test runners exist)
a
i feel like the
--fail-fast
behavior is a pretty reasonable flag to have
i would be surprised that getting a normal test failure doesn't stop it
h
Yes, it was important to change the restriction that
--debug
only applies to one target. This was crucial when we made the change to the model of targets, where now Pants always runs one-file-at-a-time (unless we batch). Without the change,
./pants test path/to:target
would complain if the target had >1 file
i feel like the --fail-fast behavior is a pretty reasonable flag to have
Possibly. We thought about that when loosening
--debug
, but didn’t add because you’re not expected to use
--debug
for more than when you need a debugger. So we didn’t want to optimize for it;
test
already has a ton of options
a
"has a ton of options" is confusing to me because i remember the v1 pants options page starts with "...has literally hundreds of configurable options". are we concerned that adding this option will introduce behavior that would be hard to move elsewhere? or does
./pants help
become unwieldy for so many options on one subsystem?
h
or does ./pants help become unwieldy for so many options on one subsystem?
Options fatigue. It gets overwhelming when you have too much choice. It also (often) makes it harder for plugin authors to implement things the more options we add. For example, adding
--test-force
was worth it, but it means that plugin authors now need to be careful to respect the flag
a
where does the slippery slope end? what process do we use to decide whether this is occurring?
h
Like everything most things, it’s a judgment call, based on asking for feedback from other contributors, listening to Pants users, code review, etc For example,
test --force
was added based on a user request. The other day, I thought “let’s add a `force_reruns`” field to
python_tests()
, and Stu pointed out it was a bit premature, that we don’t know how users would need something like that. So we didn’t land the change.
a
i don't think the behavior of having to press ^C for every test that fails is good and i think greg said above that he would probably think it should be the default behavior. could you provide an opinion on that?
h
I’m not sure what the right behavior should be. I could see a couple scenarios, including: - “This one test is taking forever. I want to skip it and go on to the next one. <ctrl-c> Oh…I didn’t mean to quit everything.” - “Ah, I didn’t mean to run 100 tests!!! Exit!” I’m really not sure. And possibly an option is indeed appropriate; I didn’t mean to imply that “no way should we do an option”. More, “that might make sense. Idrk. We should generally think really critically before adding more options, but it could be the right decision here.” I apologize that it might have seemed like I was trying to shut down your suggestion.
👍 1
a
i can make an issue
💯 1
your response above makes it clear you were not trying to shut down my suggestion
❤️ 1
i did not know how to phrase my first response other than slippery slope and i recognize that that's often used in a much harsher way
h
The place I do think I have a preference for is
repl
, that I really like when you’re not using Pantsd that we behave just like you would without Pants. That is, ctrl-c results in “KeyboardInterrupt” being printed to the REPL, but not exiting That would imply “We should pass the signal through to the InteractiveProcess and let it handle it”. But, then, that might be really unergonomic for
test --debug
🤔 and I don’t think we want lots of special casing in the Rust code
a
does InteractiveProcess consume Process yet? i can perhaps see an argument for adding a field there, which is maybe special casing but it should at least be within an
if
branch predicated on the field
h
and I don’t think we want lots of special casing in the Rust code
Well, possibly, it should be an option on
InteractiveProcess
like
capture_signals
? I’m a little skeptical for similar reasons of “option fatigue”, but this time applied to rule authors. The more parameters we add, the higher our surface area for bugs. But none of these options seem perfect. So, could be worth considering
☝️ 1
✍️ 1
coke
a
+1
and also if i understand you correctly,
--debug
is especially useful to attach a debugger (and to have stdin and a tty attached, basically): could that be its own goal? feels not too hard because v2 rules are so easy to mix and match
h
Thank you for offering to write an issue. I think that’d help It might be helpful to write out the requirements that we do know, followed by the requirements we do not know Some reqs we do know: *
pantsd
and
--no-pantsd
should be consistent *
repl
and
run
should likely not capture signals; the underlying process should handle them. Note that there’s only one InteractiveProcess in the whole run for those
👖 2
a
this is fantastic
❤️ 1
h
and also if i understand you correctly, --debug is especially useful to attach a debugger (and to have stdin and a tty attached, basically):
Alex considered that when adding the option a long time ago. Same with Tansy when adding coverage, to have a
coverage
goal. I think a couple people weighed in and all agreed that the option was nice to a) reduce the number of goals that users need to learn, and b) keep things more organized for plugin authors. For example, when you implement
test
, you must implement
test --debug
right now; it would be confusing if you could do one but not the other
some unknown reqs: * What should
test --debug
do? If we let the underlying process handle signals, you would have to press
ctrl-c
for every single test, given that there may be multiple tests in a run. Might an option be appropriate, or options fatigue and we should choose?
If we do have an option, I think that the argument on
InteractiveProcess
would be helpful. We would do something like
capture_signals=test_subsystem.exit_early
a
For example, when you implement test, you must implement test --debug right now;
something's bugging me, i feel like there's a way to handle this nicely, thinking about it
We would do something like capture_signals=test_subsystem.exit_early
capture_signals
is very interesting. i think that the shell terminal captures it and sends the signal, but i know for example if you run
emacs -nw
and press
C-c
(that's ctrl-c), it obviously captures it normally. that's a curses app, so i don't know at what level that translation occurs, or if we can avoid it being converted into a signal and then handle keyboard codes for the tty or whatever it's called
👀 1
h
something’s bugging me, i feel like there’s a way to handle this nicely, thinking about it
See https://github.com/pantsbuild/example-plugin/blob/main/pants-plugins/examples/bash/shunit2_test_runner.py for how a plugin author implements both.
🙏 1
a
probably not super important because it works as is i think but not sure
h
I wanted to finish https://www.pantsbuild.org/docs/plugins-test-goal today too, but went into a Pex and MyPy vortex
👀 1
a
will use the bathroom then i'll review the PR you just posted
h
thanks for your suggestions on this topic
my original goal in digging into this was to be able to close https://github.com/pantsbuild/pants/issues/10057 . it sounds like what we actually have at hand is a design decision about how specifically
InteractiveProcess
ought to handle signals
- “Ah, I didn’t mean to run 100 tests!!! Exit!” was basically my own reaction when I was first looking into the
test --debug
case, possibly becuase I was operating under the mindset that this was a bug in signal handling, rather than expect, if (possibly) undesirable, behavior