https://pantsbuild.org/ logo
h

hundreds-breakfast-49010

09/23/2020, 12:17 AM
even with the above PR, if you run
./pants repl
, then hit ctrl-c, some kind of python process will remain around and screwing up the terminal
specifically it's
/home/gregs/code/pants/build-support/virtualenvs/Linux/pants_dev_deps.py38.venv/bin/python3 /home/gregs/code/pants/.pants.d/tmp0rx6pokk/requirements.pex
, which is I think the actual process that runs when pants creates a python repl. I'm not sure why this is getting killed when the pytest process is getting killed (with the above commit)
anyway like I said earlier I am getting rather frustrated with dealing with the signal handling code between the no-pantsd and pantsd cases, and I would like to do what I can to get rid of that code. I don't think https://github.com/pantsbuild/pants/issues/10061 is necessarily related to this
that is to say, if I understand correctly, porting the nailgun client to rust is orthogonal to the problem of replacing signal communication with the nailgun protocol heartbeat system that https://github.com/pantsbuild/pants/pull/10320 mentions
👍 1
w

witty-crayon-22786

09/23/2020, 12:31 AM
correct
sorry, i think i said i was going to sketch out the heartbeat stuff a bit better on the ticket.
will follow up to do that.
h

hundreds-breakfast-49010

09/23/2020, 12:33 AM
awesome
❤️ 1
yeah I think implementing that heartbeat system would do a lot to make reasoning about how signaling happens easier
❤️ 1
h

hundreds-father-404

09/23/2020, 12:45 AM
That sounds like a great investment. Thanks for all your improvements on this, Greg. And to @aloof-angle-91616 for the original implementation taking us so far! This is tricky stuff
w

witty-crayon-22786

09/23/2020, 12:46 AM
maybe. the way i would suggest thinking about it is: a signal arriving at the server currently sets a signal to say that any ongoing work should exit… in the future, a heartbeat arriving at the server would set a per-session signal that everything in that session should exit
signals are definitely fidgety and limited though
a

aloof-angle-91616

09/23/2020, 12:47 AM
i definitely think that sending a SIGKILL to the PGRP of the pants process (so, negative of its own PID) right before it exits would solve some of these issues with leftover processes without a lot of engineering effort
👀 1
w

witty-crayon-22786

09/23/2020, 12:52 AM
h

hundreds-breakfast-49010

09/23/2020, 12:53 AM
@aloof-angle-91616 at some point today I tried sending signals to the PGRP and it had unexpected bad behavior and I forget exactly what that behavior was
w

witty-crayon-22786

09/23/2020, 12:53 AM
… iff the right behavior is always to kill the whole tree, which i think it might not be in the case of
run
and
repl
h

hundreds-breakfast-49010

09/23/2020, 12:53 AM
so I abandoned that plan
a

aloof-angle-91616

09/23/2020, 12:53 AM
oh cool i’ll try it out i haven’t yet
h

hundreds-breakfast-49010

09/23/2020, 12:54 AM
I think what might've happened is that sending a signal to the PGRP kills everything including the pantsd process itself, and that meant that there was cleanup that was supposed to happen but didn't
👍 1
a

aloof-angle-91616

09/23/2020, 12:55 AM
oh yes it would have to be at the very very end
h

hundreds-breakfast-49010

09/23/2020, 12:55 AM
but I didn't investigate it much, I just moved on and used the
children
thing
psutil
gives you
a

aloof-angle-91616

09/23/2020, 12:56 AM
the pgrp seems less prone to race conditions is all
w

witty-crayon-22786

09/23/2020, 12:58 AM
it sounds like the challenge is that you don’t want to kill pantsd in the repl case. that’s going to be challenging even/especially with the heartbeat strategy.
👍 1
a

aloof-angle-91616

09/23/2020, 12:59 AM
yes that would be much more difficult
h

hundreds-breakfast-49010

09/23/2020, 12:59 AM
I think we're killing pantsd more than we need to. ideally you should be able to kill the client and have pantsd continue to run without a problem, right?
w

witty-crayon-22786

09/23/2020, 1:00 AM
that’s going to be challenging even/especially with the heartbeat strategy.
to clarify: it’s easy to not kill pantsd with the heartbeat strategy. but it’s challenging for the server to tear down processes cleanly
@hundreds-breakfast-49010: no
that is one of the key reasons for switching to heartbeats.
a

aloof-angle-91616

09/23/2020, 1:00 AM
i will review the heartbeat strategy
w

witty-crayon-22786

09/23/2020, 1:01 AM
@hundreds-breakfast-49010: to be clear, with heartbeats
pantsd
should stay up, but the run associated with the client should definitely die
let me dump the explanation right now.
👍 2
a

aloof-angle-91616

09/23/2020, 1:04 AM
it was perhaps easier to conceptualize when the daemon forked to form the pantsd-runner
w

witty-crayon-22786

09/23/2020, 1:06 AM
yea.
i mean, it’s still easy to conceptualize (i think) if you’re ok with killing everything, which was the strategy until recently
a

aloof-angle-91616

09/23/2020, 1:08 AM
yes which would be the only justification for the PGRP-upon-death
w

witty-crayon-22786

09/23/2020, 1:12 AM
ok, posted the sketch of the heartbeat stuff on https://github.com/pantsbuild/pants/issues/10057, but typing that up actually made me think of something else, so maybe hold off on even reading it:
@hundreds-breakfast-49010: posting that reminded me of how we handled cancelation of nodes, which was to add a drop guard. it feels like having a drop guard for the InteractiveProcess that kills the child might help… or perhaps only be necessary in the future to implement the heartbeat strategy
👍 2
h

hundreds-breakfast-49010

09/23/2020, 1:15 AM
I'll keep that strategy in mind
w

witty-crayon-22786

09/23/2020, 1:15 AM
i guess that IP is not currently async though… would have to make it async to allow for interrupting via drop
a

aloof-angle-91616

09/23/2020, 1:16 AM
that would be pretty neat
left a comment on possible extensions to the heartbeat strategy by reading the raw tty and/or opening up a pty if the simpler version still has issues with
InteractiveProcess
and
^C
. i hope we don't have to use either of them https://github.com/pantsbuild/pants/issues/10057#issuecomment-698845410