Q21: If I don't set `--no-run-cleanup`, the finali...
# general
r
Q21: If I don't set `--no-run-cleanup`, the finalization process seems to interfere with my own cleanup procedures, crashing Python. Can I have my program handle signals transparently?
Internally, my program uses a custom async fork implementation combined with one asyncio event loop per process, plus custom signal block/unblock mechanisms.
I think there may be a race between monitoring the (nested) process exit and the pants-side cleanup, as both handle the interrupt signal at the same time.
I think the subprocess spawn subsystem should have an option to setpgrp
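For context, the effect of a `setpgrp` option can be sketched in plain Python. This is only an illustration of the POSIX mechanism, not how Pants spawns processes; `spawn_in_own_group` is a hypothetical helper name, and the sleeping child is a stand-in for the daemon.

```python
import os
import subprocess
import sys


def spawn_in_own_group(argv):
    """Hypothetical helper: start argv as the leader of a new process group."""
    # os.setpgrp() runs in the child between fork() and exec(), so the child
    # has left the parent's process group before its program starts.
    return subprocess.Popen(argv, preexec_fn=os.setpgrp)


# Stand-in child that just sleeps; a real daemon would go here.
child = spawn_in_own_group([sys.executable, "-c", "import time; time.sleep(30)"])

child_pgid = os.getpgid(child.pid)       # group the child now belongs to
same_group = child_pgid == os.getpgrp()  # compare with the parent's group

child.terminate()
child.wait()
```

A signal sent to the parent's process group would then not reach the child, which is left to run its own handlers. `preexec_fn` is not safe in multi-threaded parents; on Python 3.11+ `process_group=0` is an alternative.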
w
which version of pants is this? signal handling moves from SIGKILL to SIGINT in 2.11.x
but note: `--no-run-cleanup` is intended as a debugging tool, rather than something you should run with in general. what is the usecase here?
r
i'm using the master branch of pants, to build on Linux aarch64
i'm trying to run a service daemon using `./pants run` where the daemon is a multi-process asyncio Python program with custom signal handlers. It works well as a systemd service and a standalone foreground app.
Also, the program's startup logs are stripped (about the first ~80 lines are not displayed) when launched via `./pants run`, so I'd like to know if it has more options to control stdout/stderr of the child program.
I've tried removing the minus sign from the pgid passed to `kill()` in `src/rust/engine/process_execution/src/children.rs` so that it would deliver the signal only to the process group leader (as my program expects this), and increasing the graceful shutdown timeout from the hard-coded 3 seconds to 10 seconds, but had no success.
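The difference the minus sign makes is standard POSIX `kill()` behaviour and can be sketched in Python, independent of the Rust engine code. Everything below is a made-up stand-in: `DAEMON_SRC` plays the multi-process daemon (a leader that forks one worker), and `signalled_processes` is a hypothetical helper that reports which processes saw SIGINT.

```python
import os
import signal
import subprocess
import sys

# Stand-in for a multi-process daemon: a leader that forks one worker.
# Each process reports on stdout when it receives SIGINT, then exits.
DAEMON_SRC = r"""
import os, signal, time

pid = os.fork()
name = b"worker" if pid == 0 else b"leader"

def on_sigint(signum, frame):
    os.write(1, name + b" got SIGINT\n")
    os._exit(0)

signal.signal(signal.SIGINT, on_sigint)
os.write(1, b"ready\n")
time.sleep(30)
"""


def signalled_processes(target_whole_group):
    """Send SIGINT to the whole group or only its leader; return who reported it."""
    proc = subprocess.Popen(
        [sys.executable, "-c", DAEMON_SRC],
        stdout=subprocess.PIPE,
        preexec_fn=os.setpgrp,  # the leader's pid becomes the group's pgid
    )
    # Wait until both processes have installed their handlers.
    for _ in range(2):
        proc.stdout.readline()
    pgid = proc.pid
    # kill(-pgid, sig) addresses every process in the group;
    # kill(pgid, sig) addresses only the process whose pid equals pgid.
    os.kill(-pgid if target_whole_group else pgid, signal.SIGINT)
    if not target_whole_group:
        proc.wait()                      # the leader exits from its handler
        os.killpg(pgid, signal.SIGKILL)  # stop the worker, which saw no SIGINT
    reports = proc.stdout.read().decode().splitlines()  # read until EOF
    proc.wait()
    proc.stdout.close()
    return sorted(reports)


whole_group = signalled_processes(True)
leader_only = signalled_processes(False)
print(whole_group, leader_only)
```

With the negative pgid both the leader and the worker should report the signal; with the positive value only the leader should, which leaves shutting down the workers to the daemon's own logic.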
maybe i'm looking at the wrong location
(rebuilding the rust engine takes more than 3 minutes, so it is hard to fast-iterate over changing some rust codes...)
hmmm, it seems that it catches the begin/exit points properly, but still I don't get why it misses some first logs and crashes upon termination..
the main process' log seems to get truncated
while manually calling `print()` works fine
w
(rebuilding the rust engine takes more than 3 minutes, so it is hard to fast-iterate over changing some rust codes...)
it builds by default in `release` mode, which takes a lot longer, but is faster to run. if you don’t care about the runtime performance, you can use `debug` mode by setting `MODE=debug`: see https://www.pantsbuild.org/docs/contributions-rust#common-commands for more info
as to the crash on exit, if you’re able to reduce the repro at all and file a ticket, that would be very appreciated
Also, the program’s startup logs are stripped (about the first ~80 lines are not displayed) when launched via `./pants run`, so I’d like to know if it has more options to control stdout/stderr of the child program.
this sounds like a race condition in how we acquire access to stdout, but it’s fairly surprising, because that should all be synchronous. if you can file a report about this one too, that would be helpful
r
it may be... due to sending SIGINT to an empty set of processes (found with `-ldebug`)
hmm the related code in `src/python/pants/base/exception_sink.py` looks sane
according to the "extension modules" log message, the error occurred in a Python process on the pants side
(i think the module is `pants.base.exception_sink` as it imports that exact list of extension modules)
gdb.... shows a long stack trace
https://github.com/PyO3/pyo3/issues/1274 maybe related with pyo3...?
the related code lines in `src/rust/engine/src/nodes.rs` are:
i have no idea about this code....
w
thanks a lot for investigating! will take a look at that one later today.
r
Since I'm in UTC+9, I will come back after the night!