I'm trying to write some test code to ensure that ...
# general
s
I'm trying to write some test code to ensure that we do proper cleanup on a SIGTERM. I have a simple script that calls the offending code which I convert into a pex. I then Popen the pex in a test and assert some output occurs in stderr that runs during the cleanup. I'm using
runtime_package_dependencies
in my
python_tests
and this is working well. If I run
./pants test ::
it works locally. However it fails in CI with the error message
AssertionError: assert 'Received SIGTERM, shutting down\n' in ''
. What's strange is that this was happening locally until it just didn't. Who knew that if you tried the same thing 5x times in a row it'd eventually start working... There's similar logic for SIGINTs and if I run the pex directly I can confirm that the logs are indeed being written (I've checked stdout and stderr in the test and they're both empty strings). Any idea where the output is going?
I can ssh into my CI environment and run the pex there and confirm the logs are writing to stdout/stderr
p
This sounds like a race condition in your test code. I have seen similar sorts of subprocess-related errors that only show up in CI, especially when the CI server is busy.
s
I think you're correct
e
What version of Python?
s
3.10.8 but the pytest tool might be different - let me double check - ok also 3.10
e
Ok. STDERR buffering was fixed in 3.10 according to my comment there some time ago.
I cannot remember my research. I should've linked my source in the comment.
p
are you using
Popen.communicate()
to get stderr?
s
I am
I have code that looks like
Copy code
os.kill(process.pid, signal.SIGTERM)
time.sleep(1)

if process.poll() is None:
    pytest.fail("Process did not exit on SIGTERM")
which should probably be enough
but I added after*
Copy code
_, stderr = process.communicate()
stderr = stderr.decode("utf-8")
assert "Received SIGTERM, shutting down\n" in stderr
so I don't fool myself Edit - Looks like I can actually
Copy code
_, stderr = process.communicate()
instead of sleeping as it waits for process to terminate.
When stderr isn't
''
it's something like
'Ignoring the following environment variables in Pex venv mode:\n_ _PEX_EXTRA_SYS_PATH__=.:archipelago/src:workflows/src\n\n'
If I remove
execution_mode="venv"
from my
pex_binary
I see `''`s locally. There are a few tests and there are other stderrs like
Traceback (most recent call last):\n  File "/Users/yjabri/.pex/installed_wheels/55a395cdfd3f3dd3f649131d41f97c17b4ed8....py3-none-any.whl/pyasn1/type/univ.py", line 2201, in __init__\n    def __init__(self, **kwargs):\nKeyboardInterrupt\n
If I ``--keep-sandboxes=on_failure`` when it fails, whats odd is that going to the sandbox and running
./__run.sh
works...
e
So - re-reading the OP. A PEX file, when it is run the 1st time, no matter if its venv mode or not, unpacks itself and then re-execs itself. I would think a signal could get lost across a re-exec.
So, on that theory, you'd want to 1st run the pex noop to get it to unpack itself and settle. Then test on the 2nd invocation.
Additionally, if a venv PEX and the python used to run the PEX != to the venv python, it will re-exec every time (costs ~50ms).
One way to work around these re-execs would be to wait to read some info that lets you know the binary is in your code, and then send the signal.
Not sure if there is natural output like that to read.
s
so if I understand my BUILD looks like
Copy code
python_sources(
    name="tests0",
)

pex_binary(
    name="foobar",
    dependencies=[
        "workflows/tests/dummy_cron_listener.py:tests0",
        "workflows/tests/dummy_http_listener.py:tests0",
    ]
)

python_tests(
    dependencies=[
        "workflows/tests/dummy_cron_listener.py:tests0",
        "workflows/tests/dummy_http_listener.py:tests0",
    ],
    runtime_package_dependencies=[
        ":foobar"
    ]
)
and my test looks like
Copy code
subprocess.Popen(["workflows.tests/foobar.pex", "-c", ""])

    process = subprocess.Popen(
        ["workflows.tests/foobar.pex", "workflows/tests/dummy_cron_listener.py"],
        stderr=subprocess.PIPE,
        # env=env,
    )
    time.sleep(1)

... rest of the test
but I still get
AssertionError: assert 'Received SIGTERM, shutting down\n' in ''
e
Ok. That was my ~only idea. When I have to do timing dances in tests its generally with something like a proxy server or other server that emits a pid file and I can use that as the rendezvous point to know the thing is up.
But another thing you can try is to have the signal handling code read an env var and in tests tell it to output to a file instead of stderr.
You then use that file as proof.
Lots of quick things to try if you don't rathole on stderr.
s
Heh, I just need to uncomment
@pytest.mark.skip("Unable to get this to work in CI")
This is maybe kind of dumb but could I
Copy code
experimental_shell_command(
    name="hack",
    command="PEX_TOOLS=1 python workflows.tests/foobar.pex venv --compile ./tmp/foobar",
    tools=[],
    dependencies=[":foobar"],
    outputs=["./tmp/", "foobar"],
)
and then use
./tmp/foobar/bin/python
?
I'm using 2.15 but I can't really get the above to work
in my
python_test
I have
Copy code
runtime_package_dependencies=[
        ":hack"
    ]
but I'm grasping at straws
Going deeper down the rabbit hole
Copy code
experimental_shell_command(
    name="hack",
    command="PEX_TOOLS=1 python3 ../../workflows.tests/foobar.pex venv --compile ./tmp/foobar",
    tools=["bash", "python3"],
    dependencies=[":foobar"],
    outputs=["./tmp/"],
)

python_tests(
    dependencies=[
        ":hack",
        "workflows/tests/dummy_cron_listener.py:tests0",
        "workflows/tests/dummy_http_listener.py:tests0",
    ],
    runtime_package_dependencies=[
        ":foobar"
    ]
)
but
Error expanding output globs: Failed to read link "/private/var/folders/m8/cmsvrd417n39w5xjzjhmnznm0000gq/T/pants-sandbox-Z779e0/workflows/tests/tmp/foobar/bin/python3.10": Absolute symlink: "/opt/homebrew/opt/python@3.10/bin/python3.10"