# development
e
I have a (2x so far) reproducible case of https://github.com/pantsbuild/pants/issues/15312 in scie-pants CI for Pants 2.16.0.dev1+git298409b3 (298409b3a4d7914c29a3bea5098b55698967f658). This is not using --no-pantsd which some issue work that closed this issue out as dup seems to focus on.
Seen here:
+ https://github.com/pantsbuild/scie-pants/actions/runs/3678739650/jobs/6222363684#step:7:464
+ https://github.com/pantsbuild/scie-pants/actions/runs/3679867403/jobs/6224812319#step:8:463
Looks like:
...
>> Verifying PANTS_SHA is respected
Bootstrapping Pants 2.16.0.dev1+git298409b3 using cpython 3.9.15
Installing pantsbuild.pants==2.16.0.dev1+git298409b3 into a virtual environment at /home/runner/.cache/nce/638e71475c6feb8228292d052407a5cf6b7813caf3630e80d42711d1081e8c1d/bindings/venvs/2.16.0.dev1+git298409b3
New virtual environment successfully created at /home/runner/.cache/nce/638e71475c6feb8228292d052407a5cf6b7813caf3630e80d42711d1081e8c1d/bindings/venvs/2.16.0.dev1+git298409b3.
21:30:47.58 [INFO] Initializing scheduler...
21:30:48.17 [INFO] Scheduler initialized.
2.16.0.dev1+git298409b3
Fatal Python error: PyGILState_Release: thread state 0x7f156c0010d0 must be current when releasing
Python runtime state: finalizing (tstate=0x14a06a0)

Thread 0x00007f1585b0ac00 (most recent call first):
<no Python frame>
Error: Command "dist/scie-pants-linux-x86_64" "--no-verify-config" "-V" failed with exit code: None
Error: Process completed with exit code 1.
I'm not sure this is that useful ... I do not hit the issue on my Linux box, only GH actions Ubuntu 20.04 CI runs so far.
b
I feel like I ran into something similar a while back. I'll respond when I get time if I find anything useful.
e
And it's not exercising the run goal either, just
./pants -V
Yeah, @bitter-ability-32190 I was referring to your close of that issue as dup and then further work Stu did.
Your issue didn't look exactly like #15312 - but very close. Mine appears to present identically to #15312.
I'm probably going to get lazy and just pick any other older sha than 298409b3a4d7914c29a3bea5098b55698967f658 to test PANTS_SHA with, but it appears badness with blocking is back / whac-a-mole continues. I'll record this all somewhere on the Stu issues.
w
is
pantsd
exiting? IIRC, this is related to teardown of the interpreter (the “finalizing” bit)
but … i believe that i have also seen this on macOS for the PEX building command from the release process (
./build-support/bin/release.sh build-universal-pex
), in a similar spot.
e
It is during teardown from the looks of the output I included above.
./pants -V
and it prints out the version successfully and then the CPython thread state error.
So this is nice and minimal, although only remotely reproducible so far.
And I've only seen it 2x and only on Ubuntu 20.04 in CI and only on that PANTS_SHA version. 2.14.0, 2.12.1 and 1.30.5rc1 are all stable.
w
It is during teardown from the looks of the output I included above.
./pants -V
and it prints out the version successfully and then the CPython thread state error.
with
pantsd
enabled,
./pants -V
shouldn’t actually exit
pantsd
. so unless it is the client crashing…?
e
Yeah, should be the client.
I just run a series of -V against different Pants versions
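A serial check like this could be sketched roughly as below. This is a hypothetical Python illustration, not the actual scie-pants test harness: `run_once`, `run_series`, `report` and `RunResult` are made-up names. It assumes the harness's `exit code: None` means the process was killed by a signal, which Python's `subprocess` reports as a negative return code on POSIX.

```python
import subprocess
import sys
from dataclasses import dataclass

# Text CPython writes to stderr when it aborts, as in the log above.
FATAL_MARKER = "Fatal Python error"


@dataclass
class RunResult:
    returncode: int
    fatal_error: bool       # stderr contained FATAL_MARKER
    killed_by_signal: bool  # assumption: negative return code == signal (POSIX)

    @property
    def failed(self) -> bool:
        return self.returncode != 0 or self.fatal_error


def run_once(cmd, env=None):
    """Run cmd once and classify how it ended."""
    proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return RunResult(
        returncode=proc.returncode,
        fatal_error=FATAL_MARKER in proc.stderr,
        killed_by_signal=proc.returncode < 0,
    )


def run_series(cmd, runs, env=None):
    """Run cmd serially `runs` times, keeping every result."""
    return [run_once(cmd, env=env) for _ in range(runs)]


def report(results, out=sys.stdout):
    """Print a one-line summary and return the number of failed runs."""
    failures = sum(1 for r in results if r.failed)
    print(f"{failures}/{len(results)} runs failed", file=out)
    return failures
```

For example, `report(run_series(["./pants", "-V"], runs=20))` with `PANTS_SHA` set in the environment would count how often the crash shows up, which helps with a failure seen only 2x so far.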
w
weird. but yea, that would explain why it was in the foreground.
e
This is #3 of 4 serial runs.
You're on the review that has the test harness.
w
i have no idea what the client might be doing asynchronously… but it would be a different class of race condition than the one described in https://github.com/pantsbuild/pants/issues/16105, since the client shouldn’t actually have an Executor to worry about, afaik.
…but… switching to the native client would definitely resolve it. no more Python on the client =P
e
Boo
Dodge
(e)VADE
w
“Multiple Birds with One Stone”
is how i like to think about it 😃
or “Cutting the Gordian Knot” as Benjy referred to it recently re:
scie
but … i believe that i have also seen this on macOS for the PEX building command from the release process (
./build-support/bin/release.sh build-universal-pex
), in a similar spot.
this was unrelated, (un)fortunately. but is fixed here at least: https://github.com/pantsbuild/pants/pull/17785
e
Yeah. But it's an ongoing issue in not the client. We all have 0 time. I'll try to dig deeper here this week.
w
But it’s an ongoing issue in not the client.
what do you mean by “not the client”?
e
This is still whac-a-mole over in pantsd right? That's why you have a ticket.
It's a general problem with no clean solution / guard rails yet IIUC at a high level.
w
in any case: will review the scie stuff tonight, and can look at anything you file here.
e
I agree it's not Pantsd, I thought it was an instance of a problem we do have in pantsd which your ticket tracks.
IOW, we can punt the client as you suggest since it will die some day, but pantsd will not so investment in me tracking down what why how for this should be non-throwaway effort.
w
but pantsd will not so investment in me tracking down what why how for this should be non-throwaway effort
maybe. i think that the
pantsd
side of things is relatively well understood: we absolutely spawn tasks into the background there which run python code, and so there is a teardown question. how that might be happening on the client side is more mysterious to me
but yea, i don’t know what we don’t know.
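The teardown question can be illustrated in pure Python, as an analogy only: the sketch below does not model how pantsd actually spawns or tears down its tasks, and `BackgroundTasks` is a hypothetical name. It shows the general idea of stopping and joining background work before exit, so that no Python code is still running on those threads once the interpreter starts finalizing.

```python
import threading
import time


class BackgroundTasks:
    """Hypothetical helper: runs callables on background threads until shut down."""

    def __init__(self):
        self._stop = threading.Event()
        self._threads = []
        self._lock = threading.Lock()
        self.completed = 0  # number of finished calls across all threads

    def spawn(self, fn, interval=0.01):
        def runner():
            # Keep calling fn until shutdown() is requested.
            while not self._stop.is_set():
                fn()
                with self._lock:
                    self.completed += 1
                self._stop.wait(interval)

        # daemon=True mirrors work that would otherwise be abandoned at exit.
        thread = threading.Thread(target=runner, daemon=True)
        thread.start()
        self._threads.append(thread)

    def shutdown(self, timeout=1.0):
        """Ask all threads to stop and wait for them; True if all exited."""
        self._stop.set()
        for thread in self._threads:
            thread.join(timeout)
        return all(not thread.is_alive() for thread in self._threads)
```

Calling `shutdown()` before the process exits is the explicit step; without it, daemon threads may still be partway through Python code when finalization begins.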
e
Ok, well, any mystery in code is bad. So I will try to track this down.
I personally get the horrors with stuff like this until I understand fully.