#development

witty-crayon-22786

09/21/2022, 7:44 PM
@curved-television-6568: continuing debugging of the hung test:
it looks like the test is hung about where we would expect: inside the scheduler.

curved-television-6568

09/21/2022, 7:45 PM
OK, we can keep the convo here, I’ll paste lengthy non-redacted stuff in DM

witty-crayon-22786

09/21/2022, 7:46 PM
to get more information, adding the @logging(level=LogLevel.DEBUG) decorator to the test and adjusting how you are running pytest to not capture stdio might get you more information in the foreground
(thanks again by the way!)
👌 1
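A minimal sketch of the "don't capture stdio" part of the suggestion above. `--capture=no` (equivalent to `-s`) and `--capture=tee-sys` are standard pytest capture modes; the helper name `pytest_argv` is hypothetical, and how the test is actually launched in this repo may differ.

```python
import shlex
from typing import List

# Capture modes accepted by pytest's --capture option.
CAPTURE_MODES = {"fd", "sys", "no", "tee-sys"}


def pytest_argv(test_id: str, capture: str = "no") -> List[str]:
    """Build a pytest command line with an explicit capture mode.

    capture="no" behaves like `-s`: output goes straight to the terminal,
    so log lines show up while the test is still running (or hung).
    """
    if capture not in CAPTURE_MODES:
        raise ValueError(f"unknown capture mode: {capture!r}")
    return ["pytest", f"--capture={capture}", test_id]


if __name__ == "__main__":
    # Test id taken from the log output later in this thread.
    cmd = pytest_argv(
        "src/python/pants/backend/python/util_rules/pex_test.py::test_lockfiles"
    )
    print(shlex.join(cmd))
```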

curved-television-6568

09/21/2022, 7:47 PM
np. is it interesting to know the open files (there were a ton): lsof -p <pid>? 🙂 (I’ll run with the above)

witty-crayon-22786

09/21/2022, 7:51 PM
um, unclear so far… if it looks like we’re waiting on the network, then it might be?
which reminds me: can you look for child processes of the test process? particularly any python processes?
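One way to check for child processes of the test process, sketched under assumptions: it shells out to `ps` with POSIX-style `-A` and `-o` options and filters on the parent pid. The function names are hypothetical, and the exact `ps` keywords available can vary by platform.

```python
import subprocess
from typing import List, Tuple

Row = Tuple[int, int, str]  # (pid, ppid, command)


def parse_ps(text: str) -> List[Row]:
    """Parse lines of the form '<pid> <ppid> <command...>'."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        pid, ppid, command = parts
        rows.append((int(pid), int(ppid), command))
    return rows


def children_of(rows: List[Row], parent_pid: int) -> List[Tuple[int, str]]:
    """Return (pid, command) for every direct child of parent_pid."""
    return [(pid, cmd) for pid, ppid, cmd in rows if ppid == parent_pid]


def list_children(parent_pid: int) -> List[Tuple[int, str]]:
    """Ask ps for all processes and keep the direct children of parent_pid."""
    out = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "command="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return children_of(parse_ps(out), parent_pid)
```

Filtering the result for commands containing `python` would narrow it to the Python children asked about here.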

curved-television-6568

09/21/2022, 7:52 PM
yeah, there were none

witty-crayon-22786

09/21/2022, 7:52 PM
iiinteresting.

curved-television-6568

09/21/2022, 7:52 PM
however, now it failed rather quickly, no hang…
I’ll see if I can revert back

witty-crayon-22786

09/21/2022, 7:53 PM
at least in CI, this is flakey, so it might only hang in some cases?
c

curved-television-6568

09/21/2022, 7:54 PM
it was stdio related: when I tried with the pytest option --capture=tee-sys, now it hangs, but I’m not sure how to get more output during pytest runs…

witty-crayon-22786

09/21/2022, 7:54 PM
yikes.
um, can you try lldb -p $test_process_pid and then bt all when it attaches?
👍 1
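A rough sketch of scripting the attach-and-backtrace step so the output lands in a file that is easy to share. It assumes lldb's `--batch`, `-p` and `-o` options behave as documented (run the given command after attaching, then exit); the function names are hypothetical, and it is worth confirming the target process is left running afterwards.

```python
import subprocess
from typing import List


def lldb_backtrace_argv(pid: int) -> List[str]:
    """Build an lldb command that attaches to pid and runs `bt all`."""
    if pid <= 0:
        raise ValueError(f"invalid pid: {pid}")
    return ["lldb", "--batch", "-p", str(pid), "-o", "bt all"]


def dump_backtraces(pid: int, out_path: str) -> int:
    """Run lldb against pid and write all thread backtraces to out_path.

    Returns lldb's exit code; attaching may require elevated privileges.
    """
    with open(out_path, "w") as out:
        result = subprocess.run(
            lldb_backtrace_argv(pid),
            stdout=out,
            stderr=subprocess.STDOUT,
            check=False,
        )
    return result.returncode
```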

curved-television-6568

09/21/2022, 7:55 PM
tried -s but it doesn’t give me any… err, that was yesterday; with debug logging on the test it does…
this is the output before it hangs:
src/python/pants/backend/python/util_rules/pex_test.py::test_lockfiles 15:55:02.07 [INFO] external invalidation: cleared 0 and dirtied 0 nodes for: {"pex_lock.json", ""}
15:55:02.08 [INFO] external invalidation: cleared 0 and dirtied 0 nodes for: {"", "reqs_lock.txt"}
15:55:02.08 [DEBUG] Launching 1 roots (poll=false).
15:55:02.08 [DEBUG] Starting: pants.backend.python.util_rules.pex.build_pex
ok, wow, yeah bt all gave a lot. you want it all?

witty-crayon-22786

09/21/2022, 7:58 PM
yes please

curved-television-6568

09/21/2022, 7:59 PM
got it

witty-crayon-22786

09/21/2022, 8:07 PM
i believe that the backtrace points to the issue, although i still don’t fully understand it.
basically: one of the threads i see is stuck tearing down the docker CommandRunner… which @fast-nail-55400 just merged a patch to adjust: https://github.com/pantsbuild/pants/pull/16930

curved-television-6568

09/21/2022, 8:08 PM
cool. I’ve read a fair share of backtraces, but without a link map and good knowledge of the underlying code they rarely give me much 😛

witty-crayon-22786

09/21/2022, 8:09 PM
and staring at the backtrace a bit more gave me another chance to better understand it, so i’ll post on that ticket for posterity.
👍 1

curved-television-6568

09/21/2022, 8:09 PM
“which @fast-nail-55400 just merged a patch to adjust”
sounds interesting. too close to be coincidence?

witty-crayon-22786

09/21/2022, 8:09 PM
sorry, to be clear: his patch is intended to fix this

curved-television-6568

09/21/2022, 8:10 PM
oh, ok
I’m running off of 86295c1015ae1530b39f7f0af61ec87ebaf139d8 btw…
could try again with tom’s fix in..

witty-crayon-22786

09/21/2022, 8:12 PM
yea, that would be good.

curved-television-6568

09/21/2022, 8:12 PM
but why was the docker CommandRunner used at all now?
w

witty-crayon-22786

09/21/2022, 8:13 PM
@curved-television-6568: but you could also wait and see whether this test continues flaking… i’m optimistic it won’t

curved-television-6568

09/21/2022, 8:13 PM
@fast-nail-55400 🙌 profit! (i.e. it works now)

witty-crayon-22786

09/21/2022, 8:13 PM
“but why was the docker CommandRunner used at all now?”
it will be in the stack, but it should not actually be used in this case.

curved-television-6568

09/21/2022, 8:15 PM
ah, right.

witty-crayon-22786

09/21/2022, 8:15 PM
but yea, that raises a question of what exactly it is hung doing. i commented on https://github.com/pantsbuild/pants/pull/16930
👍 1
thanks a lot for investigating!
❤️ 1
👍 1

curved-television-6568

09/21/2022, 8:43 PM
my pleasure

fast-nail-55400

09/21/2022, 8:48 PM
https://github.com/pantsbuild/pants/pull/16951 should fix looking for Docker even though Docker was never used.
💯 1

witty-crayon-22786

09/21/2022, 9:16 PM
thank you!