quaint-forest-8735 (06/24/2022, 2:07 PM):
Exception: Failed to read link "/home/jenkins/agent/workspace/pipeline-name/bazel-pipeline-name": Absolute symlink: "/root/.cache/bazel/_bazel_root/948ee198afa5cf4acd9dcc1262573709/execroot/alpha"
Our CI system uses a bazel base image, and the workdir appears to be an absolute symlink to /root/.cache/bazel/... Could this be causing strange/unexpected behavior, and has anyone encountered this before / know of possible workarounds?

quaint-forest-8735 (06/24/2022, 2:09 PM):

witty-crayon-22786 (06/24/2022, 4:22 PM):
.gitignore or pants_ignore that file: https://www.pantsbuild.org/docs/troubleshooting#pants-cannot-find-a-file-in-your-project

quaint-forest-8735 (06/24/2022, 4:30 PM):
The pants_ignore option in pants.toml did the trick. Thank you!
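[Editor's note: the fix described above would look roughly like this in pants.toml. The "bazel-*" pattern is an assumption based on the symlink name in the original error ("bazel-pipeline-name"); adjust it to the actual symlink in your repo. The `.add` suffix (which appends to, rather than replaces, the default ignore list) assumes a reasonably recent Pants 2.x.]

```toml
[GLOBAL]
# Keep Pants from traversing Bazel's convenience symlink, which points
# at an absolute path outside the repo and triggers the
# "Absolute symlink" error above.
# "bazel-*" is an assumed pattern; match your actual symlink name.
pants_ignore.add = ["bazel-*"]
```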
quaint-forest-8735 (06/24/2022, 7:17 PM):
./pants package has spent 30 minutes stuck on
18:18:46.16 [INFO] Initializing scheduler...
18:18:46.30 [INFO] Scheduler initialized.
Enabling more verbose logging doesn't seem to have any effect, so I'm scratching my head here. The other odd part is that I can successfully run ./pants package when I ssh into the build pod (it runs in about 45s).

witty-crayon-22786 (06/24/2022, 7:23 PM):
-ldebug…?

quaint-forest-8735 (06/24/2022, 7:23 PM):
--level=trace

witty-crayon-22786 (06/24/2022, 7:29 PM):
-ldebug should be more than enough… trace will be overkill.

witty-crayon-22786 (06/24/2022, 7:30 PM):
Are the pants[d] processes actually using CPU, or are they idle?

witty-crayon-22786 (06/24/2022, 7:30 PM):
py-spy or using linux perf would help
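[Editor's note: if py-spy and perf aren't installed in the build pod, busy-vs-idle can also be checked directly from /proc rather than eyeballing top. A minimal Linux-only sketch; the pid 3119 referenced in the usage comment is the pantsd pid from the top output later in this thread.]

```python
import os
import time

def cpu_percent(pid: int, interval: float = 1.0) -> float:
    """Approximate CPU% of `pid` over `interval` seconds, read from /proc."""
    def cpu_ticks(p: int) -> int:
        with open(f"/proc/{p}/stat") as f:
            # Split after the ")" that closes the comm field; utime and stime
            # are then at indices 11 and 12 (fields 14 and 15 overall).
            fields = f.read().rsplit(")", 1)[1].split()
        return int(fields[11]) + int(fields[12])

    start = cpu_ticks(pid)
    time.sleep(interval)
    elapsed_ticks = cpu_ticks(pid) - start
    return 100.0 * elapsed_ticks / os.sysconf("SC_CLK_TCK") / interval

# e.g. cpu_percent(3119) for the pantsd pid shown in top below;
# a deadlocked process will sit near 0%.
```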
witty-crayon-22786 (06/24/2022, 7:30 PM):
gdb would be next.

witty-crayon-22786 (06/24/2022, 7:31 PM):
Anything in .pants.d/pants.log?

quaint-forest-8735 (06/24/2022, 7:33 PM):
1. pantsd appears to be actually using cpu:
3119 root  20  0  541.3g  113108  19364 S  0.3  0.1  0:03.28 pantsd [/home/j
2. There didn't seem to be anything of note in .pants.d/pants.log
3. When I enabled --level=trace, it appears to be hanging on this step:
19:30:29.64 [TRACE] Starting: Search for addresses in BUILD files
19:30:29.64 [TRACE] Starting: Snapshotting: BUILD, BUILD.pants
19:30:29.64 [TRACE] Starting: Fingerprinting: BUILD.pants
19:30:29.64 [TRACE] Starting: Fingerprinting: BUILD

witty-crayon-22786 (06/24/2022, 7:34 PM):
1. That's showing 0.3%, which is effectively idle. So it seems like potentially a deadlock.

witty-crayon-22786 (06/24/2022, 7:36 PM):
Attach gdb and then run:
thread apply all bt
… that would get a thread dump that could maybe point to a deadlock

quaint-forest-8735 (06/24/2022, 7:43 PM):

witty-crayon-22786 (06/24/2022, 9:01 PM):

quaint-forest-8735 (06/24/2022, 9:01 PM):

quaint-forest-8735 (06/24/2022, 9:13 PM):

witty-crayon-22786 (06/24/2022, 9:30 PM):

quaint-forest-8735 (06/24/2022, 9:30 PM):

witty-crayon-22786 (06/24/2022, 9:31 PM):
-linfo (the default), or only at higher levels?

quaint-forest-8735 (06/24/2022, 9:31 PM):
-linfo, -ldebug, and -ltrace

witty-crayon-22786 (06/24/2022, 10:02 PM):

witty-crayon-22786 (06/24/2022, 10:03 PM):

quaint-forest-8735 (06/24/2022, 10:05 PM):
./pants package was being executed in an sh step in jenkins: https://www.jenkins.io/doc/pipeline/steps/workflow-durable-task-step/#sh-shell-script

witty-crayon-22786 (06/24/2022, 10:07 PM):

witty-crayon-22786 (06/24/2022, 10:07 PM):

quaint-forest-8735 (06/24/2022, 10:07 PM):

quaint-forest-8735 (06/26/2022, 8:01 PM):
[…] ./pants package command to see if this was specific to our Jenkins environment); fixing those warnings resolves the deadlock both in Jenkins as well as in GH Actions.
Interestingly enough, when I run the build in a normal shell at the same commit that introduced the issue which led to all the warnings, ./pants package succeeds. It appears that in both Jenkins & GH Actions, stderr is either blocked or buffered in some way that leads to the deadlock. Not sure if investigating the deadlock further is relevant, but thought that was an interesting data point.
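[Editor's note: the stderr theory above is consistent with a classic pipe deadlock. If a CI harness attaches a process's stderr to a pipe but stops draining it, the process blocks once the pipe buffer (typically 64 KiB on Linux) fills with warnings. A minimal sketch of the mechanism in Python; the 1 MiB write size is arbitrary, just comfortably larger than any pipe buffer.]

```python
import subprocess
import sys

# A child that writes far more to stderr than a pipe buffer can hold,
# standing in for a build emitting a large volume of warnings.
child = "import sys; sys.stderr.write('x' * (1 << 20))"

p = subprocess.Popen([sys.executable, "-c", child], stderr=subprocess.PIPE)

# Calling p.wait() here would deadlock: the child blocks writing to its
# full stderr pipe, while the parent blocks waiting for the child to exit.
# communicate() drains the pipe while waiting, so both sides finish.
_, err = p.communicate()
print(len(err))  # 1048576
```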