gray-shoe-19951
04/03/2023, 9:41 PM
[2023-04-03T19:05:54.375Z] 19:05:51.57 [INFO] Long running tasks:
[2023-04-03T19:05:54.375Z] 863.93s Determine Python dependencies for X1.py
[2023-04-03T19:05:54.375Z] 864.06s Determine Python dependencies for X2.py
[2023-04-03T19:05:54.375Z] 865.21s Determine Python dependencies for X3.py
[2023-04-03T19:05:54.375Z] 865.24s Determine Python dependencies for X4.py
[2023-04-03T19:05:54.375Z] 867.84s Determine Python dependencies for X5.py
[2023-04-03T19:05:54.375Z] 867.85s Determine Python dependencies for X6.py
[2023-04-03T19:05:54.375Z] 867.97s Determine Python dependencies for X7.py
2. If I restart the builds, sometimes it goes away.
11:58:10 360.38s Test binary /bin/python.
11:58:10 360.38s Test binary /data/env/py3.9.13/bin/python.
11:58:10 360.38s Test binary /opt/conda/bin/python.
In both scenarios, I saw multiple pantsd processes. For example:
sh-4.2# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 17:08 pts/0 00:00:00 /usr/bin/dumb-init -- /usr/local/bin/run-jnlp-client 03564869d53ea68cd383a448680f9abfa4cc44fcd3f0480712dad2283953ec15 pan
root 7 1 2 17:08 ? 00:01:25 java -XX:+UseParallelGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90
root 803 1 0 17:10 ? 00:00:00 sh -c ({ while [ -d '/tmp/workspace/script@tmp/durable-1914e334' -a \! -f '/tmp/workspace/ar_AT
root 804 803 0 17:10 ? 00:00:01 sh -c ({ while [ -d '/tmp/workspace/script@tmp/durable-1914e334' -a \! -f '/tmp/workspace/ar_AT
root 805 803 0 17:10 ? 00:00:00 sh -xe /tmp/workspace/ar_ATOMFM-327_single_eval_script@tmp/durable-1914e334/script.sh
root 820 805 0 17:10 ? 00:00:00 /home/jenkins/.pex/venvs/0bd641e3a90c5dabea350e64a646029c08613838/779eb2cc0ca9e2fdd204774cbc41848e4e7c5055/bin/python /tm
root 822 820 0 17:10 ? 00:00:00 [/home/jenkins/.] <defunct>
root 823 1 16 17:10 ? 00:10:40 pantsd [/tmp/workspace/ar_ATOMFM-327_single_eval_script]
root 1118 823 0 17:11 ? 00:00:00 pantsd [/tmp/workspace/ar_ATOMFM-327_single_eval_script]
root 1119 823 0 17:11 ? 00:00:00 pantsd [/tmp/workspace/ar_ATOMFM-327_single_eval_script]
root 1120 823 0 17:11 ? 00:00:00 pantsd [/tmp/workspace/ar_ATOMFM-327_single_eval_script]
root 1121 823 0 17:11 ? 00:00:00 pantsd [/tmp/workspace/ar_ATOMFM-327_single_eval_script]
root 4778 0 0 18:14 pts/1 00:00:00 sh
root 4785 804 0 18:14 ? 00:00:00 sleep 3
root 4786 4778 0 18:14 pts/1 00:00:00 ps -ef
I tried disabling pantsd in Jenkins, but it does not help. Instead of multiple pantsd processes, I see the following, for example:
root 798 0.0 0.0 0 0 ? Z 15:03 0:00 [python] <defunct>
root 799 0.0 0.0 0 0 ? Z 15:03 0:00 [python] <defunct>
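The `<defunct>` entries above are zombie processes: children that have exited but have not yet been reaped by their parent. One way to see which parent is holding them is to list zombies together with their parent PID. This is a minimal sketch, assuming a procps-style `ps` as found in most Linux containers; the `find_zombies` helper is a hypothetical name, not part of Pants.

```python
import subprocess
from typing import List, Tuple


def find_zombies(ps_output: str) -> List[Tuple[int, int, str]]:
    """Return (pid, ppid, command) for every process whose state starts with 'Z'."""
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)  # the command column may contain spaces
        if len(fields) < 4:
            continue
        pid, ppid, stat, comm = fields
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm.strip()))
    return zombies


if __name__ == "__main__":
    # Assumption: `ps -eo pid,ppid,stat,comm` is available (procps on Linux).
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, ppid, comm in find_zombies(out):
        print(f"zombie pid={pid} parent={ppid} command={comm}")
```

The parent PID of each zombie points at the process worth inspecting next, since that is the one not collecting its children.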
My feeling is that a child process somehow gets stuck. I am wondering how I can troubleshoot it, since I cannot reproduce it locally. Your advice would be greatly appreciated.
witty-crayon-22786
04/03/2023, 11:14 PM
gdb on the host, attaching to the pants (or pantsd) process, and then running thread apply all bt might get us useful data.
gray-shoe-19951
04/04/2023, 8:12 PM
witty-crayon-22786
04/04/2023, 8:33 PM
gray-shoe-19951
04/04/2023, 8:34 PM
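For reference, the gdb suggestion earlier in the thread (attach to the pants or pantsd process and run thread apply all bt) could be scripted so it can run unattended on a stuck Jenkins agent. This is only a sketch: the helper names are hypothetical, and it assumes gdb is installed where the process is visible and that the caller is allowed to attach to it.

```python
import subprocess
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a non-interactive gdb invocation that dumps all thread backtraces."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the running process
        "-batch",                      # exit once the -ex commands have run
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def dump_backtraces(pid: int, timeout: float = 120.0) -> str:
    """Run gdb against pid and return its combined stdout/stderr."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

With the process listing shown in the thread, this would be called as `print(dump_backtraces(823))`, where 823 is the PID of the top-level pantsd process. Inside a container, attaching may additionally require ptrace permission.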