Getting a weird issue in Jenkins with black formatting when Pants #development

wonderful-iron-54019

06/22/2020, 5:12 PM

Getting a weird issue in Jenkins with black formatting when moving from 1.26.0 => 1.27.0. In 1.26.0 the command ./pants fmt2 src:: works as expected. In 1.27.0 I switched the command to ./pants fmt src:: and it works locally; however, in Jenkins the output logs as if completed, and then hangs for up to a day (!), at which point Jenkins reports the following error. Any changes to the output stream or process environment that might cause pants not to 'register' the end of an operation?

Untitled

witty-crayon-22786

06/22/2020, 5:16 PM

hm, sorry for the trouble!

witty-crayon-22786

06/22/2020, 5:16 PM

is it reproducible?

witty-crayon-22786

06/22/2020, 5:19 PM

if so, while it’s sitting there, output from

py-spy dump

or the native backtraces of threads would be very helpful (via https://lldb.llvm.org/use/map.html#examining-thread-state “show the backtraces for all threads”)

wonderful-iron-54019

06/22/2020, 7:07 PM

just saw this, will try.

wonderful-iron-54019

06/22/2020, 7:07 PM

It is reproducible in the CI env

wonderful-iron-54019

06/22/2020, 8:07 PM

been able to reproduce locally within our build container. cannot get

py-spy

to work (getting permission denied, even with sudo (!)). However, the command's been running for a lot longer than 29 seconds, and it shows

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   4504   804 pts/0    Ss   19:47   0:00 /bin/sh
root         6  2.7  3.7 1849066516 150360 pts/0 Sl+ 19:47   0:29 /root/.cache/pants/setup/bootstrap-Linux-x86_64/1.27.0_py37/bin/python /root/.cache/pants/setup/bootstrap-
root      1091  0.0  0.0   4504  1612 pts/1    Ss   19:55   0:00 /bin/sh
root      1130  0.0  0.0  34424  2840 pts/1    R+   20:05   0:00 ps aux

witty-crayon-22786

06/22/2020, 8:08 PM

i suspect that something is deadlocked rather than busy waiting.

witty-crayon-22786

06/22/2020, 8:09 PM

do you think that you could give the gdb/lldb thing a try?

wonderful-iron-54019

06/22/2020, 8:09 PM

the virtual memory size (VSZ) is enormous

wonderful-iron-54019

06/22/2020, 8:09 PM

i'll have to build it on the container

witty-crayon-22786

06/22/2020, 8:09 PM

yea, that’s expected. we use LMDB, which mmaps things aggressively

witty-crayon-22786

06/22/2020, 8:09 PM

gdb should be available from your package manager…

wonderful-iron-54019

06/22/2020, 8:19 PM

super strange:

wonderful-iron-54019

06/22/2020, 8:20 PM

Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.

wonderful-iron-54019

06/22/2020, 8:20 PM

i am root!

wonderful-iron-54019

06/22/2020, 8:20 PM

groot

wonderful-iron-54019

06/22/2020, 8:20 PM

happening with other processes too

witty-crayon-22786

06/22/2020, 8:22 PM

hm. that is unexpected! i don’t know whether the message it logged about ptrace is helpful… it might be? i haven’t experienced a case where gdb can’t attach

wonderful-iron-54019

06/22/2020, 8:28 PM

yeah, setting that to the most permissive mode still causes issues
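For reference, the setting named in the gdb message can be inspected with a short script. This is a minimal sketch: the helper name `describe_ptrace_scope` is hypothetical, and the value descriptions follow the Linux Yama documentation. Inside containers, ptrace is commonly blocked by a missing CAP_SYS_PTRACE capability or a seccomp profile rather than by Yama, which would be consistent with the permissive setting not helping here. That is an assumption; the thread does not confirm the cause.

```python
from pathlib import Path

# Path quoted in the gdb error message above.
YAMA_PTRACE_SCOPE = Path("/proc/sys/kernel/yama/ptrace_scope")

# Meanings as described in the Linux Yama documentation.
SCOPE_MEANINGS = {
    0: "classic: a process may attach to other processes with the same uid",
    1: "restricted: only a parent may attach to its descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(path=YAMA_PTRACE_SCOPE):
    """Return a human-readable description of the current ptrace_scope value.

    Hypothetical helper for illustration; it only reads the file.
    """
    try:
        raw = Path(path).read_text().strip()
    except OSError:
        # The file is absent when Yama is not enabled, or /proc is unreadable.
        return "unavailable (Yama not enabled or /proc not readable)"
    try:
        value = int(raw)
    except ValueError:
        return f"unrecognized value {raw!r}"
    return f"{value}: {SCOPE_MEANINGS.get(value, 'unknown value')}"


if __name__ == "__main__":
    print(describe_ptrace_scope())
```

If the value is already 0 and attaching still fails as root, the container's capabilities and seccomp profile are the next things worth checking.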

witty-crayon-22786

06/22/2020, 8:43 PM

@wonderful-iron-54019: we have one other facility that might work. you can try sending

SIGUSR2

to the process with

kill -s SIGUSR2 $pid

witty-crayon-22786

06/22/2020, 8:43 PM

it should render backtraces to the process's stdout… so, jenkins
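How Pants wires up its SIGUSR2 handling is not shown in this thread. As an illustration of the general technique only, the sketch below uses Python's standard `faulthandler` module to dump the backtraces of all threads when the process receives SIGUSR2. The function name `install_backtrace_handler` is hypothetical, and the sketch writes to stderr by default rather than stdout. It is Unix-only, since SIGUSR2 does not exist on Windows.

```python
import faulthandler
import os
import signal
import sys


def install_backtrace_handler(stream=sys.stderr, signum=signal.SIGUSR2):
    """Dump Python backtraces for all threads to `stream` on `signum`.

    `stream` must be backed by a real file descriptor, because
    faulthandler writes to the descriptor directly from the signal handler.
    """
    faulthandler.register(signum, file=stream, all_threads=True)


if __name__ == "__main__":
    install_backtrace_handler()
    # Simulate `kill -s SIGUSR2 $pid` by signalling this process itself.
    os.kill(os.getpid(), signal.SIGUSR2)
```

A dump like this only covers Python frames, so a hang inside native code would still need gdb or lldb.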

wonderful-iron-54019

06/22/2020, 8:44 PM

will do, trying to see if it's only the

fmt

goal or others exhibiting this behavior at the moment

witty-crayon-22786

06/22/2020, 8:52 PM

thank you.

wonderful-iron-54019

06/22/2020, 8:55 PM

here's the output

Untitled

wonderful-iron-54019

06/22/2020, 8:56 PM

Not sure if it matters, but it looks like while black is running it spins up multiple processes against the same files?

Untitled

witty-crayon-22786

06/22/2020, 8:59 PM

that’s probably independent, but a new thread about that would be appreciated

witty-crayon-22786

06/22/2020, 9:00 PM

ok, regarding the output from SIGUSR2… that unfortunately confirms that we would need a gdb trace to get more information…

wonderful-iron-54019

06/22/2020, 9:07 PM

i'll see what i can do with that

witty-crayon-22786

06/22/2020, 9:08 PM

also, is it an option to run with

-ldebug

in this environment? it will cause a lot more logging to be rendered.

witty-crayon-22786

06/22/2020, 9:08 PM

sorry for the trouble… very interested in tracking this down.

wonderful-iron-54019

06/22/2020, 9:09 PM

yeah sure thing

wonderful-iron-54019

06/22/2020, 9:09 PM

yeah me too

witty-crayon-22786

06/22/2020, 9:11 PM

@wonderful-iron-54019: and to confirm: you folks are not using

pantsd

, correct?

wonderful-iron-54019

06/22/2020, 9:12 PM

not at the moment, unless it's turned on by default in 1.27

witty-crayon-22786

06/22/2020, 9:12 PM

it is not. ok.

wonderful-iron-54019

06/22/2020, 9:32 PM

ok well this is certainly odd

pantslog.txt

wonderful-iron-54019

06/22/2020, 9:33 PM

looks like it starts to construct the build graph again after successful completion

witty-crayon-22786

06/22/2020, 9:34 PM

mm… are you using both v1 and v2 ?

wonderful-iron-54019

06/22/2020, 9:34 PM

not in the format step?

wonderful-iron-54019

06/22/2020, 9:34 PM

but yes

wonderful-iron-54019

06/22/2020, 9:34 PM

it looks like this is where it hangs

wonderful-iron-54019

06/22/2020, 9:35 PM

still running the engine scheduler after ~4.5 min

wonderful-iron-54019

06/22/2020, 9:35 PM

the original step took 2.5min

witty-crayon-22786

06/22/2020, 9:35 PM

yea… so re-constructing the build graph might be expected… hanging is most definitely not!

wonderful-iron-54019

06/22/2020, 9:35 PM

(usually much quicker too, i think running a container in my local is slowing it down)

wonderful-iron-54019

06/22/2020, 9:36 PM

i can keep this running and see if it ever gets past that log

wonderful-iron-54019

06/22/2020, 9:36 PM

it's near EOD for me so i'm going to go afk, but i'll come back and check on it and report

witty-crayon-22786

06/22/2020, 9:37 PM

i don’t expect it to finish: but thanks for reporting this. i’ll see if i can investigate this based on what you’ve reported.

witty-crayon-22786

06/22/2020, 9:37 PM

have a good evening!

witty-crayon-22786

06/22/2020, 9:49 PM

https://github.com/pantsbuild/pants/issues/10133

wonderful-iron-54019

06/22/2020, 10:03 PM

👍🏼

witty-crayon-22786

06/22/2020, 10:46 PM

It would be worth seeing whether

--no-v1

works around this.
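One way to try a workaround like this in CI without risking another day-long hang is to put a hard timeout around the command. The sketch below is an assumption about how that could be done, not something from the thread: `run_with_timeout` is a hypothetical helper built on the standard `subprocess` module, and the timeout value in the comment is arbitrary.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run `cmd` and return its exit code, or None if it timed out.

    subprocess.run kills the direct child when the timeout expires;
    any grandchildren it spawned are not cleaned up by this sketch.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


# Example usage in a CI step (command taken from the thread, timeout made up):
#   code = run_with_timeout(["./pants", "--no-v1", "fmt", "src::"], 600)
#   sys.exit(1 if code is None else code)
```

Treating a timeout as a failure makes the build fail fast and visibly, where the original run sat idle until Jenkins gave up.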

wonderful-iron-54019

06/23/2020, 12:41 PM

just an FYI:

--no-v1

did indeed work!

❤️ 1

wonderful-iron-54019

06/23/2020, 12:41 PM

this is an acceptable workaround for now, since we don't have any v2 formatting targets.

❤️ 1

wonderful-iron-54019

06/23/2020, 12:41 PM

thanks for your help @witty-crayon-22786

hundreds-father-404

06/23/2020, 4:13 PM

I thought you’re using V2 to run Black though? If that’s the case, you could split it up into a distinct V1 run vs V2 run, which is clunky but at least works around it

witty-crayon-22786

06/23/2020, 4:17 PM

@hundreds-father-404: it sounds like that's what he did here, because they're not using any v1 formatters

👍 1

wonderful-iron-54019

06/23/2020, 8:43 PM

yeah that's correct

👍 1
