# general
h
We had an unfortunate regression in one of our CI systems when running 2.18.0. We were getting cryptic errors like
Failed to exec pants at "/home/buildbot/.cache/nce/3d6643e46b53e4cc0b2a0d5c768866226ddce3de1f57f80c4a02d8d39800fa8e/bindings/venvs/2.18.0/bin/pants": Exec format error
We confirmed there weren't any special characters in the file it was trying to execute by comparing its sha256sum against a working deployment. We noticed the file was being executed as a shell script instead of respecting the shebang line, which looked like
#!/home/buildbot/.cache/nce/3d6643e46b53e4cc0b2a0d5c768866226ddce3de1f57f80c4a02d8d39800fa8e/bindings/venvs/2.18.0/bin/python3.9 -sE
2.16.0 worked happily, and we noticed its shebang looked like this
#!/home/buildbot/.cache/nce/3d6643e46b53e4cc0b2a0d5c768866226ddce3de1f57f80c4a02d8d39800fa8e/bindings/venvs/2.16.0/bin/python
We realized that there is a character limit on shebang lines set at the kernel level. On a lot of deployments, this limit is 127 characters (ref). On our user machines, we happen to have a buffer configured with a 255-character limit. It turns out the working case above is 124 characters while the broken version is 132. We're still trying to root-cause why this specific machine is limiting at a different number than its buffer suggests it should, but we wanted to raise awareness for Pants developers/users since it's clearly flirting with a default limit on Linux distributions.
I can file a proper issue if there's consensus that this should drive some Pants changes. You're presented with some pretty cryptic errors if you hit this, since the kernel just silently truncates the shebang down to 80 characters if it's over the limit.
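For anyone who wants to check their own install, here is a rough sketch of a length check on a script's first line. The names `shebang_length`, `check_shebang`, and `SHEBANG_LIMIT` are made up for illustration, and 127 is only the figure quoted above; the real limit may differ by kernel.

```shell
# Hypothetical helper: print the length in bytes of a file's first line,
# not counting the trailing newline.
shebang_length() {
    head -n 1 "$1" | tr -d '\n' | wc -c | tr -d ' '
}

# Assumed limit, taken from the 127-character figure mentioned above.
SHEBANG_LIMIT=127

# Hypothetical helper: report whether the first line fits within the limit.
# Returns non-zero when the line is too long.
check_shebang() {
    len=$(shebang_length "$1")
    if [ "$len" -gt "$SHEBANG_LIMIT" ]; then
        echo "too long: $len > $SHEBANG_LIMIT"
        return 1
    fi
    echo "ok: $len <= $SHEBANG_LIMIT"
}
```

Running `check_shebang` against the `pants` script in the venv's `bin` directory should show whether that machine's path pushes the shebang past the assumed limit.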
b
Ugh. That’s frustrating! Seems like you’ve been caught up in a lot of regressions recently, thanks for catching them. My thought is that we should be doing something here, if 127 is the default limit.
h
Oh, I forgot to mention. We did root-cause it to this limit: we manually replaced the shebang with something shorter, and the file was then correctly interpreted as a Python module.
f
This is not the first time this limit has been reported as an issue: https://pantsbuild.slack.com/archives/C046T6T9U/p1619195435084000
f
the typical solution to this problem is to symlink the executable
e.g.,
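# Create a short path that points at the real interpreter
dir=$(mktemp -d)
executable="$dir/python"
ln -s "$interpreter_path" "$executable"
# Prepend the short shebang to the script body
{ printf '#!%s\n' "$executable"; cat "$script_content_source"; } > "$script_destination"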
b
I think these scripts are persistent, so a tempdir may not be appropriate. For this particular case, we could cut 20 bytes off the path by using base64 instead of hex. E.g.
#!/home/buildbot/.cache/nce/PWZD5GtT5MwLKg1cdohmIm3c494fV_gMSgLY05gA-o4/bindings/venv/2.18.0/bin/python3.9 -sE
is 110 bytes long, which leaves room for a longer user name and/or a longer binding path
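As a rough illustration of the re-encoding idea (not how Pants or its launcher actually names cache directories), the hex digest can be turned into unpadded URL-safe base64 like this. `hex_to_b64url` is a made-up name, and the sketch assumes a `base64` command is available on the machine.

```shell
# Hypothetical helper: re-encode a hex digest as unpadded URL-safe base64.
hex_to_b64url() {
    hex=$1
    fmt=''
    # Turn each pair of hex digits into a printf octal escape (\ooo).
    for byte in $(printf '%s' "$hex" | sed 's/../& /g'); do
        fmt="$fmt$(printf '\\%03o' "0x$byte")"
    done
    # Emit the raw bytes, base64-encode them, then switch to the URL-safe
    # alphabet and drop the '=' padding and the trailing newline.
    printf "$fmt" | base64 | tr '+/' '-_' | tr -d '=\n'
}

hex_to_b64url 3d6643e46b53e4cc0b2a0d5c768866226ddce3de1f57f80c4a02d8d39800fa8e
```

A 64-character hex digest becomes 43 characters this way, which is where the savings in the path come from.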
h
An issue would be really great @high-yak-85899, we definitely should address this
w
Whoa this brings me back. One of my first Pants issues was re: shebang length limits https://github.com/pantsbuild/pants/issues/11597#issuecomment-785478666
c