I'm using Pants 2.12 in a self-hosted github runne...
# general
c
I'm using Pants 2.12 in a self-hosted GitHub runner and running into an issue. I set up Python versions 3.9 and 3.10 on the runner, but I get the following error when it runs:
./pants generate-lockfiles fmt ::
ProcessExecutionFailure: Process 'Find interpreter for constraints: CPython<4.0,>=3.9' failed with exit code 102.
It seems to think the interpreter is broken:
Skipped the following broken interpreters:
1.) /mnt/github_actions_runner/_work/_tool/Python/3.10.6/x64/bin/python3.10:
/mnt/github_actions_runner/_work/_tool/Python/3.10.6/x64/bin/python3.10: error while loading shared libraries: libpython3.10.so.1.0: cannot open shared object file: No such file or directory
2.) /mnt/github_actions_runner/_work/_tool/Python/3.9.13/x64/bin/python3.9:
/mnt/github_actions_runner/_work/_tool/Python/3.9.13/x64/bin/python3.9: error while loading shared libraries: libpython3.9.so.1.0: cannot open shared object file: No such file or directory
Not sure how to fix this, any help would be appreciated
e
Are you using a canned GH action to setup Pythons? Those paths make it seem so.
c
Yeah, I'm using GitHub's setup-python action.
It looks like
toolcache
is a different path for your self-hosted runner, so if you want to contribute back a parameter for that in the action, that would be great. Perhaps it lives in an env var I'm not aware of. At the very least, you have a technique you can copy and that should work.
c
I'll try this out and see if I can get it working and if so may look at adding the param to the action
e
Um, just a sec.
This is Pants.
Do you expose "LD_LIBRARY_PATH" ?
The GH action probably sets that up correctly, but Pants by default blocks all env vars.
That's probably all it is @clever-ghost-87030
c
I'll take a look at that 🙂
e
A good thing to fully take on board. Pants tries to be fully hermetic; so most env vars are masked off, files are snapshotted, etc.
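As a rough illustration of what masking env vars means, here is a small sketch using `env -i`. This is only an analogy, not how Pants implements it, and `/opt/example/lib` is a made-up path. A child process started with a cleared environment does not see `LD_LIBRARY_PATH` unless it is passed through explicitly.

```shell
#!/bin/sh
# Analogy only: `env -i` starts a child with an empty environment,
# similar in spirit to a hermetic build tool hiding env vars.
lib_path=/opt/example/lib   # hypothetical value for illustration

# The variable is set for `env`, but `-i` clears it before the child runs.
masked=$(LD_LIBRARY_PATH="$lib_path" env -i /bin/sh -c 'echo "${LD_LIBRARY_PATH:-unset}"')

# The variable is explicitly passed through to the child.
exposed=$(env -i LD_LIBRARY_PATH="$lib_path" /bin/sh -c 'echo "${LD_LIBRARY_PATH:-unset}"')

echo "masked:  $masked"
echo "exposed: $exposed"
```

In Pants the pass-through is done with configuration rather than on the command line.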
c
So I just need to add that to the list in my envvars.add in pants.toml, right?
e
c
Wouldn't using a pants.ci.toml file increase the risk that the CI file and the non-CI file would get out of sync somehow?
e
The CI toml is additive. Just add or remove config options, don't re-define them all.
h
The typical idiom is for the ci file to be used as well as the non-ci file, so it’s small and only contains the ci-specific tweaks
c
Okay, cool, I'll look into that then
e
The issue here is LD_LIBRARY_PATH will get mixed into all process cache keys; that means, if you use remote caching, each developer with a different LD_LIBRARY_PATH carves out a different cache space - meaning your devs don't all share the same cache, meaning when Mary gets in at 6am and runs a build, lazy Fred who gets in at 10am pays a penalty and doesn't get a cache hit from Mary's work at 6am. It's not that we set out to reward lazy people, but.
And this is unfortunate since only CI needs LD_LIBRARY_PATH in this case.
c
Makes sense, I assume I can give the toml file whatever name I like (this is part of our AutoTransform stuff to keep the repo formatted and stuff and I'd like the toml file named appropriately for that, since it's not CI in general).
I just have to set PANTS_CONFIG_FILES to the appropriate name
e
Exactly.
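A minimal sketch of that setup, under some assumptions: the file name `pants.autotransform.toml` is hypothetical, and the `[subprocess-environment]` `env_vars.add` entry is assumed to be the option being discussed above. The extra file holds only the CI-specific tweak and is read in addition to `pants.toml`.

```shell
#!/bin/sh
set -eu

# Hypothetical file name; any name works as long as PANTS_CONFIG_FILES points at it.
ci_config="pants.autotransform.toml"

# Only the CI-specific tweak lives here; everything else stays in pants.toml.
# Assumption: [subprocess-environment] env_vars is the option meant in the thread.
cat > "$ci_config" <<'EOF'
[subprocess-environment]
env_vars.add = ["LD_LIBRARY_PATH"]
EOF

# Point Pants at the extra config file for this shell and its children.
export PANTS_CONFIG_FILES="$ci_config"
echo "PANTS_CONFIG_FILES=$PANTS_CONFIG_FILES"
```

Because the variable is only exported in the CI job, developer machines keep using plain `pants.toml` and their cache keys are unaffected.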
c
So I did what was described here and I'm still getting an error:
/mnt/github_actions_runner/_work/_tool/Python/3.9.13/x64/bin/python3.9: error while loading shared libraries: libpython3.9.so.1.0: cannot open shared object file: No such file or directory
I even tried adding the exposing code (which seemed to work correctly for exposing it). I will say it's coming now at a different place:
Process 'Determine Python dependencies for tests/python/<stuff>/test_<something>.py' failed with exit code 127.
e
Since the runner is self hosted, do you have access to it to play around directly? You're using GH canned actions to download a pre-built Python that clearly works on GH runner images, but maybe not on your GH runner image?
I guess another way to put this is: How are you confident the GH canned action Pythons actually work on your runners? Do you have some other Python process that works using those downloaded interpreters?
c
Python works on the runner, I'm invoking pants through a python script
e
A Python works. Do you know for sure it's one of the 2 you get errors for?
c
The original 2 I got errors for were 3.10 and 3.9, the script I'm running that runs pants uses 3.10
e
And it definitely uses the same 3.10?
c
/mnt/github_actions_runner/_work/_tool/Python/3.10.6/x64/bin/autotransform
That's in the stack trace for the AutoTransform script, so it definitely appears to be the same 3.10
When I use your expose code I see:
Exposing /mnt/github_actions_runner/_work/_tool/Python/3.10.6/x64/bin: Python 3.10.6
e
Ok, back to your access. Do you have access to the box? If so I can give you some debug steps. If not it gets more painful.
c
I don't immediately have access, I could probably get it in some time
e
Ok. Short term you can throw in
-ldebug
or (export
PANTS_LEVEL=debug
or
[GLOBAL] level = "debug"
) and
--pex-verbosity=9
(or
PANTS_PEX_VERBOSITY=9
or
[pex] verbosity = 9
) and see if that turns up more useful info in the CI output log.
I mean, on the surface:
$ bash foo
bash: foo: No such file or directory
$ echo $?
127
But I don't know what to make of that.
That's the only familiar 127 exit code to me.
Another debugging idea w/o access to the machines is - if they run docker images or on an AMI, spin up the docker image on your own machine (or spin up a machine using the same AMI) and clone your repo in that context and poke. If that's viable I can discuss what you might poke.
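The 127 exit status mentioned above can be reproduced with a short script. This sketch only demonstrates the "file not found" case; the script name is made up. As far as I know, the dynamic loader's "error while loading shared libraries" failure is also commonly reported as 127, which would fit the missing `libpython` error, but treat that as an assumption to verify on the runner.

```shell
#!/bin/sh
# Ask bash to run a script that does not exist and capture its exit status.
status=0
bash ./no-such-script-for-demo 2>/dev/null || status=$?
echo "exit status: $status"
```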
c
I mean, I'm still getting
/mnt/github_actions_runner/_work/_tool/Python/3.9.13/x64/bin/python3.9: error while loading shared libraries: libpython3.9.so.1.0: cannot open shared object file: No such file or directory
So it seems like it just can't find this thing?
Information on what to poke could be useful, worst case I'll get access to the machine
So this problem appears to be unique to generate-lockfiles and is related to the determining of python dependencies for our tests
e
You'd run
./pants --no-process-cleanup X
(or
./pants --keep-sandboxes=on_failure X
for bleeding edge Pants versions). That will cause lines like this to print out:
$ ./pants --no-process-cleanup test --force src/python/pants/util/strutil_test.py 
16:00:33.83 [INFO] Preserving local process execution dir /tmp/pants-sandbox-g0DNAk for Run Pytest for src/python/pants/util/strutil_test.py:tests
16:00:34.17 [INFO] Completed: Run Pytest - src/python/pants/util/strutil_test.py:tests succeeded.

✓ src/python/pants/util/strutil_test.py:tests succeeded in 0.29s.
You can then run like so to emulate what is happening ~exactly:
$ /tmp/pants-sandbox-g0DNAk/__run.sh 
============================================================================================================================== test session starts ==============================================================================================================================
collected 17 items                                                                                                                                                                                                                                                              

src/python/pants/util/strutil_test.py .................                                                                                                                                                                                                                   [100%]

----------------------------------------------------------------------------------------- generated xml file: /tmp/pants-sandbox-g0DNAk/src.python.pants.util.strutil_test.py.tests.xml -----------------------------------------------------------------------------------------
============================================================================================================================== 17 passed in 0.08s ===============================================================================================================================
And then you can start hacking on that
__run.sh
script to debug further.
c
Here's my best guess as to what's happening: Pants is setting up a subprocess for determining the dependencies of tests and in that subprocess it's masking the environment variable. This is the same error we had before we unmasked the environment variable, it's now just happening in a subprocess instead of in the original thing, so I'm suspicious that's what's going on.
e
Aha, you are right: https://github.com/pantsbuild/pants/blob/37bf6071b9dea6074176cbc67e97cdb70fccb71d/[…]ackend/python/dependency_inference/parse_python_dependencies.py It looks like that has been broken since inception in November of 2020. All? other Python processes Pants launches use a
PexProcess
or `VenvPexProcess`, and those automatically support
subprocess_environment
env var propagation.
So, the short answer is these GH action canned Pythons won't work until that issue is picked up and fixed. You may need to use pyenv interpreters (https://github.com/gabrielfalcao/pyenv-action) or just massage the machine image to have the Pythons you need pre-installed.
That was a good find @clever-ghost-87030
h
Ooof, sorry for the trouble!
n
I'm facing this problem today, and for irritating unrelated reasons I am not easily able to use pyenv as a replacement for setup-python in my CI runners without some additional pain which I haven't yet got to the bottom of. I'm planning to try to see if @bitter-ability-32190's work to have pants manage its own python runtime might help here, but I'd be interested to know if anybody has other thoughts / if any progress was made on the underlying issue of how these subprocesses are spawned?
v
For those who are suffering from the same issue (using the setup-python action with Pants on a self-hosted GitHub runner): create a folder at
/opt/hostedtoolcache
, append line
AGENT_TOOLSDIRECTORY=/opt/hostedtoolcache
to the
.env
file where the actions runner package is installed and then restart the actions service. I’ve managed to run CI jobs with Pants successfully on our self-hosted runners by applying this workaround.
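The workaround above can be sketched as a small helper. The function name and the example paths are hypothetical, the real commands will likely need root, and the service restart is left as a comment because it depends on how the runner was installed.

```shell
#!/bin/sh
set -eu

# Create the tool cache folder and add AGENT_TOOLSDIRECTORY to the runner's
# .env file, skipping the append if the exact line is already present.
# Arguments: runner install directory, tool cache directory.
configure_tool_cache() {
    runner_dir=$1
    tool_cache=$2
    line="AGENT_TOOLSDIRECTORY=$tool_cache"

    mkdir -p "$tool_cache"
    touch "$runner_dir/.env"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$runner_dir/.env"; then
        printf '%s\n' "$line" >> "$runner_dir/.env"
    fi
}

# Example call (paths are assumptions; adjust to your runner):
#   configure_tool_cache /home/runner/actions-runner /opt/hostedtoolcache
# Afterwards, restart the actions runner service so it re-reads .env.
```

The check before appending keeps the helper safe to run more than once, for example from a provisioning script.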
h
Good tip! Would you be able to send a PR that adds this to docs/markdown/Using Pants/troubleshooting.md ?
v
Sure! I’ll file a pull request and post the link in this thread.
I’ve created a PR which documents this workaround. https://github.com/pantsbuild/pants/pull/18900