# general
j
I am having trouble getting a new CI machine to run a pants test. The tests run successfully when I use the agent on my mac. When I run it on a fully patched Ubuntu 18.04 I get "Code 13: Permission denied":
buildkite-agent@raulcicd:~/builds/raulcicd-chartbeat-net-1/chartbeat/lint-check$ /var/lib/buildkite-agent/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/bin/pants --level=debug version
Scrubbed PYTHONPATH=/home/ubuntu/superfly from the environment.
10:22:04.81 [DEBUG] acquiring lock: <pants.process.lock.OwnerPrintingInterProcessFileLock object at 0x7faddd0b92e8>
10:22:04.81 [DEBUG] releasing lock: <pants.process.lock.OwnerPrintingInterProcessFileLock object at 0x7faddd0b92e8>
10:22:04.81 [DEBUG] connecting to pantsd on port 44973 (attempt 1/3)
Failed to launch child `/var/lib/buildkite-agent/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/bin/pants`: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }
I thought it was apparmor at first, but I think I eliminated that as a cause by successfully running scripts in the same directory.
The `pantsd` daemon is running as buildkite-agent:
buildki+ 31558  0.1  2.3 259972 46740 ?        Sl   10:19   0:00 pantsd [/var/lib/buildkite-agent/builds/raulcicd-chartbeat-net-1/chartbeat/lint-check]
Aaahh. The PID of the `pantsd` it is trying to connect to is different than the one that is running.
Running with `--no-pantsd` works for `pants version`.
👍 1
This might be related or a new problem, but it is having trouble finding the correct interpreters:
Exception message: 1 Exception encountered:

  ProcessExecutionFailure: Process 'Find interpreter for constraints: CPython>=3.6' failed with exit code 102.
stdout:

stderr:
Could not find a compatible interpreter.

Examined the following interpreters:
1.) /usr/bin/python2.7 CPython==2.7.17

No interpreter compatible with the requested constraints was found:
  Version matches CPython>=3.6



(Use --print-stacktrace to see more error details.)
buildkite-agent@raulcicd:~/builds/raulcicd-chartbeat-net-1/chartbeat/lint-check$ python3.6
Python 3.6.9 (default, Oct  8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
From `pants.toml`:
interpreter_constraints = ["CPython>=2.7.17,<3","CPython>=3.6.9,<4"]
interpreter_search_paths = ["<PYENV>", "<PATH>"]
h
It’s fine to not use Pantsd in CI, but Stu might want to ask about that issue.
Hmm, if possible, it’s helpful to run with `-ldebug` to see what Pants is choosing.
e
Since `raulcicd` sees python3.6 (presumably that's on the `PATH` for `raulcicd`) and `buildkite-agent` doesn't, that provides the 1st clue.
👍 1
j
I'm doing tests as the `buildkite-agent` user.
e
Ah yes.
j
Its path includes `/usr/bin/`, which is where `python` (aliased to `python2`), `python2`, `python3` and `python3.6` are all located.
When I run with `--pantsd` I get the dreaded error:
10:38:50.41 [DEBUG] connecting to pantsd on port 44973 (attempt 1/3)
Failed to launch child `/var/lib/buildkite-agent/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/bin/pants`: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }
Not sure if the two failures are related.
One is a failure to connect to the launched `pantsd` and one is the failure to find `python3.6`.
Here is the debug output.
e
Why do you assert "failure to connect"? That reads more like failure to run a binary.
j
Good point. I was using the debug line before as my guide (connecting to...)
e
What are the perms on that file?
j
ts/setup/bootstrap-Linux-x86_64/2.0.0_py36/bin/pants
  File: /var/lib/buildkite-agent/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/bin/pants
  Size: 306       	Blocks: 8          IO Block: 4096   regular file
Device: 10302h/66306d	Inode: 4614039     Links: 1
Access: (0775/-rwxrwxr-x)  Uid: (  999/buildkite-agent)   Gid: (  999/buildkite-agent)
Access: 2020-11-09 23:15:43.856779291 -0500
Modify: 2020-11-09 23:15:43.848783291 -0500
Change: 2020-11-09 23:15:43.848783291 -0500
 Birth: -
Executable for all
let me try as root...
e
Can you `head -1` that "binary"?
j
root seemed to be able to run it but got `no BUILDROOT` type errors.
`/var/lib/buildkite-agent/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/bin/pants` is not a binary.
#!/var/lib/buildkite-agent/.cache/pants/setup/bootstrap-Linux-x86_64/pants.0KECIN/install/bin/python3.6
# -*- coding: utf-8 -*-
import re
import sys
from pants.bin.pants_loader import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())
e
Thus my quotes.
j
oh. 🙂
e
And so is that shebang binary executable?
j
yes
drops me into a python3 (3.6.9) repl
I'm doing all these tests as the `buildkite-agent`.
I have not made any changes to apparmor at this point. But I did try to turn it off yesterday when I was troubleshooting and that did not help.
e
Dropping back a bit. It sounds like the binaries needed are on PATH. Is there an intent to use PYENV in ci / are pyenv interpreters installed on ci machines?
j
No. This is an ubuntu box and I have not installed pyenv.
e
Ok, so that config is noisy for ci but harmless.
👍🏽 1
j
Our automation installed python2.7 and python3.6-dev.
e
Can you import distutils in that python3.6?
Ah, that's not it.
j
yes
e
Was thinking of https://github.com/pantsbuild/pex/issues/1027 which presents differently.
j
strange
in the debug line where it says, "searching for python3" it is then followed by a test of python2
must be parallel process stuff. I see where it completes the python3 tests.
h
Yeah, we look for any of `python`, `python2`, and `python3` for the interpreter to run Pex, and test that they’re valid binaries. That all happens in parallel.
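For illustration only, a probe like the one described could be sketched as follows. This is not Pants' actual rule code; `probe_interpreter` and `find_candidates` are hypothetical names, and the sketch assumes nothing beyond the Python standard library.

```python
import shutil
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# One-liner that works on both Python 2 and Python 3 interpreters.
_VERSION_SNIPPET = "import sys; print('.'.join(map(str, sys.version_info[:3])))"


def probe_interpreter(path):
    """Return the interpreter's version string, or None if it cannot be run."""
    try:
        result = subprocess.run(
            [path, "-c", _VERSION_SNIPPET],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers missing files, permission errors, timeouts and non-zero exits.
        return None
    return result.stdout.decode().strip()


def find_candidates(names=("python", "python2", "python3")):
    """Resolve each name on PATH and probe the hits concurrently (hypothetical helper)."""
    paths = [p for p in (shutil.which(n) for n in names) if p]
    with ThreadPoolExecutor() as pool:
        versions = list(pool.map(probe_interpreter, paths))
    return {p: v for p, v in zip(paths, versions) if v is not None}


if __name__ == "__main__":
    for path, version in find_candidates().items():
        print(path, version)
```

Because the probes run concurrently, their debug lines can interleave, which would explain a "searching for python3" line being followed by a python2 test.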
j
but if it is finding the python3 interpreter, how come it fails later saying "No interpreter compatible with the requested constraints was found: Version matches CPython>=3.6"?
And is this two different issues or one?
h
Cool, you’re right that Pants is able to find an interp to run Pex with. There are two different contexts where we look for an interpreter:
1. what to run the Pex CLI tool with, which can be any valid 2.7 or 3.5+ interp
2. what to use for your own code, which we delegate to Pex because it understands things like interpreter constraints
#1 is working as we’d expect, but #2 is not for some reason. That error message you’re getting is coming from Pex. And it looks like Pants is properly setting everything for Pex, so I suspect there could be an issue with Pex itself.
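To make the constraint side concrete, here is a minimal sketch of how a version could be checked against constraints like the ones in this thread. It is a simplified stand-in, not Pex's implementation: `satisfies` and `matches_any` are hypothetical helpers that handle only plain numeric versions and the basic comparison operators. Specifiers inside one string are ANDed, and the entries of the list are ORed.

```python
import re

_OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}
_SPEC = re.compile(r"\s*(>=|<=|==|!=|>|<)\s*([\d.]+)\s*")


def _parse_version(text, width=3):
    """Turn "3.6" into (3, 6, 0) so that "3" compares like "3.0.0"."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, constraint):
    """True if `version` meets every comma-separated specifier in `constraint`."""
    specs = constraint.replace("CPython", "", 1)
    candidate = _parse_version(version)
    for spec in specs.split(","):
        match = _SPEC.fullmatch(spec)
        if match is None:
            raise ValueError("unsupported specifier: %r" % spec)
        op, bound = match.groups()
        if not _OPS[op](candidate, _parse_version(bound)):
            return False
    return True


def matches_any(version, constraints):
    """True if `version` satisfies at least one constraint in the list."""
    return any(satisfies(version, c) for c in constraints)


if __name__ == "__main__":
    constraints = ["CPython>=2.7.17,<3", "CPython>=3.6.9,<4"]
    print(matches_any("3.6.9", constraints))
    print(satisfies("2.7.17", "CPython>=3.6"))
```

Under this sketch, 2.7.17 fails `CPython>=3.6` while 3.6.9 passes, so the error only makes sense if the 3.6 interpreter was never examined at all.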
e
@jolly-midnight-72759 can you gather debug output again but this time with `--pex-verbosity=9`?
h
John, that won’t work, unfortunately. It’s not wired up properly. I’m pushing the cherry-pick of the fix right now.
j
So no need for `pex-verbosity`?
e
Aha. Ok, @jolly-midnight-72759 do you have the ability to download pex on that machine? Say via curl?
j
I do. This is a POC box so I can do ANYTHING bwahahahah. But I need to step AFK for about 90 min.
e
K, I'll write up what you can try when you get back.
Ah, nm. I'm always too slow to pull out the right tool for this sort of debugging. With this Dockerfile:
FROM ubuntu:18.04

RUN apt update && apt upgrade -y && apt install -y locales language-pack-en
ENV LANG=en_US.UTF-8 LANGUAGE=en_US:en LC_ALL=en_US.UTF-8

RUN apt install -y curl python python3-dev git vim build-essential unzip tar
I reproduce using:
$ docker run --rm -it raul:3.6
root@f1a736edc580:/# git clone https://github.com/pantsbuild/example-python
Cloning into 'example-python'...
remote: Enumerating objects: 329, done.
remote: Total 329 (delta 0), reused 0 (delta 0), pack-reused 329
Receiving objects: 100% (329/329), 74.15 KiB | 1.65 MiB/s, done.
Resolving deltas: 100% (197/197), done.
root@f1a736edc580:/# cd example-python
root@f1a736edc580:/example-python# ./pants fmt lint typecheck test ::
...
New virtual environment successfully created at /root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36.
17:11:46.96 [INFO] initializing pantsd...
17:11:47.83 [INFO] pantsd initialized.
17:11:48.02 [INFO] No pyenv binary found. Will not use pyenv interpreters.
17:11:54.18 [INFO] Completed: Building docformatter.pex with 1 requirement: docformatter>=1.3.1,<1.4
17:11:54.53 [INFO] Completed: Format with docformatter - made no changes.
17:11:54.88 [ERROR] Exception caught: (pants.engine.internals.scheduler.ExecutionError)
  File "/root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/lib/python3.6/site-packages/pants/bin/local_pants_runner.py", line 289, in run
    engine_result = self._run_v2()
  File "/root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/lib/python3.6/site-packages/pants/bin/local_pants_runner.py", line 195, in _run_v2
    return self._maybe_run_v2_body(goals, poll=False)
  File "/root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/lib/python3.6/site-packages/pants/bin/local_pants_runner.py", line 217, in _maybe_run_v2_body
    poll_delay=(0.1 if poll else None),
  File "/root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/lib/python3.6/site-packages/pants/init/engine_initializer.py", line 127, in run_goal_rules
    goal_product, params, poll=poll, poll_delay=poll_delay
  File "/root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/lib/python3.6/site-packages/pants/engine/internals/scheduler.py", line 569, in run_goal_rule
    self._raise_on_error([t for _, t in throws])
  File "/root/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0_py36/lib/python3.6/site-packages/pants/engine/internals/scheduler.py", line 539, in _raise_on_error
    wrapped_exceptions=tuple(t.exc for t in throws),

Exception message: 1 Exception encountered:

  ProcessExecutionFailure: Process 'Find interpreter for constraints: CPython>=3.6' failed with exit code 102.
stdout:

stderr:
Could not find a compatible interpreter.

Examined the following interpreters:
1.) /usr/bin/python2.7 CPython==2.7.17

No interpreter compatible with the requested constraints was found:
  Version matches CPython>=3.6



(Use --print-stacktrace to see more error details.)
root@f1a736edc580:/example-python#
💯 2
j
If it is reproducible do you want me to open an issue?
e
I can take it from here @jolly-midnight-72759 and open the issue. Thank you.
✔️ 1
OK - this is a latent Pex bug triggered by Pants' recent switch to using `--python-path`: https://github.com/pantsbuild/pex/issues/1109
h
Ah ha! Benjy was onto something when he found that part concerning. We agreed to not touch it in the PR adding `--python-path` to reduce # of changes.
j
w00t! 🦉
👖 1
e
Ok, found those comments @hundreds-father-404 and responded. Things cannot be simplified as suggested, so I cleared that up.
j
Is this also the root cause of the `PermissionDenied` error which started me down this path?
And the reason I don't see this bug on my mac is because `<PYENV>` comes before `<PATH>`, yes?
h
I don’t suspect they’re related. cc @witty-crayon-22786 on the PermissionDenied, which only triggers when `--pantsd` is used.
j
So let's start a new thread for `PermissionDenied`. 🛒
👍 1
h
And the reason I don’t see this bug on my mac is because <PYENV> comes before <PATH>, yes?
The ordering doesn’t matter for `interpreter_search_paths`. It works on your mac because you have interpreters in two distinct directories, so you’re not hitting this edge case where the sibling interpreters to the current interpreter are not considered. The issue is that CI has everything in a single directory (which, again, should be fine; this is a Pex bug).
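One way to check whether a machine is in the "everything in a single directory" situation is to group the interpreters on the search path by directory. The sketch below is a hypothetical diagnostic, not part of Pants or Pex; `interpreters_by_directory` is an illustrative name, and the filename pattern is an assumption about how interpreters are usually named.

```python
import os
import re
from collections import defaultdict

# Matches python, python2, python3, python2.7, python3.6, ...
_PYTHON_NAME = re.compile(r"^python(\d+(\.\d+)?)?$")


def interpreters_by_directory(search_path=None):
    """Map each search-path directory to the python* executables found in it."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    found = defaultdict(list)
    for directory in search_path.split(os.pathsep):
        if not directory or directory in found:
            continue
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            # Missing or unreadable directory: nothing to report for it.
            continue
        for name in names:
            full = os.path.join(directory, name)
            if (
                _PYTHON_NAME.match(name)
                and os.path.isfile(full)
                and os.access(full, os.X_OK)
            ):
                found[directory].append(name)
    return dict(found)


if __name__ == "__main__":
    for directory, names in interpreters_by_directory().items():
        print(directory, names)
```

If the result has a single key (for example only `/usr/bin`) holding both the 2.7 and 3.6 binaries, the machine matches the layout described above.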
e
@jolly-midnight-72759 instead of a new thread for perm denied, how about giving us a Dockerfile we can repro that with?
j
I'm running this on an EC2.
e
It's going to be awfully far afield and titchy to figure that one out without a repro case.
j
Ok. Let me see what I can do.
❤️ 1
Unfortunately the `-ldebug` output only shows what I put in the first post of this thread.
let me see if I can reproduce with the example-repo.
yup
so it is probably something to do with my ec2 box
👍 1
w
(i’m low context here… going to bow out of the debugging unless you folks want to pull me back in with a ticket or repro)
e
Yeah. The perm denied is either lack of privilege to fork/exec or lack of privilege to the file you're trying to execute. Not much wiggle room outside that.
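A rough way to test the file-privilege half of that is to check the launcher script, the interpreter named in its shebang, and every directory above them. The sketch below is a hypothetical diagnostic using only `os.access`; `shebang_interpreter`, `exec_blockers`, and `diagnose` are illustrative names. It does not cover other causes such as `noexec` mounts or AppArmor policy, and it should be run as the same user that sees the error.

```python
import os


def shebang_interpreter(script_path):
    """Return the interpreter path from a script's #! line, or None if absent."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    return parts[0] if parts else None


def exec_blockers(path):
    """List permission problems that could stop the current user executing `path`."""
    problems = []
    # Every ancestor directory needs search (x) permission.
    parent = os.path.dirname(os.path.abspath(path))
    while True:
        if not os.access(parent, os.X_OK):
            problems.append("cannot traverse directory: " + parent)
        up = os.path.dirname(parent)
        if up == parent:
            break
        parent = up
    if not os.path.exists(path):
        problems.append("missing: " + path)
    elif not os.access(path, os.X_OK):
        problems.append("not executable: " + path)
    return problems


def diagnose(script_path):
    """Check a script and, if it has a shebang, the interpreter it points at."""
    problems = exec_blockers(script_path)
    if os.path.isfile(script_path):
        interpreter = shebang_interpreter(script_path)
        if interpreter is not None:
            problems.extend(exec_blockers(interpreter))
    return problems


if __name__ == "__main__":
    import sys

    for problem in diagnose(sys.argv[1]):
        print(problem)
```

An empty result from `diagnose` on the `bin/pants` launcher would point away from file permissions and toward the fork/exec side.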
j
Sweet. I was able to reproduce with your `Dockerfile`.