How does one setup auth for a `docker_environment`...
# general
s
How does one set up auth for a `docker_environment`? I have a private docker registry and thus set up `docker.env_vars` in pants.toml - using `docker_image` can pull from there just fine. When I try to use `docker_environment` though I get a 500 “no basic auth credentials”. I don’t see anything in the docs about this, or about how to point a `docker_environment` to a local `docker_image` as a workaround.
w
this was recently fixed and cherry-picked to `2.15.x`: sorry about that. will get out new release later today.
you should be able to try out a pre-release of the `2.15.x` branch with `PANTS_SHA=a40b6fa8020aa4d33b8ab959ce493515531aee2c` though (see)
s
awesome! Not awesome there was a bug of course, but glad it wasn’t me using pants wrong 😅 I’ll take a look at the version you linked - thanks!
Ok so I just ran with the sha you linked and it now adds a message
```
14:32:49.67 [INFO] Completed: Pulling Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest` because the image is missing locally.
```
but then still fails on the same 500 “no basic auth credentials”
w
what is the actual content of the “no basic auth credentials” error?
s
```
$ ./pants test ::
14:32:49.67 [INFO] Completed: Pulling Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest` because the image is missing locally.
14:32:49.67 [ERROR] 1 Exception encountered:

Engine traceback:
  in `test` goal
  in Run Pytest - (test/test_foobar.py:MyResolve, environment:manylinux)

Exception: Failed to pull image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: Failed to pull Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: DockerResponseServerError { status_code: 500, message: "Head \"https://<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/v2/public/quay.io/pypa/manylinux_2_24_x86_64/manifests/latest\": no basic auth credentials" }
```
That’s what’s printed to the console (with a few redactions, there is an actual account id used), do you want me to do some sort of verbose/debug logging to get more info?
w
one sec
it looks like the “no basic auth credentials” message is being returned by the server.
hm. so in this case, you’re expecting to be using AWS authentication, and so AWS_PROFILE will need to be set…? https://github.com/awslabs/amazon-ecr-credential-helper/issues/207
if that’s the case, i’ll need to look at this a bit more, because i only includelisted variables that were directly used by the docker client, rather than necessarily any used by plugins
s
Correct - I set up auth in `[docker]` as described in the pants docs and `./pants package ::` says it packages the docker image fine, but the error comes up when I try to use a `docker_environment` with the same image in tests
w
so what are the additional environment variables used in `[docker]`…?
s
Not positive how much of this is used, but I ended up with this for the moment:
```
[docker]
env_vars = ["DOCKER_CONFIG=%(homedir)s/.docker"]
tools = [
    "docker-credential-desktop",
    "docker-credential-ecr-login",
    "docker-credential-osxkeychain",
    "dirname",
    "readlink",
    "python3",
    "cut",
    "sed",
    "bash"
]
```
Mainly grabbed from the docs - I haven’t tried removing some to see what breaks
w
hm. that seems like it should mean that no additional environment variables are necessary (`AWS_PROFILE` isn’t being passed there). @curved-television-6568: do you have any ideas about this one? there aren’t any additional environment variables being propagated to the docker plugin backend by default, are there?
s
Looks like I do also have this:
```
[test]
extra_env_vars.add = [
    'EXTRA_ARGS',
    'CODEARTIFACT_AUTH_TOKEN',
    'AWS_ACCESS_KEY_ID',
    'AWS_SECRET_ACCESS_KEY',
    'EC2_REGION',
    'EC2_ACCOUNT',
    'AWS_DEFAULT_REGION',
    'AWS_AVAILABILITY_ZONE',
    'AWS_TOKEN',
    'PYTHONPATH',
    'PYTHONDONTWRITEBYTECODE',
]
```
Not sure if that affects anything to do with docker though, being in the `[test]` block
r
You need to follow the steps as mentioned here https://github.com/awslabs/amazon-ecr-credential-helper#docker
And you will have to pass inside `[docker].env_vars`
```
'AWS_ACCESS_KEY_ID',
'AWS_SECRET_ACCESS_KEY',
```
w
Mm. Ok. I think that that means https://github.com/pantsbuild/pants/pull/18465 will need further changes to support this type of auth. Will look at that today.
s
@refined-addition-53644 `docker_image` works using auth through `docker-credential-ecr-login` and `~/.docker/config.json`, without setting `AWS_ACCESS_KEY_ID` or `AWS_SECRET_ACCESS_KEY`. Are you saying this cannot be done with `docker_environment`?
r
But those envs are most probably not available inside `docker_environment`. Locally they are
c
seems you’ve figured this one out. and I agree that we likely need to be able to punch holes for just about any env var (perhaps configurable?)
w
mmmm. so. this looks like a brand new issue that probably does not have anything to do with environment variables. rather: `ecr` login uses a credential helper process: https://docs.docker.com/engine/reference/commandline/login/#credential-helper-protocol … and AFAICT, the crate that we’re using does not currently support interacting with them. i’m going to open a new issue for this one.
🙏 1
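For anyone following along, here is a rough sketch of what "interacting with a credential helper" involves, based on the protocol described at the link above: the client runs `docker-credential-<suffix> get`, writes the registry host to stdin, and reads back JSON with `Username` and `Secret` fields. This is not the code Pants uses; the function names are made up for illustration, and the `env` argument is only there to show that the helper sees whatever environment the caller passes in.

```python
from __future__ import annotations

import json
import subprocess


def helper_command(helper_suffix: str) -> list[str]:
    """Build the argv for a helper's `get` command.

    `helper_suffix` is e.g. "ecr-login" or "desktop" (hypothetical input,
    normally taken from ~/.docker/config.json).
    """
    return [f"docker-credential-{helper_suffix}", "get"]


def parse_helper_response(stdout: str) -> tuple[str, str]:
    """Extract (username, secret) from the JSON a helper prints on success."""
    payload = json.loads(stdout)
    return payload["Username"], payload["Secret"]


def get_credentials(
    helper_suffix: str, server: str, env: dict[str, str]
) -> tuple[str, str]:
    """Run the helper with an explicit environment and return its credentials."""
    proc = subprocess.run(
        helper_command(helper_suffix),
        input=server,          # the registry host is sent on stdin
        capture_output=True,
        text=True,
        env=env,               # the helper only sees these variables
    )
    if proc.returncode != 0:
        # Surface stdout/stderr so the helper's own complaint is visible.
        raise RuntimeError(
            f"credential helper failed ({proc.returncode}):\n"
            f"stdout: {proc.stdout}\nstderr: {proc.stderr}"
        )
    return parse_helper_response(proc.stdout)
```

Because the helper is a separate process, anything it needs from the environment has to be present in `env`, which is why a stripped-down environment can make it exit non-zero even when `docker login` has already succeeded in a normal shell.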
s
@witty-crayon-22786 I see you closed that issue based on a PR going in - if I set `PANTS_SHA=b23df09279ee61b06f8e64b65a97f4be17665231` though and run `./pants test ::` I still get this:
```
$ ./pants test ::
14:15:57.22 [INFO] Completed: Pulling Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest` because the image is missing locally.
14:15:57.23 [ERROR] 1 Exception encountered:

Engine traceback:
  in `test` goal

IntrinsicError: Failed to pull image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: Failed to pull Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: Failed to retrieve credentials for server `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com`: Credential helper returned non-zero response code
```
Do your changes require I do anything more to setup docker’s aws auth with pants?
w
hm. is that actually a public repository, or is it a private mirror of a public repository?
s
It’s a private repository
w
and you’ve done the `aws ecr get-login-password` dance to auth recently?
`aws ecr get-login-password | docker login --username AWS --password-stdin $repository` ?
s
yes, and to verify this if I run `docker system prune -a --volumes` to clear out the images and `docker compose up` without using pants, it authenticates and downloads just fine
w
ok… let me get you a build that includes some more debug output. thanks for the report.
s
👍
w
ok, can try `PANTS_SHA=b147c5fe98b8ad39566c31cc51303830901153ed` from https://github.com/pantsbuild/pants/commits/stuhood/debug-docker-auth once the wheels shards go green there (about 30 minutes probably).
s
Just my luck - all of the wheels are done building except the one I need….which hasn’t started 😭
@witty-crayon-22786 the wheel is built and here’s my new logs:
```
$ ./pants test ::
17:37:39.93 [INFO] Completed: Pulling Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest` because the image is missing locally.
17:37:39.94 [ERROR] 1 Exception encountered:

Engine traceback:
  in `test` goal

IntrinsicError: Failed to pull image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: Failed to pull Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: Failed to retrieve credentials for server `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com`: Credential helper returned non-zero response code:
stdout:


stderr:
Failed to fire hook: while creating logrus local file hook: user: Current requires cgo or $USER set in environment
[2023-03-24T21:37:39.934745000Z][docker-credential-desktop][F] user: Current requires cgo or $USER set in environment
[common/pkg/paths.Home()
[       common/pkg/paths/paths.go:108 +0x6d
[common/pkg/paths.Container()
[       common/pkg/paths/user_darwin.go:30 +0x1d
[common/pkg/paths.Data()
[       common/pkg/paths/paths_darwin.go:27 +0x19
[common/pkg/paths.setCurrentDirectory()
[       common/pkg/paths/paths.go:61 +0x1d
[common/pkg/paths.Init(0x0?)
[       common/pkg/paths/paths.go:45 +0x1e
[main.main()
[       common/cmd/docker-credential-desktop/main.go:50 +0x2e
```
w
Alrighty then... needs `$USER` apparently. Will push another edit.
`PANTS_SHA=cb2e850e3f8bbeb83803b563a9a1b21028b79ede` from the same branch. while that builds, you might try with `--no-pantsd` (which should have all environment variables present)
s
That hash worked!
Thanks for jumping on this so fast! 🙏
w
thanks for the report! i’ll clean it up and get it landed on monday.
🛬 1
s
@witty-crayon-22786 did this ever get landed? Setting the sha manually still works, but I tried a couple dev/rc versions and they break for different reasons. Not sure if they're related reasons though - would be helpful to know which version should be working
w
the latest 2.15.x (`2.15.1rc3`) and 2.16.x (`2.16.0rc2`) rcs should contain all known fixes
there are two open issues though, which might describe what you have going on: https://github.com/pantsbuild/pants/issues/18889 and https://github.com/pantsbuild/pants/issues/18915
s
I stand corrected - I was being dumb 😅 The error message explained that I was doing a thing wrong and I mis-read it as something else. Sorry about that!
`2.15.1rc3` is working
w
pheew 😃
s
Hey @witty-crayon-22786, sorry to keep bothering you but I have another issue that I'm hoping is just me being dumb again, or that you've seen before. I'm on `2.15.1` now (I saw that just got released) and it's working great on Mac. I'm now trying to get set up with CI/CD though, which runs on a Linux box, and I'm getting an issue:
```
[2023-05-22T18:11:51.316Z] Exception: Failed to pull image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: Failed to pull Docker image `<ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64:latest`: DockerResponseServerError { status_code: 404, message: "pull access denied for <ACCOUNT_ID>.dkr.ecr.us-east-1.amazonaws.com/public/quay.io/pypa/manylinux_2_24_x86_64, repository does not exist or may require 'docker login': denied: User: arn:aws:sts::<ID>:assumed-role/jenkins-2/i-<ID> is not authorized to perform: ecr:BatchGetImage on resource: arn:aws:ecr:us-east-1:<ACCOUNT_ID>:repository/public/quay.io/pypa/manylinux_2_24_x86_64 because no resource-based policy allows the ecr:BatchGetImage action" }
```
The job has been authenticated with `docker login` and works fine if I pull a docker image using docker, but Pants isn't able to pull it. I tried your previous hash with the extra logging and didn't get anything additional. I did get it working fine using `--no-pantsd`, so my guess is it is a missing environment variable? I haven't been able to figure out which one it might be though. I can use `--no-pantsd` going forward if I have to, but that seems like a bad solution
w
figuring out which environment variable is missing will be necessary, yea. if you could examine `env` and determine what isn’t present here that needs to be, that would help: https://github.com/pantsbuild/pants/blob/ae520ccfc86378dac3bbc651e322c41e8cdfaa05/src/python/pants/pantsd/pants_daemon.py#L39-L52
s
How should I go about white-listing / adding a variable? I ran `printenv` on the CI machine and added every variable name in the list to `[docker].env_vars.add` but it still failed with the same issue. Is there somewhere else I should whitelist environment variables?
w
You can't currently: this would be visual inspection. I'll likely add a facility to include-list variables after this though.
s
ahh, so I'm not being completely dumb then haha unfortunately I'm no expert at docker and its auth methodology - I could make some guesses but they'd be just guesses. I was hoping I could just start with adding everything and eliminate them until it stops working
how difficult would it be to setup my own fork of Pants to run on CI to attempt this? I know when it's on the official repo I can just give it a sha to pull from, is there a similar approach for forks?
r
So the CI has appropriate permissions to pull without pants?
s
yes
and has the appropriate permissions to pull with pants if `--no-pantsd` is used
r
So in that case it's reading some cached credentials? Would pants cache aws login stuff? Thinking out loud.
s
It's running in CICD and has been consistent behavior since last week, so a cache seems unlikely to be the cause
👍 1
I don't know whether pants would try to cache aws login stuff though - Stu would be able to answer that better than I could
@witty-crayon-22786 I've narrowed it down to where removing any other variables makes docker fail without pants, so as far as I can tell this is the minimum:
```
AWS_SESSION_TOKEN
PATH
SHELL
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
```
PATH and SHELL seem obvious (PATH is even in the list you linked me) so the AWS ones seem to be what we want
Given that they're AWS specific, I don't know if it's best to put them in the list you linked or if there's some alternative - a plugin maybe? What do you think is the best way to proceed?
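In case it helps anyone repeat the "start with everything and eliminate until it breaks" narrowing described above, here is a minimal sketch of that loop. It is only an illustration: `minimize_env` and `docker_pull_works` are made-up names, the check simply shells out to `docker pull`, and the result is one minimal set rather than a guaranteed unique one.

```python
import subprocess
from typing import Callable, Dict

Env = Dict[str, str]


def minimize_env(env: Env, still_works: Callable[[Env], bool]) -> Env:
    """Greedily drop variables one at a time, keeping each removal
    only if `still_works` still succeeds without that variable."""
    if not still_works(env):
        raise ValueError("the starting environment does not work at all")
    kept = dict(env)
    for name in sorted(env):
        trial = {k: v for k, v in kept.items() if k != name}
        if still_works(trial):
            kept = trial  # this variable was not needed
    return kept


def docker_pull_works(image: str) -> Callable[[Env], bool]:
    """Hypothetical check: does `docker pull <image>` succeed under `env`?"""

    def check(env: Env) -> bool:
        try:
            result = subprocess.run(
                ["docker", "pull", image],
                env=env,
                capture_output=True,
            )
        except FileNotFoundError:
            # e.g. PATH was removed and the docker binary cannot be found
            return False
        return result.returncode == 0

    return check
```

A run would look something like `minimize_env(dict(os.environ), docker_pull_works(image))` after `import os`, removing the image locally between attempts so that each pull actually has to authenticate.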
w
i started a thread about it in #development… i think that it will require changes to how we invoke the docker client: https://pantsbuild.slack.com/archives/C0D7TNJHL/p1684784924506069
i’ll try to look at it this week.
s
Ah, I didn't see that (not in that channel) - thanks for following up! Please let me know if I can help in any way