# general
p
Hey all, having some trouble getting `pants publish` to work for Docker images being published to an AWS ECR repo. I think I got all the environment variables and `$PATH` components wired in correctly, but no dice. I followed the recommendation about using `env -i` and actually got that to work, but it's not working through `pants publish` for some reason. I set `--level=trace` in my command, and it seems like I'm missing a ton of info about how `docker push` is getting executed (i.e. no indication of the sandbox directory, command args, etc. in the log output). Any ideas?
The output error message is `no basic auth credentials`; the image builds fine. So for some reason it seems like an auth issue, but I'm stumped as to why my config file setup isn't wiring in the required environment variables.
Relevant config section:
[docker]

default_repository = "build-system-demo-pants"

env_vars = [
    # "DOCKER_CONFIG=build_support/docker/config",
    "AWS_ACCESS_KEY_ID=<REDACTED>",
    "AWS_SECRET_ACCESS_KEY=<REDACTED>",
    "AWS_ECR_CACHE_DIR=/Users/me/code/build-system-demo/ecr-cache",
    "DOCKER_CONFIG=/Users/me/code/build-system-demo/build_support/docker/config"
]

tools = [
    "docker-credential-ecr-login",
    "sh",
]
f
add `--no-process-cleanup` to preserve the execution sandbox
p
@fast-nail-55400 I did that but it doesn't print any information about an execution sandbox for the push operation
I see this for the build operation:
18:49:11.34 [INFO] Preserving local process execution dir /private/var/folders/1r/ndd87ylx3097l5pjbj1pgsch0000gp/T/process-executioniMMn2k for "Building docker image 137296740171.dkr.ecr.us-west-2.amazonaws.com/build-system-demo-pants:latest"
And if I go there, the `__run.sh` script just has the build commands. There's no similar output or script for the push operation.
f
ah, publish uses an interactive process (via `InteractiveProcess` in Pants code), which means it runs differently than normal execution sandbox processes
basic question: I assume ecr-login is configured as a credential helper?
p
Yeah, here are the contents of the config file:
{
  "credHelpers": {
    "<account-id>.dkr.ecr.us-west-2.amazonaws.com": "ecr-login"
  }
}
h
cc @curved-television-6568 on this one, sorry Andreas! 🙂
c
So, troubleshooting this is a little involved, but doable. And it seems @proud-appointment-36730 managed to get it going with `env -i`. So my question then would be: what env vars were used in that case? The exact same set of env vars as listed in the `pants.toml` file? What did you provide as `PATH`? I think a key mistake here may be pointing `PATH` directly at a directory that holds the binary alongside other binaries, so those are “leaked” into the sandboxed environment. To test this properly, you need a temporary `bin` folder into which you link all your binaries/tools, and then point your `PATH` to that temp `bin` so that the only things visible on the path are explicitly those linked files and nothing else.
Also, this is relevant: https://github.com/pantsbuild/pants/issues/14596#issuecomment-1142430052 Seems like it would be worthwhile to implement.
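The isolated-`PATH` test described above can be sketched roughly as follows. This is only an illustration: the temp directory is created by `mktemp`, the tool names are taken from the `[docker].tools` list in this thread, and tools that are not installed are skipped with a warning.

```shell
#!/bin/sh
# Sketch: build a PATH directory that exposes only explicitly linked tools.
set -eu

ISOLATED_BIN="$(mktemp -d)"   # hypothetical temp dir holding nothing but symlinks

# Link only the tools under test (names mirror [docker].tools from the thread).
for tool in sh docker-credential-ecr-login; do
  src="$(command -v "$tool" || true)"
  if [ -n "$src" ]; then
    ln -s "$src" "$ISOLATED_BIN/$tool"
  else
    echo "warning: $tool not found on the current PATH, skipping" >&2
  fi
done

# Start from an empty environment and expose only the isolated PATH.
visible="$(env -i PATH="$ISOLATED_BIN" sh -c 'echo "$PATH"')"
echo "PATH seen inside: $visible"
ls -l "$ISOLATED_BIN"
```

Anything the command under test needs beyond these links will then fail to resolve, which is the point of the exercise.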
p
@curved-television-6568 I did use the same exact set of env vars, and I did the temporary symlinking of binaries into a temporary folder. Let me send you my command line invocation, one moment
Script for testing outside of pants:
#!/bin/bash

echo "<account-id>.dkr.ecr.us-west-2.amazonaws.com" | env -i \
  PATH=/tmp/path-isolated \
  AWS_ECR_CACHE_DIR=/Users/kyle/kairos/build-system-demo/ecr-cache \
  AWS_ACCESS_KEY_ID=<REDACTED> \
  AWS_SECRET_ACCESS_KEY=<REDACTED> \
  DOCKER_CONFIG=/Users/kyle/kairos/build-system-demo/build_support/docker/config \
  docker-credential-ecr-login get
Contents of `/tmp/path-isolated`:
$ l /tmp/path-isolated
total 0
lrwxr-xr-x  1 kyle  wheel    42B Jun 22 17:59 docker-credential-ecr-login@ -> /usr/local/bin/docker-credential-ecr-login
lrwxr-xr-x  1 kyle  wheel     7B Jun 22 17:59 sh@ -> /bin/sh
Oh, I didn't provide a `PATH` environment variable in my `pants.toml` file though
c
> Oh, I didn't provide a `PATH` environment variable in my `pants.toml` file though
You should get that from the `tools` section, so that's not the issue.
However, adding `PATH` to the env vars would blow a big hole in the sandboxing but ought to work, as a workaround for now if nothing else.
I'll see if I can learn anything about the above.
Huh, I'm stumped. Given the above works, I can't see why it wouldn't when executed from Pants. Will have to try this with a live connection to AWS to get any further. Well investigated, Kyle 👍
p
Thanks! Yeah, I'm stumped as well. Is there any way to hook into the environment that the `docker push` command is running in and interactively debug? Or could I run some one-off shell commands in the environment the `docker push` command is running in to inspect it? I took a look at the Pants code; what's the reasoning for using an `InteractiveProcess` to run the `docker push`? The fact that I can't use `--no-process-cleanup` to drop into the environment and figure out what's going on is making this a little tricky.
c
Yeah, I've missed the sandbox introspection aspect a few times as well. I can't really recall the real reason for going with `InteractiveProcess` for this atm; perhaps @witty-crayon-22786 has an answer for that.
But what it does is basically just set up a stripped-down env, plus some trickery to set up shim scripts for the tools to have on `PATH`. That's it. So the debugging setup you've done already is as close to it as I know.
Adding `PATH` to the `[docker].env_vars` ought to punch through any obstacles though… to get unblocked for now, if acceptable.
f
> Is there any way to hook into the environment that the `docker push` command is running in and interactively debug?
It will be more convoluted, but you can, by modifying the Pants source for the rules to add the options you need. So maybe increase `docker`'s log level for the push to get more info. Check out a copy of Pants in a sibling directory to your repo and add a `pants_from_sources` script from one of the pantsbuild/example-* repos on GitHub. Then modify https://github.com/pantsbuild/pants/blob/adc76b4dd0b85feb6fbfcb6ed735e292244fe3ff/src/python/pants/backend/docker/util_rules/docker_binary.py#L96 to add the `--log-level=debug` option to the push command line. Then run `./pants_from_sources` in your repo instead of `./pants`.
(We should probably add a Pants option to allow setting arguments on the `docker push` command line used. But for now, the above procedure will at least get you more information.)
c
Actually, to capture/modify what options are used, the env, etc., you could create a `docker` shim and put that on your path so Pants picks it up instead of the real docker binary. That way you can debug everything via that shim.
using `PANTS_DOCKER_EXECUTABLE_SEARCH_PATHS`.
f
good idea. no need for modifying Pants source then.
(given I work on Pants, too easy for me to just start editing Pants …)
p
Both ways honestly seem great. One thing I noticed is I don't think `pants publish` is actually using the environment variables in my config file. If it were, it would be writing to the log file in `/Users/kyle/kairos/build-system-demo/ecr-cache`. When using the `env -i` approach, I can see logs get streamed in if I `tail -f` the log file. When using Pants, no logs arrive.
c
Oh..
p
My config snippet is at the top of this thread, is there a formatting issue there or something?
c
looks good. I'll run some tests…
`env_vars` seems to be picked up, verified using the docker shims technique:
$ PANTS_DOCKER_EXECUTABLE_SEARCH_PATHS='["/Users/x/src/tmp/shims"]' PANTS_DOCKER_ENV_VARS='["foo=bar", "LOGNAME"]' ./pants publish testprojects/src/python/docker:test-example
17:21:58.70 [INFO] Starting: Building docker image test-example:1.2.5
17:21:58.71 [INFO] Canceled: Building docker image test-example:1.2.5
17:21:59.10 [INFO] Starting: Building docker image test-example:1.2.5
17:21:59.12 [INFO] Completed: Building docker image test-example:1.2.5
17:21:59.13 [INFO] Built docker image: test-example:1.2.5
DOCKER SHIM!!
Env:
PWD=/private/tmp/.tmpBzbEdb
foo=bar
SHLVL=1
LOGNAME=x
_=/usr/bin/env
exec: docker push test-example:1.2.5

✓ test-example:1.2.5 published.
And in `…/shims/docker`:
#!/bin/bash

echo "DOCKER SHIM!!"
echo "Env:"
/usr/bin/env
echo "exec: docker $*"
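The shim above only prints what it receives and never runs the real binary. A variant that records the call and then forwards it might look like the sketch below. The shim directory, the `REAL_DOCKER` variable and its `/usr/local/bin/docker` default, and the `docker-shim.log` file name are all illustrative assumptions, not anything Pants defines. Note that the log will contain whatever credentials are in the environment.

```shell
#!/bin/sh
# Sketch: generate a logging `docker` shim that forwards to the real binary.
set -eu

SHIM_DIR="${SHIM_DIR:-$(mktemp -d)}"   # hypothetical shim dir, kept outside the repo

cat > "$SHIM_DIR/docker" <<'EOF'
#!/bin/sh
# Record the environment and arguments next to the shim, then hand off.
log="${0%/*}/docker-shim.log"
{
  echo "--- docker shim invoked ---"
  /usr/bin/env
  echo "exec: docker $*"
} >> "$log"
# REAL_DOCKER is an assumed override; the default path is a guess for your install.
exec "${REAL_DOCKER:-/usr/local/bin/docker}" "$@"
EOF
chmod +x "$SHIM_DIR/docker"

echo "Shim written to $SHIM_DIR/docker"
```

Pointing `PANTS_DOCKER_EXECUTABLE_SEARCH_PATHS` at the shim directory, as in the command above, should then leave a record of each invocation in the log file while the push still happens.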
p
Huh, your command isn't working for me. It says it can't find the executable (I changed the path to the correct location). I'm running pants in ZSH, could this be some weird quirk of my shell?
c
Huh, no idea. I'm not familiar with zsh. You can put those config options in your pants.toml however… just me that took a shortcut to avoid polluting it…
p
Huh, working now. It didn't like the shim being in my repo, I guess
Interestingly, publishing images works if I remove `[docker].tools` entirely from my config
c
That was… counterintuitive. Also, which version of Pants are you using?
p
$ ./pants --version
12:17:53.96 [INFO] Initializing scheduler...
12:17:54.28 [INFO] Scheduler initialized.
2.11.0
Thanks so much everyone for all your help. To tie off this loose end, here is the config that ended up successfully publishing images to ECR for me:
[docker]

default_repository = "build-system-demo-pants"

env_vars = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "DOCKER_CONFIG='%(buildroot)s/build_support/docker/config'",
    "AWS_ECR_CACHE_DIR='%(buildroot)s/build_support/docker/ecr-cache'",
]

tools = [
    "sh",
]
A gotcha that I think is very interesting to note is that doing the following does not work:
[docker]

default_repository = "build-system-demo-pants"

env_vars = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "DOCKER_CONFIG='%(buildroot)s/build_support/docker/config'",
    "AWS_ECR_CACHE_DIR='%(buildroot)s/build_support/docker/ecr-cache'",
]

tools = [
    "sh",
    "docker-credential-ecr-login",
]
So for some reason shimming in the credential helper in the case of ECR breaks things, and docker can't find the credentials it needs.
Another interesting note: if you set the environment variable `PANTS_DOCKER_EXECUTABLE_SEARCH_PATHS="['/Users/kyle/shims/']"`, then Pants will look for all the values in the `[docker].tools` part of the config in `/Users/kyle/shims`. Not sure if this is intended behavior or not; just wanted to point it out for anyone else who is trying to get ECR stood up and/or debug docker commands and stumbles on this thread.
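The working config earlier in this thread lists `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` by name only, which means their values are copied from the environment Pants itself runs in. A small pre-flight check along these lines can catch the case where they were never exported. The helper name `check_forwarded_vars` is made up for this sketch.

```shell
#!/bin/sh
# Sketch: fail fast if variables forwarded by name are not exported.
set -eu

check_forwarded_vars() {
  missing=""
  for name in "$@"; do
    # Only exported variables show up in `env`, which is what a child process sees.
    if ! env | grep -q "^${name}="; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "not exported:$missing" >&2
    return 1
  fi
}

if check_forwarded_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; then
  echo "credentials are exported; ready to run ./pants publish"
else
  echo "export the variables listed above before running ./pants publish"
fi
```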
c
Thanks Kyle! Regarding the search paths, that is as intended. (You could add `"<PATH>"` at the end there to have it do a wider search, only looking in the shims dir first.)
h
Glad it's working! Thank you @curved-television-6568 for helping 💜 To clarify, are there changes we should make to Pants? Like bug fixes, or better documentation?
c
I think we could at least investigate why it didn't work with the docker credential helper listed in `[docker].tools`, but started working without.
h
Cool. What would be helpful there? File a ticket?
c
Yeah, that could be worthwhile to have a ticket for, to help persist the history and future work.
p
I think a little blurb in the docs about this weird edge case for ECR would be helpful too. I stuck it in `[docker].tools` because the docs (which have the Google credential helper as an example) have it in there. Maybe explaining the edge case for ECR, or recommending to start with it empty and only add things in as needed, would be a helpful addition?
f
Would you want to submit a PR (as docs are in-repo now)? `docs/markdown/Docker/`
p
totally I'll throw that on my to-do list
My approach that worked locally doesn't work in CI. I've given up on the credential helpers and am employing a workaround that writes the credentials to the config file in CI. Not sure what the deal with the ECR credential helper is 🤷
f
I am having the same issue
f
Was there anything in the ecr-login log directory? There should be a file called `ecr-login.log` under the `logs` directory in the ECR cache dir. (As suggested by the source code at https://github.com/awslabs/amazon-ecr-credential-helper/blob/main/ecr-login/config/log.go.)
p
If you run it through Pants, iirc it doesn't write to that file. I'm not sure why, other than some vague notions about Pants stripping out environment variables that the credential helper relies on for determining where to write logs to
f
I suggest explicitly setting `AWS_ECR_CACHE_DIR` in `[docker].env_vars`. Then the log should go in the `logs` subdirectory of that directory.
Or the default should be `~/.ecr/logs`, but that would require `HOME` to be passed into the execution sandbox so the code can expand the path. https://github.com/awslabs/amazon-ecr-credential-helper/blob/e6f29200ae0450ba6584aee3041e2527e4ce1873/ecr-login/config/cache_dir.go#L22
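To check quickly whether the helper logged anything, the path rules described above can be turned into a small lookup. This is a sketch based only on what is stated in this thread (`logs/ecr-login.log` under the cache dir, defaulting to `~/.ecr`). The function name is made up, and the real helper may resolve paths differently.

```shell
#!/bin/sh
# Sketch: work out where ecr-login is expected to write its log.
set -eu

ecr_login_log_path() {
  # Assumption from the thread: the cache dir is AWS_ECR_CACHE_DIR if set,
  # otherwise ~/.ecr, and the log sits at logs/ecr-login.log beneath it.
  cache_dir="${AWS_ECR_CACHE_DIR:-${HOME:-}/.ecr}"
  echo "$cache_dir/logs/ecr-login.log"
}

log_file="$(ecr_login_log_path)"
if [ -f "$log_file" ]; then
  tail -n 20 "$log_file"
else
  echo "no log at $log_file (was AWS_ECR_CACHE_DIR or HOME passed through?)"
fi
```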
p
Yeah I thought I tried that with an explicit absolute path and didn't have much luck. I could have been missing something in my config though