I'm receiving an error when running `docker_image`...
# general
s
I'm receiving an error when running `docker_image` targets:
Failed to fire hook: while creating logrus local file hook: user: Current requires cgo or $USER, $HOME set in environment
[2023-04-18T20:58:17.637886000Z][docker-credential-desktop][F] get system info: exec: "sw_vers": executable file not found in $PATH
[goroutine 1 [running, locked to thread]:
[common/pkg/system.init.0()
[       common/pkg/system/os_info.go:32 +0x1bc
#3 ERROR: rpc error: code = Unknown desc = error getting credentials - err: exit status 1, out: ``
#5 [auth] sharing credentials for 123456789.dkr.ecr.us-east-1.amazonaws.com
#5 sha256:00484c1754ca42cdfd60e0asdf234asdf8a41a33b1f30d70412c4254
#5 ERROR: error getting credentials - err: exit status 1, out: ``
------
 > [internal] load metadata for 123456789.dkr.ecr.us-east-1.amazonaws.com/databricks-base:0.0.2:
------
------
 > [auth] sharing credentials for 123456789.dkr.ecr.us-east-1.amazonaws.com:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: rpc error: code = Unknown desc = error getting credentials - err: exit status 1, out: ``
I've run into this error before and found that adding
[docker]
env_vars = [
  "HOME",
  "USER",
  "PATH",
]
fixed the error, but now the error has reappeared even though I still have `env_vars` configured under `[docker]` in `pants.toml`. any ideas on how I can squash this?
I've confirmed that USER and HOME are set in the environment, and `sw_vers` seems to be available on my PATH, but obviously these are not getting passed into the pants process. I've also tried running it like
USER=zach HOME=/Users/zach pants package src/docker/pipeline_image:docker
with no effect
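For reference, the checks described above (USER and HOME set, `sw_vers` resolvable on PATH) can be repeated from the same shell with a small script. This is only a rough sketch: the helper name `check_docker_env` is made up for illustration and is not part of Pants, and it inspects the calling shell's environment, not what Pants forwards to `docker`.

```python
import os
import shutil


def check_docker_env(env_names=("HOME", "USER", "PATH"), tools=("sw_vers",), environ=None):
    """Report which variables are set (non-empty) and which tools resolve on PATH.

    `check_docker_env` is a hypothetical helper, not a Pants API.
    """
    environ = os.environ if environ is None else environ
    report = {}
    for name in env_names:
        # Treat an empty value the same as an unset variable.
        report[name] = bool(environ.get(name))
    for tool in tools:
        # Look the tool up on the PATH taken from the inspected environment.
        report[tool] = shutil.which(tool, path=environ.get("PATH")) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_docker_env().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

If everything prints `ok` here and the build still fails, the values are being lost somewhere between the shell and the `docker` process.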
b
Hm, one of our devs gets that too. Unfortunately I don't think we've resolved it yet. What is the host computer? macOS or Linux? x86 or arm?
s
docker push seems to work okay
failing `pants publish` commands have the error I posted above, and also emit `no basic auth credentials` in the logs. so it definitely feels like auth isn't getting correctly passed through
b
hm, we're not pulling or pushing from private repos on our dev machines, so what we're hitting might be different
s
as I go down that rabbit hole a bit I'm realizing that may be a side effect of a different problem - trying a very simple `docker_image` target with a Dockerfile that just pulls `python:3.9-slim` and does nothing else is failing with the same issue on `pants package`
interestingly I have a couple of `docker_image` targets that will still work with `pants package`, still trying to figure out what's different about those
would the correct interpretation of this log message be that environment variables defined in my `pants.toml` file aren't getting passed to the underlying `docker build` call?
spawned local process as Some(88403) for Process { argv: ["/usr/local/bin/docker", "build", "--pull=False", "--tag", "12345667.dkr.ecr.us-east-1.amazonaws.com/test_image:latest", "--file", "src/docker/another/Dockerfile", "."], env: {"__UPSTREAM_IMAGE_IDS": ""}, working_directory: None, input_digests: InputDigests { complete: DirectoryDigest { digest: Digest { hash: Fingerprint<4bdb5dc82ac35992d61254127dd65421235d2e5c32dd9b87d8084144190c76>, size_bytes: 77 }, tree: "Some(..)" }, nailgun: DirectoryDigest { digest: Digest { hash: Fingerprint<e3b0c4429f8fc1c149afbf4c8996fb12441e4649b934ca495991b7852b855>, size_bytes: 0 }, tree: "Some(..)" }, input_files: DirectoryDigest { digest: Digest { hash: Fingerprint<4bdb5dc82ac35992fd67ed341274d123e5d2e5c32dd9b87d8084144190c76>, size_bytes: 77 }, tree: "Some(..)" }, immutable_inputs: {}, use_nailgun: {} }, output_files: {}, output_directories: {}, timeout: None, execution_slot_variable: None, concurrency_available: 0, description: "Building docker image 123454560.dkr.ecr.us-east-1.amazonaws.com/test_image:latest", level: Info, append_only_caches: {}, jdk_home: None, platform: Macos_arm64, cache_scope: PerSession, execution_strategy: Local, remote_cache_speculation_delay: 0ns }
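The `env:` field of that log line lists only `__UPSTREAM_IMAGE_IDS`, so HOME, USER and PATH do not appear in what was handed to this particular `docker build` process. A rough sketch for pulling the variable names out of such a line is below. The function names are hypothetical, and the regex assumes the debug format shown above, with no `}` inside any value.

```python
import re


def process_env_names(log_line):
    """Return the variable names in the `env: {...}` field of a Process debug line.

    Hypothetical helper; returns None when no `env: {...}` field is found.
    """
    match = re.search(r"\benv: \{(.*?)\}", log_line)
    if match is None:
        return None
    # Keys are the quoted strings immediately followed by a colon.
    return set(re.findall(r'"([^"]+)":', match.group(1)))


def missing_env_names(log_line, expected=("HOME", "USER", "PATH")):
    """Return the expected names that are absent from the process env."""
    names = process_env_names(log_line) or set()
    return set(expected) - names


if __name__ == "__main__":
    line = 'Process { argv: ["/usr/local/bin/docker", "build"], env: {"__UPSTREAM_IMAGE_IDS": ""}, working_directory: None }'
    print(sorted(missing_env_names(line)))
```

This only reads what the log line reports; it says nothing about why the variables were dropped.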
wowzers what a trip. I finally resolved this after messing with it for most of the day and evening yesterday. not sure exactly what the underlying root cause was, but I started somewhat randomly removing things to try to get the repo back into a working state. I removed an environment from my `pants.toml` file that pointed to an environment definition I had set up a while ago and never really used, defined like this:
local_environment(
  name="local_env",
  fallback_environment="linux_x86_64_py3_9",
  compatible_platforms=["macos_arm64"]
)
which points to this environment as a fallback:
docker_environment(
    name="linux_x86_64_py3.9",
    platform="linux_x86_64",
    image="744645366470.dkr.ecr.us-east-1.amazonaws.com/base/python-java:3.9",
    python_bootstrap_search_path=["/usr/local/bin"]
)
after I removed the reference to the `local_env` environment from my `pants.toml` file everything started working again. not sure why this was the problem, my understanding of environments is still a little fuzzy, especially non-docker ones. it's particularly weird because that `local_env` reference has been in my `pants.toml` file for quite some time without any issues until yesterday, not sure what changed