A chicken-and-egg problem with Docker. We should b...
# development
h
A chicken-and-egg problem with Docker. We should be telling Docker what platform is being used, which is important e.g. when running Linux x86 on macOS Arm. But we don't know what Platform a docker image corresponds to, and naively we would need to run a Process in that docker image to introspect it
The dead simple solution is for the user to tell us via a
platform
field on
docker_environment
. We could either make it required, or
linux_x86_64
seems like a reasonable default
I think making it a
required
field is pretty reasonable...it's not very common you're defining new `docker_environment`s fwict The default scares me because it's subtle and confusing when that default is wrong
e
I don't have access ATM but doesn't docker inspect give you enough info?
h
ah thank you thank you. cc @fast-nail-55400 @witty-crayon-22786, that also seems to give us the env vars. rather than needing to run a process that invokes
env
Copy code
"Architecture": "arm64",
        "Variant": "v8",
        "Os": "linux",
w
rather than needing to run a process that invokes
env
yea: but thinking forward to remote execution, @fast-nail-55400 and i discussed not assuming that we have a docker image locally which corresponds to the remote execution environment
h
So, sounds like we maybe want a variant of Tom's idea from meeting today to "get all the bootstrap info in one go". We have an intrinsic that runs
docker inspect
to set up
CompleteEnvironmentVars
and
Platform
, and then
system_binaries.py
can proceed like normal w/o modification
w
so implementations which are docker-agnostic are good.
h
yea: but thinking forward to remote execution, @fast-nail-55400 and i discussed not assuming that we have a docker image locally which corresponds to the remote execution environment
We will know when it's an RE environment thanks to the target type. So perhaps better to solve this the simplest way for Docker, and then we solve RE in the way that makes sense for it, e.g.
env
?
If n was >> 3, then making this generic seems important. But n=3: local env, Docker env, and RE
w
if the intrinsic is necessary for platform extraction, then possibly.
h
hey @witty-crayon-22786, with the
docker inspect
approach, it will look like this:
Copy code
@rule
def current_platform(env_tgt: EnvironmentTarget) -> Platform:
    docker_image = env_tgt.docker_image()
    if docker_image is None:
        return Platform.create_for_localhost()
    # TODO(#16852): Use the engine to compute this
    return Platform.linux_arm64
I don't think we can call a
#[pyfunction]
from
native_engine
because we need to use the
docker::CommandRunner
so that we can properly pull down missing images; a free function wouldn't have a way to access to the current SchedulerSession/CommandRunner I don't think
SessionValues
is relevant, because we can have n Docker images per session So I'm thinking it looks something like Rust defining
DockerImageInformation
and
DockerImageInformationRequest
, and registering an intrinsic rule? Then this rule will do
Get(DockerImageInformation, DockerImageInformationRequest)
I think making it a required field is pretty reasonable...it's not very common you're defining new docker_environments fwict
This approach is not reasonable. I didn't realize that
python:3.9
is Linux Arm64 from my mac, but Linux x86-64 from my dev box
f
I've had some issues with Docker and multi-platform images. The
--platform
argument to
docker pull
can be essential to use. If on M1, you
docker pull
a
python:3.9
image with platform set to x86_64, but then try to run
docker run
without the x86_64 argument, Docker will invoke the container as ARM and not x86_64 and warn about the platform mismatch.
👍 1
Copy code
% docker pull --platform=x86_64 python:3.9
3.9: Pulling from library/python
23858da423a6: Pull complete
326f452ade5c: Pull complete
a42821cd14fb: Pull complete
8471b75885ef: Pull complete
8ffa7aaef404: Pull complete
15132af73342: Pull complete
f4aeaae2f8da: Pull complete
5d8aa591a389: Pull complete
4645a1f8dbad: Pull complete
Digest: sha256:54c4e59cdab21669b1a5417f5b70cb1cf221530f423661d9a8dbee5576160e2a
Status: Downloaded newer image for python:3.9
<http://docker.io/library/python:3.9|docker.io/library/python:3.9>
% docker run python:3.9
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
It's fine if the platform argument match.
Copy code
% docker pull --platform=linux/amd64 python:3.9
3.9: Pulling from library/python
Digest: sha256:54c4e59cdab21669b1a5417f5b70cb1cf221530f423661d9a8dbee5576160e2a
Status: Image is up to date for python:3.9
<http://docker.io/library/python:3.9|docker.io/library/python:3.9>
% docker run --platform=linux/amd64 python:3.9
h
Ah, right. I'm being flip-floppy...but maybe we do want a required field where the user tells us the platform? For multi-platform images, it seems unwise for Pants to guess --
This approach is not reasonable. I didn't realize that python:3.9 is Linux Arm64 from my mac, but Linux x86-64 from my dev box
Nack, because I forgot that on macOS ARM you can specify the platform so it's x86
f
I'll get a PR to set
platform
for Docker command runner later today.
Obviously not sufficient for a solution but it is necessary for one,.
h
Not sufficient how so? This plan seems viable to me: 1. I add a
platform
field to
docker_environment
2. I fix it so
Platform
is always injected into the
Process
, whereas now individual call sites have to set it 3. You consume
Platform
in the Docker Rust code That should result in always consistently using
--platform
whenever Docker is executed
f
I was referring only to (3) in what I said
h
ah, that we need 1 and 2 as well? sg
f
yes ... the future PR I was referring to only handles (3)
👍 1
"Obviously [that PR is] not sufficient for a solution but it is necessary for one."
👍 1
h
w
(have been at dentist and now lunch: will look in a bit)
🦷 1