# general
c
Hello! Is it possible to configure `docker_image` in such a way that the build root / context is at the repository's root? I have the following directory structure:
```
/
  utils
  project
    sub_dir
      Dockerfile
      BUILD (with docker_image target)
```
In the Dockerfiles, I want to be able to `COPY utils utils` and `COPY project project`, and execute code in the container from the root directory (e.g. Python files might `from utils import ...`). I tried to set `dependencies=["project", "utils"]` in `docker_image`, but when trying to `pants package` I keep getting an error stating that project and utils were not found in the Docker build context. What am I doing wrong?
c
I think the `context_root` option of the `docker_image` target should do what you want: https://www.pantsbuild.org/docs/reference-docker_image#codecontext_rootcode
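As a sketch of how that option would be set (field values assumed from the docs linked above, not verified against this repo):

```python
# project/sub_dir/BUILD -- point the Docker build context at the
# repository root instead of this directory.
docker_image(
    name="image",
    source="Dockerfile",
    context_root=".",  # relative to the build root, per the docs
)
```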
c
Thought so too, but I'm not able to make it work. It doesn't accept absolute paths, only relative ones, so I tried setting it to `../..` (two levels above the BUILD file), which should do it, but I still get the same error.
r
I think it should be relative to where your pants.toml is
c
that's at the root, so `"."` should work, but it doesn't either
r
"Specify the `context_root` path as `files` for relative to build root, or as `./files` for relative to the BUILD file."
Are you making sure this is the case?
c
So this suggests that `context_root="."` should work, right? But it doesn't. Neither does `"./"` nor `""`.
It is also possible that the root is correct but the dependencies are not, and the context has no access to them. But I'm unable to make it work.
To make sure the context is correct, I tried moving the `docker_image` target to a BUILD file in the root (I adjusted the source path accordingly). But it still fails at the `COPY` command, claiming that both project and utils don't exist.
r
yeah, it seems to be the known issue that you can't copy files/folders from outside the Docker build context.
c
but now the context should be at the root and should contain the entire repo, right?
r
Doesn’t docker determine context based on the Dockerfile itself? I am not sure how pants injects docker context.
c
The Dockerfile is correct, I can build the image without pants by just running docker build without any issues
👍 1
r
Wait, you need to put dependencies on the underlying targets of project and utils, which would be like `:project` or `:utils`, and since they live above the BUILD file, you need to provide the path relative to `pants.toml`. So in your case, I would imagine `project:project` and `utils:utils`.
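As a sketch of the suggestion above (the target names are assumed; they must match whatever the BUILD files in `project/` and `utils/` actually define):

```python
# project/sub_dir/BUILD -- dependency addresses are spelled relative to
# the build root (the directory containing pants.toml).
docker_image(
    name="image",
    dependencies=[
        "project:project",  # assumed target in project/BUILD
        "utils:utils",      # assumed target in utils/BUILD
    ],
)
```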
c
Yeah I did that, still doesn't work
😞 1
r
All the Docker images I build use `pex_binary`, and I just copy the pex binary, which figures out all the implicit dependencies. I haven't had to deal with this manual copying. Not that it shouldn't work.
c
When I run `pants dependencies project/sub_dir/Dockerfile` I get:
```
project/__init__.py
utils/__init__.py@resolve=project
```
Do you have a suggestion how to use a pex binary in my case?
r
Have you looked at the Pants blog? This post is about optimizing, but even the first example will help you build a Docker image with pex: https://blog.pantsbuild.org/optimizing-python-docker-deploys-using-pants/
c
I seem to be getting the same problem: the pex file does not contain the utils and project contents. In project's BUILD, I've added the following:
```python
pex_binary(
    name="project_binary",
    dependencies=[":project", "utils:utils"],
    resolve="project",
    layout="packed",
    execution_mode="venv",
)

docker_image(
    name="project_image",
    repository="project",
    instructions=[
        "FROM tensorflow/tensorflow:2.9.1-gpu",
        "COPY project/project_binary.pex /bin/project",
        "RUN pip3 install -r project/requirements.txt",
    ]
)
```
And it fails on not finding the requirements file, which should have been included in the pex as part of the `:project` dependency.
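For reference, a minimal sketch of the pex-only pattern from the blog post linked above, assuming the pex already bundles its third-party requirements (the entry point is hypothetical, and this is not the poster's confirmed fix):

```python
# Sketch: a pex is self-contained, so the image only needs the pex
# file itself -- no separate pip install step.
pex_binary(
    name="project_binary",
    entry_point="main.py",  # hypothetical entry point
    dependencies=[":project", "utils:utils"],
)

docker_image(
    name="project_image",
    instructions=[
        "FROM tensorflow/tensorflow:2.9.1-gpu",
        # Pants places packaged dependencies in the build context under
        # the path of the BUILD file that owns them.
        "COPY project/project_binary.pex /bin/project",
        'ENTRYPOINT ["/bin/project"]',
    ],
)
```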
r
Pex is actually like a virtual env. It will have all the dependencies already installed inside it, so once you have built it successfully, you can run things inside it. But your use case looks a bit different. Why not package tensorflow inside the pex itself by adding tensorflow to requirements.txt? What do you want to achieve?
c
Regarding TF, it's just easier and faster to start from a base image that already has it and only install a couple of smaller packages from requirements.txt. But that's not the point. What I want to achieve is the following: I just want to build a Docker image that includes two directories from my root dir (project and utils), such that the container's root is the same as my repository's root. That's it.
r
yeah sorry for bringing pex in the picture then. I feel like whatever you were trying to do should be the ideal way
b
Your trusty debugging tool here is leaking the sandbox and poking around. Look at the `keep-sandboxes` option; it'll leak the sandbox Pants used to run in. You'll see what's on disk when your process is running.
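As a command-line sketch (the exact flag spelling varies across Pants versions, so treat the details as an assumption):

```shell
# Preserve the sandbox so the assembled Docker build context can be
# inspected on disk after the run.
pants --keep-sandboxes=always package project/sub_dir:project_image
# Pants logs the preserved sandbox path, e.g. under /tmp/pants-sandbox-*;
# list it to see exactly what the build context contained.
ls /tmp/pants-sandbox-*
```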
c
I managed to take a look inside the container by just running it. I had previously noticed that running `pants dependencies` on my Dockerfile or pex binary target yields:
```
project/__init__.py
utils/__init__.py@resolve=project
```
and this turns out to be true: inside the container, in both the project and utils directories, there are only init files; no other files are included.
I've created a GitHub issue where I hope I describe more precisely what I want to achieve and how I am failing at it: https://github.com/pantsbuild/pants/issues/18379
I've created a minimum reproducible example, which is much simpler, and the same problem manifests in it. At my repository's root, I've created a Dockerfile that copies everything into the image:
```
FROM alpine
WORKDIR /app
COPY . .
```
Also at the root, I have the following BUILD file:
```python
docker_image(
    name="root_image",
)
```
Then, I run:
```
pants package :root_image
docker run -it root_image:latest
```
Inside the container, `ls` only shows the Dockerfile; no other files or directories are included in the image. I tried adding different `dependencies` to `docker_image`, but whatever I pass there, no other files or directories end up in the image.
On the contrary, when I run `docker build . -f Dockerfile -t root_image:latest` from the repo's root, I obtain an image that has all the repo content in it. I would like to recreate this with Pants.
b
Right. So I think the missing link here is dependency management. That's one of the cornerstone features of Pants. If you want to build an image with the entire repo in it, you're not really getting any benefit from Pants here.
That being said, part of the issue you face IS a pants issue, where the current implementation sometimes ignores dependencies. Primarily because it assumes you'll be dockerizing something packaged, and not loose files. There's already an issue for that (I think)
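One possible shape for the loose-files case, sketched purely as an assumption (untested; the target name and globs are made up, and the thread does not confirm this works):

```python
# Root BUILD file: declare the loose sources explicitly so Pants
# materializes them into the Docker build context.
files(
    name="repo_files",
    sources=["project/**/*", "utils/**/*"],
)

docker_image(
    name="root_image",
    dependencies=[":repo_files"],
)
```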
c
b
Yup that's the dirty workaround 😓