I can't seem to get the caching for my docker imag...
# general
l
I can't seem to get the caching for my docker images to work – they're always rebuilt whether or not the input files have changed, which makes it infeasible to run
pants publish ::
in CI. I've tried making sure everything is static – no build args, and fixing
FROM
to a static sha. Am I missing something or is it a limitation of the current docker plugin that it always needs to rebuild?
s
How does your CI cache docker builds?
h
@silly-queen-7197 asks an excellent question - CI typically has no memory, each run happens in a fresh container. So unless you use a remote cache, or a self-hosted runner with a shared local cache, you wouldn't expect caching in CI. Have you set this up somehow?
s
Pants v2.19 added a really neat feature that lets you use the Docker BuildKit features
cache_from
/
cache_to
(https://www.pantsbuild.org/2.19/reference/targets/docker_image#cache_from) to cache against a docker image in a registry. Maybe that will help if your CI doesn't have a way to cache docker builds between runs.
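For illustration, a BUILD-file sketch of what that could look like for the image discussed in this thread. It assumes the Pants 2.19 field shape (a single dict per field; later versions may expect a list for `cache_from`) and that BuildKit is enabled via `[docker].use_buildx = true` in `pants.toml`. The `buildcache` tag is a hypothetical name, so check `pants help docker_image` for the version you run.

```python
docker_image(
    name="nats-fwd",
    repository="repo/nats-fwd",
    image_tags=["latest"],
    registries=["us-central1-docker.pkg.dev"],
    # Assumption: push BuildKit layer cache to a dedicated tag in the same registry.
    cache_to={
        "type": "registry",
        "ref": "us-central1-docker.pkg.dev/repo/nats-fwd:buildcache",
    },
    # Assumption: read the layer cache back from that tag on the next build.
    cache_from={
        "type": "registry",
        "ref": "us-central1-docker.pkg.dev/repo/nats-fwd:buildcache",
    },
)
```

Note that this speeds up the `docker build` that Pants invokes; it does not stop Pants from invoking it.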
l
I get the same behaviour locally though, if I run
pants package ::
two times in a row the container is rebuilt both times
I don't if I run
pants build ::
though since nothing has changed 😕
But to answer your question, I am using remote caching via
pantsbuild/actions/init-pants@v6-scie-pants
in Github actions
g
@lively-school-24147 I have this working in my monorepo. To be clear, I followed the instructions here. The result is that if source files haven't changed AND the binary-deps docker container is already in docker's local cache, it won't attempt to rebuild it. If I delete the final docker image that copies everything in, when I re-run pants package I don't see either binary-deps or binary-srcs docker container re-building.
l
Strange. I'm building a Go app though, so my Dockerfile is extremely simple. I wonder if it makes a difference to switch to
instructions
vs a dockerfile
Two consecutive runs (locally):
work/nats-fwd → pants package src/docker/nats-fwd:nats-fwd
01:23:27.35 [INFO] Completed: Building docker image us-central1-docker.pkg.dev/repo/nats-fwd:latest
01:23:27.36 [INFO] Wrote dist/src.docker.nats-fwd/nats-fwd.docker-info.json
Built docker image: us-central1-docker.pkg.dev/repo/nats-fwd:latest
Docker image ID: <unknown>
work/nats-fwd → pants package src/docker/nats-fwd:nats-fwd
01:23:36.70 [INFO] Completed: Building docker image us-central1-docker.pkg.dev/repo/nats-fwd:latest
01:23:36.70 [INFO] Wrote dist/src.docker.nats-fwd/nats-fwd.docker-info.json
Built docker image: us-central1-docker.pkg.dev/repo/nats-fwd:latest
Docker image ID: <unknown>
and this is my BUILD file:
docker_image(
    name="nats-fwd",
    repository="repo/nats-fwd",
    image_tags=["latest"],
    registries=["us-central1-docker.pkg.dev"],
    instructions=[
        "FROM gcr.io/distroless/static-debian11",
        "COPY src.go.nats-fwd.cmd.nats-fwd/bin /app/bin",
        "CMD [\"/app/bin\"]"
    ]
)
s
Pants will execute the docker build each time you call package
g
@silly-queen-7197 I think that's right, but I see
Canceled: Building docker image ...
mine finishes super fast.
actually, shoot. I'm confused now 🤣
s
I think the Canceled is due to remote caching
g
I think @silly-queen-7197 is right. I was overindexing on how fast it was building in comparison to before because of the cached PEX files.
I do think the cache_from/to is the solution here.
l
Hm. Is there a way to avoid that? Whenever a new docker image gets built that will basically trigger a deploy. Which is the opposite of what I want in a monorepo setup 😅
s
Do you trigger deploys based on when an image is built or pushed to a registry?
l
I’m not sure that helps, what I’m trying to avoid is getting a new sha sum for the image
@silly-queen-7197 kinda sorta. It proceeds to the next step (which is a flux plugin I just wrote) that will output flux manifests as an oci image and that’s what gets deployed
s
The sha shouldn't change if the inputs haven't changed and you're using a static base image, right?
l
They probably won’t. But I’d still be looking at rebuilding potentially hundreds of images unnecessarily
And even with cache from that’s pretty slow, no?
But disregarding cache_from, my entire impression of Pants was that it only rebuilds what's necessary, and this directly contradicts that. 🙂 So is my impression wrong, or does the Docker build behave differently from other parts of Pants due to reasons?
s
If I understand the project correctly it's closer to
rules_oci
in Bazel, which allows Pants to manage / cache the artifacts rather than delegate out commands to Docker
l
Oooh, very much so. Thanks a lot for sharing!
Yeah exactly rules_oci more or less does exactly what I need but I’d like to be able to avoid Bazel
I’ll look into giving it a spin!
s
Tom, the maintainer of https://github.com/tgolsson/pants-backends, is active on Pants Slack. It's definitely an exciting package
l
Definitely! I think that would work with my Go stuff out of the box. JVM stuff might be a bit trickier but I guess I’ll cross that bridge when I get to it
h
@lively-school-24147 Pants generally does not rerun unnecessary work. BUT - one exception is when the work has external side effects (these may be the "reasons" you alluded to).
For example, what should happen if you use
pants package
to create a docker image, and then delete that image from the local docker state?
In that case you'd want
pants package
to re-run
So currently it does so conservatively
it might be possible to add a step that verifies that the image exists and has the expected SHA (which Pants would need to have cached when it built it)
But you can see how this is an issue that doesn't exist when packaging a .pex or a Python dist or a JAR file
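A minimal Python sketch of the verification step described above, assuming Pants (or a wrapper script) has recorded the image ID from the previous build. The function names are hypothetical and this is not how the Docker backend currently works; `docker image inspect --format '{{.Id}}'` is the only real CLI call used.

```python
import subprocess
from typing import Callable, Optional


def docker_image_id(ref: str) -> Optional[str]:
    """Return the local image ID for `ref`, or None if it is not available."""
    try:
        result = subprocess.run(
            ["docker", "image", "inspect", "--format", "{{.Id}}", ref],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # The docker CLI is not installed, so the image cannot be present.
        return None
    if result.returncode != 0:
        # Non-zero exit: the image does not exist in the local Docker state.
        return None
    return result.stdout.strip()


def needs_rebuild(
    ref: str,
    expected_id: Optional[str],
    lookup: Callable[[str], Optional[str]] = docker_image_id,
) -> bool:
    """Rebuild unless the local image exists and matches the recorded ID."""
    if expected_id is None:
        # Nothing was recorded by a previous build, so be conservative.
        return True
    return lookup(ref) != expected_id
```

The `lookup` parameter exists only so the check can be exercised without a Docker daemon; a deleted image makes `lookup` return `None`, which forces the rebuild.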
g
Thanks for plugging my plugin! It has very similar behavior for side-effects, albeit a bit better caching since it's all in Pants. We use it + changed-since filtering at work to minimize rebuilds/republishing. It still doesn't work 100% but it's better than docker, especially if you avoid scripted build commands. It's not ready yet but I've got some initial work for equivalents of rules_apko and the Debian one I can't remember to offset that need and allow pure layer-based builds. I also treat "accidental" reuploads as free since all you're doing is pushing the manifest but no actual blobs.
l
@gorgeous-winter-99296 I managed to make some progress trying to get it to run with
oci_image_build
instead. I'm on a mac so I vendored the latest stuff from your repo. Now I'm stuck on the gnu-tar dependency, while I have it on my path it's not called
gtar
nor does it live in any of the paths searched by the plugin (I'm using nix so it's basically just part of my
$PATH
). I'll see if I can work around that somehow, and if I can I can upstream the fix.
But I wonder if one way would be to do feature detection, like looking for
--sort
in
[g]tar --help
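A rough Python sketch of that feature-detection idea: try each candidate binary found on `$PATH` and keep the first whose `--help` output mentions the flag. The function names and the candidate order are assumptions for illustration, not the plugin's actual lookup logic.

```python
import shutil
import subprocess
from typing import Callable, Iterable, Optional


def tar_help(binary: str) -> str:
    """Return the combined stdout/stderr of `<binary> --help`."""
    result = subprocess.run([binary, "--help"], capture_output=True, text=True)
    return result.stdout + result.stderr


def find_tar_with_flag(
    flag: str = "--sort",
    candidates: Iterable[str] = ("gtar", "tar"),
    which: Callable[[str], Optional[str]] = shutil.which,
    help_text: Callable[[str], str] = tar_help,
) -> Optional[str]:
    """Return the path of the first tar on PATH that advertises `flag`."""
    for name in candidates:
        path = which(name)
        if path is None:
            continue  # This name is not on PATH; try the next candidate.
        if flag in help_text(path):
            return path
    return None
```

With this approach a GNU tar installed through nix under the plain name `tar` would be accepted, since the check depends on what the binary supports rather than what it is called.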
g
I think gtar is the common name if it's from Homebrew, but I'll openly admit I'm not a Mac person and only did a push to improve support because someone asked last week. So I'm very open to contributions or issues from those who know the platform better than me.
l
I think you’re spot on! I think the problem stems from me not using homebrew 🙂
Ok, I think the best cross platform way to do it might be this:
find "$DIR" -print0 \
| sort -z \
| tar -cf - \
      --format=posix \
      --numeric-owner \
      --no-recursion \
      --null \
      --files-from - \
| gzip -n > "$OUT.tar.gz"
for gzipped tarballs and
find "$DIR" -print0 \
| sort -z \
| tar -cf "$OUT.tar" \
      --format=posix \
      --numeric-owner \
      --no-recursion \
      --null \
      --files-from -
for regular ones. And then adding
--mtime
and
--pax-option
only on Linux. I'm not super familiar with pants plugins, but if it's possible to pipe and redirect output in a
Process
I think it should be pretty straightforward! Otherwise it's probably easiest to wrap it in a bash script and call that instead (somehow, I guess I'd just create a file with
CreateDigest
?) I'll be going on vacation tomorrow but might have some time to look into it, we'll see!
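As an alternative to shelling out, here is a sketch of the same idea using only Python's standard library, which avoids the GNU/BSD tar differences altogether. This is a swapped-in technique, not what the plugin does: it sorts entries, writes pax (POSIX) format, clears owner names, and, unlike the shell pipeline above, zeroes every mtime on all platforms. The function name is hypothetical.

```python
import gzip
import io
import os
import tarfile


def deterministic_tar_gz(src_dir: str) -> bytes:
    """Pack src_dir into a .tar.gz whose bytes depend only on names, modes, and contents."""
    # Collect every entry below src_dir, then sort (like `find | sort -z`).
    paths = []
    for root, dirs, files in os.walk(src_dir):
        for name in dirs + files:
            paths.append(os.path.join(root, name))

    buf = io.BytesIO()
    # mtime=0 and an empty filename keep the gzip header stable (like `gzip -n`).
    with gzip.GzipFile(filename="", fileobj=buf, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.PAX_FORMAT) as tar:
            for path in sorted(paths):
                arcname = os.path.relpath(path, src_dir)
                info = tar.gettarinfo(path, arcname=arcname)
                # Normalise metadata that varies between machines and runs.
                info.uid = info.gid = 0
                info.uname = info.gname = ""
                info.mtime = 0
                if info.isreg():
                    with open(path, "rb") as f:
                        tar.addfile(info, f)
                else:
                    # Directories, symlinks, etc. carry no data payload.
                    tar.addfile(info)
    return buf.getvalue()
```

Packing the same tree twice should yield identical bytes, which is the property a layer digest relies on. Inside a Pants rule this would still need to run in a sandboxed process, so treat it as a standalone illustration.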
@gorgeous-winter-99296 I switched approach and added homebrew to my nix setup instead so I can get
gtar
and
gsed
in
/usr/bin
more easily, so now I can build images 🎉 Now I'm trying to figure out how to set up auth when pushing the image(s) – but I can't really seem to find any config to do so. 🙂 Do you have any pointers? 🙏
g
Use regular auth flows outside pants and it should get picked up. I've heard reports of issues with azure, but I use docker and gcloud auth regularly without issue.
l
Ah, so install skopeo locally first and set up auth? Thanks, will try that! 🙏
g
Yeah, or any other such tool. They all write the same config file on login.
l
@gorgeous-winter-99296 can confirm that everything works great on Mac, thank you so much 🙏 would be amazing with a point release with the Mac fixes!
Currently I’ve vendored all your plugins
g
Thanks for the feedback! I'll take a peek at that tomorrow. Are you using all plugins or just the OCI one?
l
Just the OCI one ❤️
Friendly ping, no stress! 🙂
g
❤️ 2
🙇 1
🙌 2
f
@gorgeous-winter-99296 using the exact same setup for a JVM image, it almost works, but it tries to run the jar as an executable. Is there any way to configure it to use
java
as the entrypoint and put the input in
-jar <input>
or would it make more sense to add another type of target?
I didn't find any way for configurations to make that happen
g
You should be able to explicitly set both entrypoint and args on the target.
f
Oh! how can I do that, you mean on the package target? I didn't find any options for it on the oci targets
g
On the
oci_image_build
you can pass
args
and
entrypoint
.
f
Oh, that's perfect then! I didn't find it in the docs so that's why I just assumed it wouldn't be there 🙂
g
Yeah, I realized when you asked: they're manually written and I tend to forget about updating them when I make releases. I would rely on the Pants CLI help commands; they're guaranteed to match the version of the plugin you use.
f
Got it, good pointer, thanks!