# general
f
Hi. I'm trying to build two docker images with pants, where one depends on the other. I have a simplified version of the BUILD file that reproduces the issue I have:
```
docker_image(
    name = "01-base",
    instructions = [
        "FROM python:3.11.8-slim-bookworm"
    ]
)

docker_image(
    name = "02-app",
    dependencies = [
        ":01-base"
    ],
    instructions = [
        "FROM 01-base:latest",
        "RUN echo test"
    ]
)
```
When I delete the `01-base` image and rerun `pants package` for `02-app`, the build fails because there is no `01-base` image. This happens even though both images are in fact built.
```
15:02:04.51 [INFO] Completed: Building docker image 01-base:latest
15:02:04.91 [INFO] Completed: Building docker image 02-app:latest
```
If I just rerun `pants package` again, the build succeeds. It seems to me that pants doesn't wait for the image to be actually available or something. Is this a known issue? Does my approach even make sense, or is what I'm doing not "the pants way" of thinking?
h
Hmm, I don’t reproduce. If I delete `01-base` (and the dep to it) and rebuild, then the build works because `01-base` exists in my local docker images. But if I `docker image rm` it and then rerun, then things fail, as expected, because the base image isn’t found by docker.
What error do you see when the build fails?
Oh wait, I think I am misunderstanding. When you say “When I delete the `01-base` image”, do you mean you `docker image rm` it from your local docker state, or do you mean that you delete its target from your BUILD file?
☝️ 1
I guess you mean the first thing, because now I do reproduce that behavior.
I would call this a bug
Docker is a tricky case for Pants, because it involves persistent state (the local image registry) that lives outside of Pants’s control
@curved-television-6568 is this known behavior?
f
Exactly, I meant removing the image via `docker image rm`.
h
Does the local registry make newly built images available for pull asynchronously, I wonder, so there’s a race condition
🤷 1
Ah no, sorry, this is more obvious than that
For example, if you run Pants with `--no-pantsd` you won’t reproduce this
this is because Pants has cached in memory that the first image was produced, and nothing in its state contradicts that fact
Sorry, this should have been obvious to me, slow morning
So Pants is short-circuiting all the rules that produce the base image, because they have already run successfully
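That short-circuiting can be sketched as a toy memoization model (plain Python with invented names, not actual Pants internals): the in-memory cache has no idea that the external docker state changed underneath it.

```python
# Toy model of the pantsd in-memory cache (NOT Pants internals; names invented).
registry = set()   # stands in for the local docker image store
memo = {}          # stands in for pantsd's in-memory rule cache

def build_image(name):
    if name in memo:          # cache hit: the rule is short-circuited,
        return memo[name]     # so `docker build` never re-runs
    registry.add(name)        # cache miss: actually build the image
    memo[name] = name
    return name

build_image("01-base:latest")
registry.discard("01-base:latest")   # `docker image rm` behind Pants's back
build_image("01-base:latest")        # memo hit: returns without rebuilding
assert "01-base:latest" not in registry  # cache says "built", docker disagrees
```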
But then that failed run does repopulate the local registry; I need to see how that happens
I need to take a look inside the docker backend code
f
I can confirm that it works with `--no-pantsd`. Thanks! And yes, it seems that pants repopulates the registry after the error. I would expect that if it doesn't check the cache, it would also fail on all further attempts.
h
Interesting, it looks like it does run both processes every time, but it runs them sequentially in dependency order when it works and concurrently when it doesn’t (which explains why it fails that time but the image is available next time)
I need to dive into the code to see why this is
Unless it is obvious to @curved-television-6568
c
My guess is you already said what’s going on. Pants doesn’t know the image was removed, and serves up the previous result from cache. This is quick, so it may look concurrent. But this should be true for the second image too… so we’re missing a clue to the puzzle. It may be that pants kicks off the process before the cache result (hit or miss) comes back, and only then kills the process in case of a hit. This is to not make processes needlessly wait for cache lookups (only wasting a few cpu cycles in case of a cache hit). So, my hunch is that docker is fast enough to fail the process for the second image before it’s killed by the cache hit. Maybe. 🤔
h
I think the mystery is why it does rerun the process to rebuild the base image, but concurrently
If you run with `-ldebug` you see the process actually running
Oh right, dammit, speculation…
You’re right, this is speculation “saving us” the second time around
👍 1
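If speculation is indeed the culprit, the failure-then-success pattern could look like this toy sequential model (invented names, no real concurrency, not Pants code): the app build is launched speculatively before the base-image process has repopulated the registry.

```python
# Toy model of speculative execution (NOT Pants code; names invented).
registry = set()  # stands in for the local docker image store

def build_base():
    registry.add("01-base:latest")

def build_app():
    # `docker build` for 02-app needs the base image to exist *right now*
    if "01-base:latest" not in registry:
        raise RuntimeError("FROM 01-base:latest: image not found")
    registry.add("02-app:latest")

# Run 1: the speculative app build starts before the base rule has
# repopulated the registry, so it fails fast...
failed = False
try:
    build_app()
except RuntimeError:
    failed = True
build_base()      # ...but the base process still completes,
assert failed and "01-base:latest" in registry

# Run 2: the base image now exists, so the same order succeeds.
build_app()
assert "02-app:latest" in registry
```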
Uuuugh
So probably the robust fix is for Pants to have its own internal local Docker repo, instead of using the default one
Does that make sense?
That way an outside force can’t mess with the internal state
c
That could get problematic quickly, given the lack of cache-management features, if you have many large images… I think another option could be to have a pre-build step in the docker backend that checks whether a given image already exists or not.
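A hedged sketch of that pre-build check, again as a toy model rather than the real backend (in practice `image_exists` might wrap something like `docker image inspect`):

```python
# Toy model of a cache-validation step (NOT the real docker backend).
registry = set()   # stands in for the local docker image store
memo = {}          # stands in for the in-memory rule cache

def image_exists(name):
    # real backend might shell out to `docker image inspect <name>`
    return name in registry

def build_image(name):
    # only trust the cached result if the image is still actually present
    if name in memo and image_exists(name):
        return memo[name]
    registry.add(name)
    memo[name] = name
    return name

build_image("01-base:latest")
registry.discard("01-base:latest")   # docker image rm
build_image("01-base:latest")        # cache not trusted: image is rebuilt
assert image_exists("01-base:latest")
```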
h
How would it know if it is the right image though?
f
Not sure if this changes anything, but I also tried deleting both the `01-base` and `02-app` images + deleting the pants cache. When I try running `pants package` on `02-app`, I get the same error about `01-base` not existing, even though it was built. So long story short: the same experiment, just with an empty cache.
But it still works with the `--no-pantsd` flag.
h
Yes, there are two caches in play here, the on-disk cache and the in-memory cache in pantsd.
So this is consistent with what you’re seeing. The in-memory cache is the spoilsport here.
c
> How would it know if it is the right image though?
By adding a "pants" label to the built image with the cache key.
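That idea could look something like this (a toy sketch; the label name and mechanics are invented — the real version would pass `--label` to `docker build` and read labels back via `docker image inspect`):

```python
# Toy sketch of validating a cached build via an image label (invented names).
registry = {}   # image name -> labels dict, stands in for local docker state

def build_image(name, cache_key):
    # real docker: `docker build --label pants.cache-key=<key> ...`
    registry[name] = {"pants.cache-key": cache_key}

def is_current(name, cache_key):
    # real docker: read labels back via `docker image inspect`
    labels = registry.get(name)
    return labels is not None and labels.get("pants.cache-key") == cache_key

build_image("01-base:latest", "abc123")
assert is_current("01-base:latest", "abc123")       # safe to reuse
assert not is_current("01-base:latest", "def456")   # inputs changed: rebuild
del registry["01-base:latest"]                      # docker image rm
assert not is_current("01-base:latest", "abc123")   # image gone: rebuild
```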