# general
s
Have folks dealt with `413 Payload Too Large` in bazel-remote? I'm running `pants --no-pantsd --keep-sandboxes=always package item-rank:cli-deps`. The error in context is

```
17:16:24.93 [INFO] Wrote dist/item-rank/cli-deps.pex
17:16:25.28 [WARN] Failed to write to remote cache (1 occurrences so far): Internal: "protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 413 Payload Too Large"
```
Is the problem that `dist/item-rank/cli-deps.pex` is too large? It appears to be 177M if I run `du -sh`. On the other hand, I've configured `BAZEL_REMOTE_MAX_BLOB_SIZE` on the bazel-remote side to 1 GB.
f
> protocol error: received message with invalid compression flag

That message is interesting. I'm curious where this compression idea is coming from in bazel-remote. Pants does not support the REAPI compressed-blobs mode, so the notion of compression would need to come from elsewhere. Or there is something else going on.
s
In https://github.com/buchgr/bazel-remote, `--storage-mode` is `zstd` by default. Would I want to set that to `uncompressed`?
f
Yes, you should set it to `uncompressed`. Pants cannot deal with compressed blobs.
I assume that bazel-remote is just trying to store the raw compressed blob and pass it as-is over REAPI (versus uncompressing on the fly). I would have expected it to uncompress the blob on the fly, though, since Pants cannot understand a compressed blob.
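For the k8s Deployment discussed later in this thread, switching the storage mode could look roughly like the sketch below. The `BAZEL_REMOTE_STORAGE_MODE` spelling is an assumption (mirroring the `BAZEL_REMOTE_MAX_BLOB_SIZE` env var mentioned above); the container name and image are placeholders. Verify the exact flag/env-var names against bazel-remote's README.

```yaml
containers:
  - name: bazel-remote              # placeholder container name
    image: buchgr/bazel-remote-cache
    env:
      # Assumed env-var form of --storage-mode; confirm against bazel-remote's docs.
      - name: BAZEL_REMOTE_STORAGE_MODE
        value: uncompressed
```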
s
I think this did work with `storage-mode` set to `zstd` at some point in the past. I think this error message might just be misleading.
If I use a PVC in my k8s cluster, it appears to work. On the other hand, using an ephemeral volume doesn't.
f
REAPI requires the server to support `identity`-mode (uncompressed) blobs, so I imagine something else is going on here. https://github.com/bazelbuild/remote-apis/blob/6c32c3b917cc5d3cfee680c03179d7552832bb3f/build/bazel/remote/execution/v2/remote_execution.proto#L1638
And looking at the bazel-remote source, it knows how to fall back to uncompressed. Yeah, something else must be going on.
Maybe run with `-ldebug --log-show-rust-3rdparty`? That will dump the Rust-side gRPC logging.
(I wrote that second option from memory; might need to check `pants help-advanced global` for the syntax.)
Do you have any proxies or load balancers in between Pants and bazel-remote?
Finding the source of the 413 status code might help with debugging.
b
Isn't `storage-mode` how bazel-remote stores things internally, not necessarily how it sends/receives them over the wire? In any case, I think there's also a chunk-size setting on the Pants side that may be relevant, to reduce the size of each request. Of course, finding the source of the 413 would be good.
s
There's a k8s service and nginx ingress in front of the remote cache
I tried `-ldebug --log-show-rust-3rdparty` and it appeared to hang.
Okay. Ran it again with those flags and it didn't hang:

```
19:53:40.71 [DEBUG] send frame=Data { stream_id: StreamId(33) }
19:53:40.72 [DEBUG] updating from discover
19:53:40.75 [DEBUG] updating from discover
19:53:40.75 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(35), size_increment: 65536 }
19:53:40.75 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(41), size_increment: 16384 }
19:53:40.75 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(41), size_increment: 49152 }
19:53:40.75 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(37), size_increment: 65536 }
19:53:40.75 [DEBUG] send frame=Headers { stream_id: StreamId(45), flags: (0x4: END_HEADERS) }
19:53:40.75 [DEBUG] send frame=Headers { stream_id: StreamId(47), flags: (0x4: END_HEADERS) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(35) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(41) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(37) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(45) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(47) }
19:53:40.75 [DEBUG] send frame=Headers { stream_id: StreamId(49), flags: (0x4: END_HEADERS) }
19:53:40.75 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(39), size_increment: 65536 }
19:53:40.75 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(43), size_increment: 65536 }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(39) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(43) }
19:53:40.75 [DEBUG] send frame=Data { stream_id: StreamId(49) }
19:53:40.76 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(49), size_increment: 65536 }
19:53:40.76 [DEBUG] send frame=Data { stream_id: StreamId(49) }
19:53:40.76 [DEBUG] received frame=Headers { stream_id: StreamId(41), flags: (0x4: END_HEADERS) }
19:53:40.76 [DEBUG] received frame=Data { stream_id: StreamId(41), flags: (0x1: END_STREAM) }
19:53:40.76 [DEBUG] received frame=Reset { stream_id: StreamId(41), error_code: NO_ERROR }
19:53:40.76 [DEBUG] received frame=WindowUpdate { stream_id: StreamId(53), size_increment: 65536 }
19:53:40.76 [DEBUG] send frame=Data { stream_id: StreamId(53) }
19:53:40.76 [DEBUG] client request body error: error writing a body to connection: send stream capacity unexpectedly closed
19:53:40.76 [DEBUG] updating from discover
19:53:40.76 [DEBUG] service.ready=true processing request
19:53:40.76 [WARN] Failed to write to remote cache (1 occurrences so far): Internal: "protocol error: received message with invalid compression flag: 60 (valid flags are 0 and 1) while receiving response with status: 413 Payload Too Large"
19:53:40.76 [DEBUG] all session end tasks completed successfully
19:53:40.77 [DEBUG] buffer closing; waking pending tasks
```
For full transparency, if I mount `/data` via

```yaml
volumeMounts:
  - name: bazel-remote-cache-pvc
    mountPath: /data
volumes:
  - name: bazel-remote-cache-pvc
    persistentVolumeClaim:
      claimName: bazel-remote-cache-pvc
```
in my bazel-remote Deployment (hope folks don't mind the k8s), I don't get the Payload Too Large errors. But with

```yaml
volumeMounts:
  - name: data-volume
    mountPath: /data
volumes:
  - name: data-volume
    emptyDir: {}
```

I do. This is probably a me problem.
I appreciate all of the responses though
f
> There's a k8s service and nginx ingress in front of the remote cache
I suggest checking the nginx ingress maximum request size. https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#custom-max-body-size
The default is 1m, which is of course way less than 177M.
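A sketch of the fix on the ingress side, using the `proxy-body-size` annotation from the ingress-nginx docs linked above (the metadata shown here is a placeholder fragment of the Ingress for the cache):

```yaml
metadata:
  annotations:
    # Raise nginx's request-body limit for this Ingress; "0" disables the check entirely.
    nginx.ingress.kubernetes.io/proxy-body-size: "0"
```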
b
Ah, and, as evidence in favour of that being relevant, the defaults for options that might influence Pants' individual request sizes are both close to, or larger than, that:
- 1MiB for chunks for larger uploads (but presumably there's extra metadata around that, so it'll be plus a few bytes that push it over the limit): https://www.pantsbuild.org/docs/reference-global#remote_store_chunk_bytes
- 4MiB for batching smaller uploads: https://www.pantsbuild.org/docs/reference-global#remote_store_batch_api_size_limit
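If one wanted to shrink Pants' requests instead of raising the ingress limit, those two options live under `[GLOBAL]` in `pants.toml`. A sketch; the values here are illustrative, not recommendations:

```toml
[GLOBAL]
# Chunk size for streamed blob uploads/downloads (default is 1MiB).
remote_store_chunk_bytes = 524288
# Maximum combined size for batched blob requests (default is 4MiB).
remote_store_batch_api_size_limit = 2097152
```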
s
By the way, NGINX ingress was 100% the problem here. Thanks again