# general
p
Is there a standard practice to share remote cache between CI and dev machines? I'm getting a lot more cache misses than expected
FWIW, I was looking at a build where the "Building requirements for blah.pex" step seemed like it should be cacheable
f
Different build systems handle addressing a bit differently. One of the harder aspects of remote caching is artifact resolution in non-homogeneous systems. CI usually works well because all CI machines look the same: environment, operating system, toolchains. Developer laptops are usually more chaotic and require some normalization of the environment so they match the backends that populate the cache. Depending on security/company needs, you could make developer laptops homogeneous between developers and let them upload directly to a cache that doesn't cross the CI/release path, usually with a caching system that verifies the hash sums of what is uploaded/downloaded to prevent poisoning (if using the Google-defined REAPI). Another place caching can fall short is hermeticity and reproducibility of the build artifacts; sometimes that's build-tool-specific configuration, toolchains/generators, or general build practices.
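The hash-verification idea mentioned there can be sketched as follows. This is a toy model, not a real REAPI client; the function name and payload are made up for illustration:

```python
import hashlib

def verify_cas_blob(expected_digest, blob):
    """In a content-addressed store (like REAPI's CAS), every blob is keyed
    by the hash of its own bytes, so any client or proxy can re-hash what it
    fetches and reject anything that doesn't match -- the anti-poisoning
    check described above (simplified sketch)."""
    actual = hashlib.sha256(blob).hexdigest()
    if actual != expected_digest:
        raise ValueError(f"digest mismatch: expected {expected_digest}, got {actual}")
    return blob

payload = b"artifact bytes"
digest = hashlib.sha256(payload).hexdigest()
verify_cas_blob(digest, payload)        # passes: bytes match the digest key
# verify_cas_blob(digest, b"tampered")  # would raise ValueError
```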
p
Yeah, I get that, I feel like "resolve this set of dependency constraints to a set of packages and install them in a venv" should be the same across 2 linux/x64 machines though
It makes me think pants is putting too much external state into the cache key
not sure if this is configurable somehow, or I'm just out of luck
f
FWIW, a few ways I've debugged such things (I don't know the exact way with Pants; would be great if someone could chime in on that): in Bazel, use the execution log and gRPC logs, so you can see which hashes went into the execution context and which gRPC requests were made between two builds of the same git commit on different machines. If there is divergence in the hashes or keys, then start narrowing down where in the process or flow the divergence happens.
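In Bazel terms, that workflow looks roughly like this. Treat it as a sketch: the exact flag names vary across Bazel versions, and the file paths here are placeholders:

```shell
# Run the same build on each machine at the same git commit, recording
# what went into each action and what was sent over gRPC.
bazel build //... \
  --execution_log_json_file=/tmp/exec_log.json \
  --experimental_remote_grpc_log=/tmp/grpc.log

# Then diff the two execution logs (one per machine, renamed here for
# illustration) to find which action inputs diverge: env vars, input
# digests, or command lines.
diff <(jq -S . /tmp/exec_log_ci.json) <(jq -S . /tmp/exec_log_dev.json)
```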
From a backend POV, you could have the backend log the requests from the two builds to see which keys have changed between the two build machines
p
I'm definitely getting different fingerprints for the requests to the remote cache, so it's not a backend issue
b
What platform is ci and what platform is the dev machine? Can you share your pants.toml too?
p
I am realizing that the pants.ci.toml file may be part of the problem in general usage, but I also tried using that file locally and still had cache misses:
```toml
[GLOBAL]
pants_version = "2.20.0rc1"
build_file_prelude_globs = ["pants-plugins/macros.py"]

backend_packages = [
  "pants.backend.docker",
  "pants.backend.python",
  "pants.backend.python.lint.black",
  "pants.backend.python.typecheck.mypy",
  "pants.backend.shell",
  "pants.backend.shell.lint.shfmt",
  "pants.backend.shell.lint.shellcheck",
  "pants.backend.experimental.java",
  "pants.backend.experimental.kotlin",
  "pants.backend.experimental.python",  # for vcs_version
  "pants.backend.experimental.terraform",
]

pants_ignore.add = ["!gcloud_key.json", "!keys.*", "!anubis/"]

remote_provider = "reapi"
remote_cache_read = true
remote_cache_write = true
remote_store_address = "grpcs://remote.buildbuddy.io"
remote_instance_name = "main"
remote_cache_warnings = "always"

[GLOBAL.remote_store_headers]
# BuildBuddy API key to Espresso AI org created by kilogram@. To rotate, create a new org.
x-buildbuddy-api-key = "XXX"

[source]
root_patterns = [
  "/",
  "anubis/spes/src/main/resources",
  "anubis/spes/src/main/java",
  "anubis/spes/src/main/kotlin",
]

[anonymous-telemetry]
enabled = true
repo_id = "d0a16741-97cc-471e-bd04-7e7622b63146"

[python]
interpreter_constraints = ["CPython>=3.11,<3.12"]
pip_version = "latest"
enable_resolves = true

[python-infer]
use_rust_parser = true

[kotlin]
version_for_resolve = "{'jvm-default': '1.9.0'}"

[jvm]
jdk = "temurin:1.11"

[repl]
shell = "ipython"

[python.resolves]
python-default = "3rdparty/python/default.lock"
mypy = "3rdparty/python/mypy.lock"
skypilot = "3rdparty/python/skypilot.lock"

[mypy]
install_from_resolve = "mypy"
requirements = ["//3rdparty/python:mypy"]
# args = ["--check-untyped-defs"]
args = ["--no-incremental"]

[python-repos]
indexes = [
  "https://pypi.org/simple/",
  "https://download.pytorch.org/whl/cu118/",
  "https://pypi.fury.io/xxx/",
]

[shellcheck]
args = [
  "-e SC1091",  # Docker images use scripts from base layers, which cannot be found.
]

[subprocess-environment]
env_vars = [
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_DEFAULT_REGION",
  "ESPRESSO_AT_HEAD",
  "GITHUB_ACTIONS",
  "GITHUB_SHA",
]

[test]
extra_env_vars = ["HOME"]
output = "all"

[pytest]
args = ["-vv", "--no-header", "--log-cli-level=INFO", "-rP"]
```
CI is github actions (presumably linux x64), local is also linux x64
b
I suspect a major source of differences will be the env vars passed through in `subprocess-environment` and `test`, since those will differ per machine, usually.
(And those are part of the cache key)
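As a toy model of why that causes misses (this is not Pants' actual fingerprinting logic, just an illustration of env vars entering a cache key):

```python
import hashlib

def process_fingerprint(argv, env):
    """Toy cache key: hash the command line plus the captured env vars.
    Any per-machine env value changes the digest, so the remote cache
    lookup misses even though the command is identical."""
    h = hashlib.sha256()
    for arg in argv:
        h.update(arg.encode())
    for key in sorted(env):  # sorted so ordering can't cause spurious misses
        h.update(f"{key}={env[key]}".encode())
    return h.hexdigest()

argv = ["pex", "--lock", "3rdparty/python/default.lock"]
ci_env = {"AWS_ACCESS_KEY_ID": "key-on-ci", "GITHUB_SHA": "abc123"}
dev_env = {"AWS_ACCESS_KEY_ID": "key-on-laptop", "GITHUB_SHA": ""}

# Same command, different env values -> different keys.
print(process_fingerprint(argv, ci_env) == process_fingerprint(argv, dev_env))  # prints False
```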
p
That seems overly cautious for things like resolving dependencies, where theoretically pants knows which env vars are actually used
are there any workarounds for that?
I'm going to try bumping the pants version too, there was a bug with dep lists not being sorted appropriately and see if that helps too
probably as expected, bumping the version to 2.22.0a0 didn't help
b
Don't know of workarounds in the short term other than reducing the environment variables / only passing them to the specific processes where each one is required.
Potentially pants should allow finer-grained control over which processes need which env var. It sounds like some of these may only be required for `generate-lockfiles`, or are they also used for other goals?
p
the AWS ones are largely for running some integration tests iirc, not even needed to do a build, maybe moving them to the test section would help.
all of them might actually fall into that category tbh
in the non-CI config we have aws env vars for auth when pulling/pushing docker images to ECR
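Based on the config pasted above, one possible narrowing would be to move the AWS vars out of the global `subprocess-environment` list and scope them to test runs. This is a sketch only: whether it's safe depends on which goals actually consume those vars (e.g. the docker/ECR auth mentioned above would still need them in the non-CI config):

```toml
# Sketch: scope AWS credentials to tests so ordinary build processes
# (and their cache keys) no longer see per-machine values.
[subprocess-environment]
env_vars = [
  "ESPRESSO_AT_HEAD",
  "GITHUB_ACTIONS",
  "GITHUB_SHA",
]

[test]
extra_env_vars = [
  "HOME",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_DEFAULT_REGION",
]
```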