Pants #development

broad-processor-92400

04/26/2023, 3:15 AM

I notice the pants repo has remote caching set up in `pants.ci.toml`, and it seems to cut test times noticeably on `main`, but doesn't help much in a PR. Specifically, retrying a test shard that failed spuriously still ends up rerunning all of the passing tests. On the surface, this circumstance seems very amenable to caching: the code is the same between the runs (I think?). Some of the steps set `PANTS_REMOTE_CACHE_READ=false` etc., but not the test ones. Am I missing something about how remote caching works?

✅ 1

enough-analyst-54434

04/26/2023, 3:35 AM

Did those shards actually use remote caching? Here's an example of one that did not: https://github.com/pantsbuild/pants/actions/runs/4794253791/jobs/8528270309#step:11:215

enough-analyst-54434

04/26/2023, 3:35 AM

Auth failure is a thing now and again FWICT. Maybe more now than again?

enough-analyst-54434

04/26/2023, 3:36 AM

So, still a problem, but maybe you've pinpointed the wrong one.

broad-processor-92400

04/26/2023, 3:37 AM

This one on main seemed to: https://github.com/pantsbuild/pants/actions/runs/4801515869/jobs/8543869725 Qualitatively, it seems like it works quite often on `main` (most of the ones I click into have some sort of remote caching), but never on PRs.

enough-analyst-54434

04/26/2023, 3:38 AM

It definitely works on some PRs; I've certainly hit retry and had shards complete almost instantaneously.

broad-processor-92400

04/26/2023, 3:38 AM

Ah, just above that line there's:

06:15:29.74 [WARN] [rule-construct-auth-store] Failed to load Toolchain token from env var 'TOOLCHAIN_AUTH_TOKEN'. Please make sure the env var is set in your environment.

Which might suggest that's a secret not available to (some) PRs?

enough-analyst-54434

04/26/2023, 3:38 AM

So I think you've identified flaky remote cache, not lack of it.

enough-analyst-54434

04/26/2023, 3:39 AM

Ah, I'm incorrect: https://github.com/pantsbuild/pants/blob/c4d78a6c9c0d2910dd2a84ed4999e903862d2a35/.github/workflows/test.yaml#L278-L280

broad-processor-92400

04/26/2023, 3:40 AM

Ah, GHA secrets say: https://docs.github.com/en/actions/security-guides/encrypted-secrets#using-encrypted-secrets-in-a-workflow "With the exception of `GITHUB_TOKEN`, secrets are not passed to the runner when a workflow is triggered from a forked repository."
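For reference, a minimal sketch of how a CI script could detect the forked-repository case that quote describes. It assumes the usual `pull_request` event payload that GitHub Actions writes to the file named by `GITHUB_EVENT_PATH`; the helper names `is_fork_pr` and `load_event` are made up for illustration and are not something the pants repo is known to use.

```python
import json
import os


def is_fork_pr(event: dict) -> bool:
    """Return True if the event is a PR whose head repo differs from its base repo."""
    pull_request = event.get("pull_request")
    if not pull_request:
        return False  # e.g. a push to main: not a PR at all
    head = ((pull_request.get("head") or {}).get("repo") or {}).get("full_name")
    base = ((pull_request.get("base") or {}).get("repo") or {}).get("full_name")
    # A missing head repo (e.g. a deleted fork) is also treated as "not the base repo".
    return head != base


def load_event() -> dict:
    """Load the event payload GitHub Actions provides, or {} when run outside CI."""
    path = os.environ.get("GITHUB_EVENT_PATH")
    if not path or not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    print("fork PR:", is_fork_pr(load_event()))
```

A check like this only tells a job that secrets will be absent; it does not by itself make the remote cache available.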

enough-analyst-54434

04/26/2023, 3:40 AM

Yeah - classic security hole there.

broad-processor-92400

04/26/2023, 3:40 AM

(ah, and that behaviour is presumably why that step behaves like that.)

enough-analyst-54434

04/26/2023, 3:41 AM

Well, you could imagine compromises, etc. But you're at the correct meat now. This would require some engineering + vulnerability thought, etc. Clearly everyone wants caching all the time.

👍 1

enough-analyst-54434

04/26/2023, 3:42 AM

We already compromise on our s3 bucket IIRC so that PRs can push / pull to / from it.

broad-processor-92400

04/26/2023, 3:42 AM

Ok, question resolved: requires a secret, even to read, and thus no remote-cache usage in PRs (e.g. to avoid cache-poisoning attacks). I guess theoretically one could have a read-only token that's specified as a variable, rather than a secret, and at least have cache-read.
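The read-only idea above could be sketched roughly as follows. This is only an illustration under assumptions: `TOOLCHAIN_AUTH_TOKEN` is the env var named in the warning earlier in the thread, and `PANTS_REMOTE_CACHE_READ` / `PANTS_REMOTE_CACHE_WRITE` are the env forms of Pants' remote cache options, but `READ_ONLY_CACHE_TOKEN` is a hypothetical variable, and how such a token would actually be handed to the cache backend is left open.

```python
import os
from typing import Mapping


def remote_cache_env(env: Mapping[str, str]) -> dict:
    """Choose remote-cache read/write settings from the credentials that are present."""
    if env.get("TOOLCHAIN_AUTH_TOKEN"):
        # Full secret available (e.g. runs on main): read and write.
        return {"PANTS_REMOTE_CACHE_READ": "true", "PANTS_REMOTE_CACHE_WRITE": "true"}
    if env.get("READ_ONLY_CACHE_TOKEN"):  # hypothetical non-secret variable
        # Read-only fallback: reuse cached results, never upload new ones.
        return {"PANTS_REMOTE_CACHE_READ": "true", "PANTS_REMOTE_CACHE_WRITE": "false"}
    # No credentials at all: disable the remote cache entirely.
    return {"PANTS_REMOTE_CACHE_READ": "false", "PANTS_REMOTE_CACHE_WRITE": "false"}


if __name__ == "__main__":
    for key, value in remote_cache_env(os.environ).items():
        print(f"{key}={value}")
```

Keeping writes off in the fallback is what addresses the poisoning concern: an untrusted PR could read cached results but could not publish any.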

enough-analyst-54434

04/26/2023, 3:42 AM

Yeah - you could go 18 routes to maybe sort of secure.

👍 1

broad-processor-92400

04/26/2023, 3:43 AM

yeah; lots of options and trade-offs. Thanks for walking through it with me

fast-nail-55400

04/26/2023, 7:08 AM

Two points: 1. For security reasons, auth for remote cache is restricted on forked repos. 2. I recall Toolchain had a ~45 minute expiration on the restricted access token issued for a run, so if the CI build takes longer than that, the token will be denied access from then on. (Could be longer but it was a fixed value.)
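To make the second point concrete, here is a small sketch of a fixed-lifetime token check. The 45-minute value is the approximate figure recalled above (which may be off), and `ASSUMED_TOKEN_TTL` and `token_is_valid` are illustrative names, not Toolchain's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumed lifetime, taken from the "~45 minute" recollection; the real value may differ.
ASSUMED_TOKEN_TTL = timedelta(minutes=45)


def token_is_valid(issued_at: datetime, now: datetime,
                   ttl: timedelta = ASSUMED_TOKEN_TTL) -> bool:
    """Return True while less than `ttl` has elapsed since the token was issued."""
    return now - issued_at < ttl


if __name__ == "__main__":
    issued = datetime.now(timezone.utc)
    for minutes in (10, 44, 46, 90):
        later = issued + timedelta(minutes=minutes)
        print(f"{minutes:>3} min into the run: valid={token_is_valid(issued, later)}")
```

Under this model, a shard that runs past the lifetime would start missing the cache partway through, even though it began with working auth.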

polite-garden-50641

04/26/2023, 2:33 PM

there is a special mechanism used to obtain an access token for PRs. the Toolchain backend will check who submitted the PR and a few other things. the toolchain backend will issue an access token only to users that are known by the toolchain backend, which in the pants repo's case means that the user is a member of the GH pantsbuild org and has logged into toolchain. Otherwise an access token won't be issued and the remote cache will be disabled for that PR.

👍 1

polite-garden-50641

04/26/2023, 2:35 PM

so for example, if Josh or Andreas (both are members of the pantsbuild GH org and have a toolchain user) submit a PR, the CI jobs will be able to use the remote cache, but if some other user does it, they won't.
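The allowlisting rule described above can be pictured as a simple membership check. This is a deliberately simplified sketch: it models only the two conditions named in the thread (org membership and having a Toolchain user), ignores the "few other things" the backend also checks, and uses invented names (`should_issue_token`, the sample users) rather than anything from the real backend.

```python
from typing import AbstractSet


def should_issue_token(pr_author: str,
                       org_members: AbstractSet[str],
                       toolchain_users: AbstractSet[str]) -> bool:
    """Issue a token only if the PR author is both an org member and a known Toolchain user."""
    return pr_author in org_members and pr_author in toolchain_users


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    org_members = {"alice", "bob"}
    toolchain_users = {"alice", "carol"}
    for author in ("alice", "bob", "carol", "dave"):
        allowed = should_issue_token(author, org_members, toolchain_users)
        print(f"{author}: remote cache {'enabled' if allowed else 'disabled'}")
```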

enough-analyst-54434

04/26/2023, 3:28 PM

Ah, thanks @polite-garden-50641 - I knew my PR re-runs were often fast. I forgot about that allowlisting setup.
