# development
a
I started looking into this issue because it would be very convenient for me if Pants could handle setting up `keyring` on its own. However, there's a bit of a thorny bootstrapping problem. My instinct, following the script I have that does this now, is to create a `keyring` subsystem, which is a `PythonToolBase` subsystem that can easily export a pex for `keyring` plus any additional dependencies, like `keyrings.google-artifactregistry-auth`. However, I believe the keyring pex needs to be added to the `PATH` for pex_cli processes, which `python_tool_base.py` has a dependency on. Furthermore, there's also the issue of needing to create a pex CLI process to build the keyring pex in the first place. Packaging keyring as an external tool is also problematic because it typically requires other Python dependencies to work with services like Google Artifact Registry. I also tried only importing the subsystem `if TYPE_CHECKING`, but that doesn't work with the rule decorator. I was curious if anyone had any ideas or opinions on how to work around this.
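For the curious, the failure mode looks roughly like this (a sketch only; all module and class names here are hypothetical, not actual Pants internals):
```python
# A sketch of the circular-import workaround that fails.
from dataclasses import dataclass
from typing import TYPE_CHECKING

from pants.engine.rules import rule

if TYPE_CHECKING:
    # Deferring the import avoids the circular module dependency at load time...
    from my_plugin.keyring import KeyringSubsystem


@dataclass(frozen=True)
class KeyringPexPath:
    path: str


@rule
async def find_keyring_pex(keyring: "KeyringSubsystem") -> KeyringPexPath:
    # ...but the rule decorator / rule-graph parser needs to resolve parameter
    # annotations for real at registration time, so the string annotation
    # can't be resolved and rule registration fails.
    return KeyringPexPath(path="...")
```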
c
Not sure if we mentioned this in the ticket, but the plan Tom and I were working with was something like:
• Set up the various keyring injection mechanisms you found
• Pants does the auth šŸ™€ with some new subsystem. Said subsystem can have whatever deps it needs, like a normal Pants subsystem
• Pants injects a binary called `keyring` into the sandbox that just echoes the results
g
Not sure if that is me or Dyas, but that is pretty much how my Bitwarden secrets work. It decrypts the secret either as a "package" or a "secret" depending on the ruleset used to get it, and then it can be passed in env, or as a file, or whatever the receiving tool desires. For example, for PyPI publishing it's used just like the twine secrets (in env), but instead of being sourced from the parent env it comes from another rule which invokes Bitwarden.
a
> Pants does the auth šŸ™€ with some new subsystem. Said subsystem can have whatever deps it needs like a normal pants subsystem
So my view is probably somewhat limited given I've only used this in practice with Google Artifact Registry, but for Artifact Registry there's no need to set credentials; it just relies on your global `gcloud` CLI config. Is it different for AWS?
c
AWS uses tokens that last for ~8 hours to connect to Docker/package registries. So you need to do something like `aws some-command` to refresh your credentials. AWS's docs suggest using that to spit out a new `~/.config/pip/pip.conf` every 8 hours, but:
• That's kinda gross
• Pants studiously avoids using `pip.conf`
• Details are fuzzy, but you obviously don't want the token to end up in the lockfile.
An alternative approach is to regen `.netrc` every 8 hours. That is annoying and fiddly and I thought my less technical users would rebel, but it doesn't require any bootstrapping shenanigans.
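For concreteness, the CodeArtifact flavor of that refresh dance looks roughly like this (domain, owner, region, and repo are placeholders; treat the whole thing as a sketch rather than exactly what AWS's docs prescribe):
```python
# Sketch: fetch a fresh CodeArtifact token and rewrite pip.conf.
# Run every ~8 hours, e.g. from cron.
import pathlib
import subprocess

token = subprocess.run(
    [
        "aws", "codeartifact", "get-authorization-token",
        "--domain", "my-domain", "--domain-owner", "123456789012",
        "--query", "authorizationToken", "--output", "text",
    ],
    check=True, capture_output=True, text=True,
).stdout.strip()

pip_conf = pathlib.Path.home() / ".config" / "pip" / "pip.conf"
pip_conf.parent.mkdir(parents=True, exist_ok=True)
pip_conf.write_text(
    "[global]\n"
    "index-url = "
    f"https://aws:{token}@my-domain-123456789012.d.codeartifact."
    "us-east-1.amazonaws.com/pypi/my-repo/simple/\n"
)
```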
a
That's frustrating and feels like some AWS-specific jankiness!
Could you wrap their keyring provider with a Pants-specific one that does the `.netrc` regen?
c
Maybe! For reasons unrelated to Pants, our solution became to not use the AWS package registry.
a
@curved-manchester-66006 I'm taking another stab at this; did you have any thoughts around how the injected `keyring` binary would echo those results? I'm wondering if we can export the keyring pex to some cached location and then just have the `keyring` binary be a script that calls said pex.
Another alternative I'm considering is building the `keyring` pex using a bash script that circumvents the existing rules and circular imports.
c
> I'm taking another stab at this, did you have any thoughts around how the injected keyring…
Trying to page the context back in, I think the idea was that in the sandbox there would be a file called `keyring` that was dynamically generated as something like:
```sh
#!/bin/sh
echo "THE_TOKEN_PANTS_JUST_CALCULATED"
```
(Or whatever format `pip` expects to get when calling a `keyring` binary)
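For reference, pip's "subprocess" keyring provider invokes the binary as `keyring get <service> <username>` and reads the password from stdout, so a slightly fuller fake (a sketch, written as Python rather than sh) might be:
```python
#!/usr/bin/env python3
# Fake `keyring` binary: answer pip's `keyring get <service> <username>`
# invocation by printing the password to stdout.
import sys

if len(sys.argv) >= 2 and sys.argv[1] == "get":
    # Pants would template in the token it just calculated.
    print("THE_TOKEN_PANTS_JUST_CALCULATED")
else:
    sys.exit(1)
```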
a
I opened a draft PR for this demonstrating the cyclical import issue. It seems like the workaround of doing the imports from within the rules themselves doesn't work with the rule-graph parser. Definitely looking for feedback on this, as there are major issues with my draft, starting with the simple fact that it does not work šŸ˜…. But I would love to actually have this feature in Pants; it would be super helpful for my org. I'm constantly fielding questions on setting up keyring even though there's a script that does exactly that in our repo šŸ™ƒ
f
For some context on how the (now closed, unmerged) PRs I had written for AWS CodeArtifact support operate:
• https://github.com/pantsbuild/pants/pull/21852 added a `PexKeyringConfigurationRequest` union which allowed any Pants plugin or backend to supply credentials to Pex invocations. This PR was not AWS-specific. It basically asked plugins for credentials and arranged for a `keyring` script to essentially echo those credentials.
ā—¦ Note: not quite as simple as an `echo`, though. There was some code involved to try and make sure that varying credentials would not invalidate caching for a `Process` (assuming all else was the same). This matters because varying the shell script's content in the input root leads to invalidation, given the cache key is computed on basically everything going into a `Process`.
ā—¦ Also, the PR does not deal with how to handle `remote_environment` and `docker_environment` builds. Some design work would be needed here.
• https://github.com/pantsbuild/pants/pull/21853 adds AWS CodeArtifact support. It tracks the expiration of the CodeArtifact token and renews it before a run when necessary. It provides an implementation of the `PexKeyringConfigurationRequest` union.
GCP support would just entail building on top of the first PR. I imagine it would be much simpler than the AWS PR since it could obtain the token locally.
Also, no issues from me if anybody wants to reuse the code from closed PRs. They were closed for business reasons, not technical ones.
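To illustrate the caching point (a sketch only; the env var name is made up, not what the PR actually used): the injected script's bytes stay constant across token rotations, and the varying token lives outside the input root, so it never perturbs the `Process` cache key.
```python
# Hypothetical: a keyring shim whose content never changes, so its digest
# (and therefore the Process cache key) stays stable across token rotations.
STABLE_KEYRING_SHIM = """\
#!/bin/sh
# Same bytes every run => same digest => cache key unaffected; the varying
# token lives outside the sandbox's input root.
cat "${PANTS_KEYRING_CREDENTIALS_FILE:?not set}"
"""
```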
a
What if we had a binary called something like `pants-keyring-helper` that had the ability to store credentials in a cache-accessible location (e.g. named volumes for `docker_environment`), and we would call it before calling the pex CLI? Then we'd link it on the `PATH` as `keyring`, and it would be environment-aware and know how to find the stored credentials for that environment.
Thinking on this some more, we'd need to do something like the following to store the credentials ahead of time:
```python
from dataclasses import dataclass

from pants.engine.rules import Get, rule
from pants.engine.unions import union


@union
class GenerateCredentialsRequest:
    """Union base class to generate credentials."""


@dataclass(frozen=True)
class CredentialsResult:
    key: str
    contents: bytes


@rule
async def generate_credentials(request: GenerateCredentialsRequest) -> CredentialsResult:
    # Dispatch to whichever union member knows how to produce credentials.
    credentials = await Get(CredentialsResult, GenerateCredentialsRequest, request)
    # Call into some (probably Rust?) function to store the credentials outside of a sandbox.
    # store_credentials(credentials)
    return credentials
```
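And then a hypothetical union member for GCP could plug in like this, building on the sketch above (the class and rule names are made up; the `UnionRule` registration is the standard Pants pattern):
```python
from dataclasses import dataclass

from pants.engine.rules import collect_rules, rule
from pants.engine.unions import UnionRule


@dataclass(frozen=True)
class GoogleArtifactRegistryCredentialsRequest(GenerateCredentialsRequest):
    """Hypothetical union member for Google Artifact Registry."""


@rule
async def generate_gar_credentials(
    request: GoogleArtifactRegistryCredentialsRequest,
) -> CredentialsResult:
    # In reality this might run `gcloud auth print-access-token` via a
    # Process; hardcoded here to keep the sketch self-contained.
    return CredentialsResult(key="oauth2accesstoken", contents=b"TOKEN")


def rules():
    return [
        *collect_rules(),
        UnionRule(GenerateCredentialsRequest, GoogleArtifactRegistryCredentialsRequest),
    ]
```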
f
The first PR already does that (from Python). It picks a subdirectory of the repository's `.pants.d` directory and stashes the credentials there.
(That of course doesn't solve the `docker_environment` issue, but it's an example.)
a
Yeah, I was thinking somewhere like `~/.cache/pants/credstore` or something like that for `docker_environment`, which we could have the containers mount. However, that probably wouldn't work for remote execution, which I'm assuming just uses local storage on the remote machine.
f
Yeah, remote execution is a problematic edge case. It's not really a Pants problem either, since the REAPI protocol makes any injected configuration value change the cache key.
For GCP, the server operator could potentially authorize the worker machines to access a repository (and put the credentials in a known location). But the rotating CodeArtifact access token is a bit of a problem with AWS.
a
Given that, as you said, if you're using remote execution you can probably configure the remote machines to authenticate themselves, is it fair to design a solution that solves for local and docker and punts on remote?
f
I believe so in this case, so long as the documentation calls out the issue with remote execution and that it likely requires server operator cooperation to solve.
Still an open question to me: is the trade-off of putting an auth token into an input root or environment variable, with the resulting increased security risk, worth it as a fallback so that remote execution can still work with a credentialed artifact repository?
I can imagine some people will argue for the fallback, but my own opinion is that it hides the security issue from an "ordinary" Pants user using remote execution. Then again, I assume people using remote execution are not the "ordinary" user and hopefully perceive the trade-off.
So not supporting remote execution and failing early highlights the inherent security issue.
But that's my take on a fundamental design choice for this feature.
a
Environment variable would certainly be easy to implement. You could make it an opt-in feature for remote execution.
f
So my choice had been to punt, and so of course I agree with your choice to potentially punt. :)
Yeah, opt-in would at least force the user to tacitly confront the security issue, even if it's just them shrugging it off.
Too bad there isn't a spec for a REAPI secrets service.
And also too bad REAPI doesn't really spec out named caches or persistent workers.
a
My employer does now pay for premium support from Google šŸ¤”šŸ˜…
f
So there is some friction between how Pants perceives executing processes remotely and how the REAPI models it.
Some sort of opt-in to a fallback seems reasonable.
g
Does marking something as uncacheable propagate over REAPI? It's what I do in my plugin when I decrypt a secret. Haven't thought much about REAPI in that regard.
(I'd be equally worried about cache busting with SLTs as about the security, too.)
f
> Does marking something as uncacheable propagate over REAPI?
As it relates to storing a secret beyond the immediate need? Not really, for a few reasons:
• The `skip_cache_lookup` member of `ExecuteRequest` applies to whether the remote execution system short-circuits the execution by returning an already-cached `ActionResult`. See the spec.
• The choice to use the cache is on the client (i.e., Pants) calling or not calling into the REAPI `GetActionResult` API. If Pants asks for remote execution, then the remote execution system will store the `ActionResult` in the Action Cache (subject to application of the `skip_cache_lookup` bool). If Pants decides something is uncached, it just means Pants decided not to look in the remote cache; asking for remote execution will still result in the CAS having the inputs, and the Action Cache will have the resulting `ActionResult`.
• The secret will be stored in the `Command` proto (for an environment variable) and uploaded to the CAS, or in a file in the input root, which is itself put in the CAS. So it really depends on the server's GC algorithm as to when those would be purged.
There is a `ResultsCachePolicy` with a "priority" value for how long to cache an `ActionResult`, but that is interpreted in a server-specific way.
Off the top of my head, I wonder if the Remote Asset API could be co-opted to support secrets resolution, but beats me. (The Remote Asset API introduces some indirection which might not invalidate the cache key if the indirect name is stable, but beats me without doing actual research into it.)
(Or my idea could entirely miss the point of the Remote Asset API.)
Anyway, just ideas, and way beyond the smaller scope of a solution applicable to local and docker environments.
a
I made some updates to my draft PR, but I'm pretty sure this won't work as-is for `docker_environment`, plus it feels very hacky doing filesystem ops this way: https://github.com/pantsbuild/pants/pull/22370/files#diff-6486557a2dc478e17102593a7d055c9713ee262a3ab8ed6552399ec878206532R49-R64