# plugins
f
Hmm, so my ORAS plugin would have to consider remote state in the registry when figuring out what to push, which means that my rules are going to be pretty impure. Is this a fundamental problem or would it be OK? I am guessing that the Get(ProcessResult) calls in the rules would invalidate the cache and make sure that I don’t reuse stale remote state?
Plugin is coming along btw, I am now able to push stuff to an ORAS registry using Pants! Thanks again @gorgeous-winter-99296 and @curved-television-6568 for your excellent help!
I have to say, my first impression of the Pants architecture was pretty intimidating, but now that my head is adjusting to it, I think it’s really cool, and fun to program with!
g
Hmm. I'd say "depends". Rule runs are cached (inside a pantsd session) based on the inputs. So you get a tree of dependencies... same request and same other-inputs (e.g. packages in your case?) = whole run is elided and cached response is returned. Process runs are cached (even across sessions) based on the Process parameters. You can configure caching a bit more exactly both on the process (never, in-session, across-session) with ProcessCacheScope. For rules, you can subclass EngineAwareReturnType for your return type and implement various functions on that to control caching behavior.
The way I've dealt with any caching issues so far is pretending they can't exist unless I know exactly how to deal with it up front. Then when I have an actual issue, I handle it when I know more about the failure modes. 😛
In practice, this means that the only place I've changed caching setup while writing the code was for my secrets plugin, because secrets should never be cached if it can be avoided.
f
What I am thinking about in this case is how to handle ORAS attachments. ORAS lets you attach any number of files to a given artifact. The problem is that the operation is not idempotent (i.e. repeated attach calls will create new attachments rather than overwrite). I thought that I might make the call idempotent by checking for the existence of an attachment with the same media type before either overwriting or failing out, but that would mean making a call to the registry based on e.g. a ManifestReference with registry + tag + media_type.
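A minimal sketch of that check-before-attach idea, using an in-memory stand-in for the registry. Everything here is hypothetical: FakeRegistry, attach_once and the fields on ManifestReference are made up for illustration and are not ORAS or Pants APIs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ManifestReference:
    # Hypothetical lookup key: registry + tag + media_type.
    registry: str
    tag: str
    media_type: str


@dataclass
class FakeRegistry:
    """In-memory stand-in for a registry; not an ORAS client."""

    attachments: dict = field(default_factory=dict)

    def attach(self, registry: str, tag: str, media_type: str, content: bytes) -> None:
        # Mimics the non-idempotent behaviour: every call appends a new attachment.
        self.attachments.setdefault((registry, tag), []).append((media_type, content))

    def has_attachment(self, ref: ManifestReference) -> bool:
        existing = self.attachments.get((ref.registry, ref.tag), [])
        return any(media_type == ref.media_type for media_type, _ in existing)


def attach_once(remote: FakeRegistry, ref: ManifestReference, content: bytes) -> bool:
    """Attach only if no attachment with this media type exists yet.

    Returns True if something was pushed, False if the call was skipped.
    """
    if remote.has_attachment(ref):
        return False
    remote.attach(ref.registry, ref.tag, ref.media_type, content)
    return True
```

Note that the check and the attach are two separate calls, so someone else can still change the registry in between.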
A hacky way to do this would be to make the request for information about a manifest be based on a timestamp or something, that would cause it to never be cached, right?
(this is a fun way to make FP people tear their hair out, just tell them that any function is pure as long as it accepts a timestamp)
🤣 1
g
Yeah. Though if you make the output uncacheable that has the same effect and is less hacky, I'd say.
f
Totally! And that is done by subclassing mentioned above, I’ll have a look!
So all of this does not sound like a bad idea / bad match for Pants?
Oh, pants.engine.engine_aware is an interesting module.
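For reference, a rough sketch of what an uncacheable return type could look like. RemoteManifestState and its fields are hypothetical names; the only Pants piece assumed here is EngineAwareReturnType with its cacheable() hook, and the fallback stub is there only so the snippet runs without Pants installed. Worth checking against the Pants version in use.

```python
from dataclasses import dataclass

try:
    from pants.engine.engine_aware import EngineAwareReturnType
except ImportError:
    # Fallback stub so the sketch runs outside a Pants environment.
    class EngineAwareReturnType:
        def cacheable(self) -> bool:
            return True


@dataclass(frozen=True)
class RemoteManifestState(EngineAwareReturnType):
    """Hypothetical return type for a rule that queries the registry."""

    reference: str
    attachment_media_types: tuple

    def cacheable(self) -> bool:
        # Remote state can change between runs, so ask the engine not to reuse it.
        return False
```

As far as I understand it, an uncacheable value is recomputed in each session rather than on every single request, so this narrows the staleness window but does not remove it.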
c
I would prob go with a rule that fetches the current state in ORAS as input to the publish rule, so it knows what needs to be done. And if the state changes, the publish rule is invalidated.
🙏 1
f
Agreed! But the input to that rule (i.e. the reference to the Artifact as name + tag) would not change even though the information might be stale?
So it would have to return something non-cacheable?
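To make the staleness concern concrete, here is a toy model in plain Python. It is not how the Pants engine is implemented; remote_tags, lookup_digest and the cacheable flag are invented for illustration. It only shows that a result memoized on (name, tag) will not notice a change on the remote side.

```python
# Pretend registry state, keyed by (name, tag). Purely illustrative.
remote_tags = {("demo", "v1"): "sha256:aaa"}

# Memo table keyed on the inputs only, like an input-keyed cache.
_memo = {}


def lookup_digest(name: str, tag: str, cacheable: bool = True):
    """Return the digest for name:tag, optionally reusing a memoized answer."""
    key = (name, tag)
    if cacheable and key in _memo:
        # Same inputs as before, so the remote is not consulted again.
        return _memo[key]
    digest = remote_tags.get(key)
    if cacheable:
        _memo[key] = digest
    return digest


first = lookup_digest("demo", "v1")
remote_tags[("demo", "v1")] = "sha256:bbb"  # someone pushes a new artifact
stale = lookup_digest("demo", "v1")  # memoized, still the old digest
fresh = lookup_digest("demo", "v1", cacheable=False)  # re-reads the remote
```

With the same inputs the memoized call keeps returning the old digest, which is why the lookup result itself would need to be marked uncacheable.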
c
I’ve no idea how ORAS works, is there no additional metadata you can provide for the attachments?
or perhaps a checksum on the content or so?
f
I am also sort-of learning it as I go here. An attachment can be uploaded multiple times and attached to the same repository under different sha-sums even if the actual content (layer) and declared media-type is the same. This was a bit puzzling to me as I expected it to be idempotent, but I guess the sha-sum of the attachment includes the timestamp in the config for each attachment.
g
Isn't that still a TOCTOU problem? If we are trying to push artifact foo.pex with digest 0xDEADBEEF, it might've existed last time we tried publishing but have been deleted since.
f
Yes, all of this would be much easier if we knew that no one could tamper with the registry between each publish.
I think I’ll put attachments off a bit and just use Artifacts. Pushing an artifact is an idempotent operation so I won’t have to worry as much about remote state.
Sorry about maybe being a bit inconsistent with terms here, ORAS terminology is still a bit confusing to me. Attachments are also just artifacts, but with a subject field in their configuration.
You can check out the updated (functional) plugin here if you are interested. I refactored so that it looks more like other Pantsbuild plugins: https://github.com/Peder2911/pants-oras-plugin