# general
n
Has cache-read/write-from/to moved into a backend or something (for the binary and test goals, python)? Noticed v2 is no longer using the remote cache we had set up. And I no longer see the options in --help.
w
so: a few things here.
1. remote caching has never really worked for python in v1… whoops. that API was very challenging to use correctly.
2. pants v2 supports remote execution and remote caching using a completely separate interface… but it currently only does so _together_… so there is no way to use “just remote caching without remote execution”
n
Not to spend too much time disagreeing, but we at least observe enough variability in build times historically to conclude it does... something.
w
3. local caching should continue to work fine in v2
n
Gotcha on pt 2. That'll be challenging to set up in the constraints of our current system but I hear you.
w
and finally, v2 will become 2.0 at some point this year, and we want to have all of these questions answered before then: interested in your thoughts on (2) in particular
n
Local caching works great. This is for the hermetic CI build. Maybe we'll use persistent storage or something to point it at. Although not sure how read once write many is gonna work if at all.
w
it’s likely that for (2) we can tweak things so that “remote caching only” is an option. but it will still be the new interface
Although not sure how read once write many is gonna work if at all.
what do you mean?
n
There are many Jenkins workers running, so if we use a persistent EBS volume or something, I'm not sure how they'd all get the same one.
I guess that's just everything many...
We had it hooked up to a nexus repo. I'm not sure if it opined / cared about the interface. I didn't do that bit so I have limited insight.
But it did reduce build times drastically when rebuilding the universe on minimal changes. So I assume it did something successfully.
w
hm, so when you say that you had a remote cache set up in v1, do you mean an http cache? or the local cache pointed at a shared directory on disk?
n
Trying to use the built-in git understanding stuff is also not easy because of our flows. It works on PRs just fine but would be tricky for something that tests master, as origin/master is already updated. And we don't squash merge.
So you don't know how many commits ago to test.
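A rough sketch of one way around the "how many commits ago" problem, assuming the CI job can record the last commit it built successfully. `last_built` and both function names are hypothetical and not part of pants. Because `git diff A..B` compares the two endpoint trees, the number of commits in between (merge commits included) doesn't matter.

```python
import subprocess
from typing import List, Optional


def diff_command(last_built: str, head: str = "HEAD") -> List[str]:
    # Build the git invocation that lists files differing between the
    # last successfully built commit and the commit under test.
    return ["git", "diff", "--name-only", f"{last_built}..{head}"]


def changed_files(last_built: str, head: str = "HEAD",
                  cwd: Optional[str] = None) -> List[str]:
    # Run git in the given checkout and return the changed paths.
    # Raises CalledProcessError if git fails (e.g. unknown commit).
    result = subprocess.run(
        diff_command(last_built, head),
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Where `last_built` gets stored (a file on the CI master, a tag, a build parameter) is left open here; that part depends on the Jenkins setup.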
http
w
ok, got it.
n
Also, we like to output all artifacts per git hash for completeness, so "build them again, but using the cache" is more desirable than "don't build these targets".
w
i’ll open a ticket about the v2 remote-caching-only item
n
Thank you. It's clear from local cache performance that it would improve our average CI global build drastically, an order of magnitude at least.
w
after that we’ll actually need a remote store to point things at… because as i mentioned, it’s a gRPC API rather than the old HTTP interface
(yay change! … but seriously: it’s better, heh.)
n
I see. Is the cache server a binary that comes out of pantsbuild and just needs a home somewhere, something that has to be built on our end as per some protobuf defs, or something else off the shelf entirely?
w
sorry, had some errands to run
so: this is actually a nascent standard: we use the same remote execution API and storage interface as bazel.
there are a few different open source backends, and there are some hosted options.
so some of the backends are Buildbarn, Buildfarm, etc.
and Google’s RBE is a hosted option. Toolchain (the company i work for) is also polishing up a hosted option right now.
but for a sufficiently small setup, a standalone binary would be fairly easy to knock together using the components in pants
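For intuition about the storage side of that API: blobs are content-addressed, meaning each one is keyed by a digest made of its hash and size. Below is a toy in-memory sketch of that idea, not the gRPC service itself and not code from pants. The class and method names are made up for illustration, and SHA-256 is assumed as the hash function.

```python
import hashlib
from typing import Dict, NamedTuple, Optional


class Digest(NamedTuple):
    # Identifies a blob by content: hex hash plus length in bytes.
    hash: str
    size_bytes: int


class InMemoryCAS:
    """Toy content-addressable store (illustrative only)."""

    def __init__(self) -> None:
        self._blobs: Dict[Digest, bytes] = {}

    def put(self, data: bytes) -> Digest:
        # The key is derived from the content, so storing the same
        # bytes twice yields the same digest and a single entry.
        digest = Digest(hashlib.sha256(data).hexdigest(), len(data))
        self._blobs[digest] = data
        return digest

    def get(self, digest: Digest) -> Optional[bytes]:
        # Returns None on a cache miss.
        return self._blobs.get(digest)
```

Since keys come from the content, many CI workers writing the same artifact concurrently all land on the same entry, which is part of why a shared store works for the "everything many" case.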
@numerous-fall-96475: let me know if one of the followups above sounds helpful. we’d like to make this easy
n
Yep. If you guys give us the flag we'll stand something up to host the cache.
I don't think I have enough mental buy-in for centralized building. We're still kind of converting hearts and minds to the monorepo. The majority of the work still occurs in distributed micro repo + pypi + nexus as local mirror + a million requirements and setup.py files. So, baby steps.
w
got it.
yea, sounds good. i’ll follow up on https://github.com/pantsbuild/pants/issues/9719