# development
h
does anyone have any experience with submitting changes to the bazel remote build protobufs and getting them merged? context is https://github.com/pantsbuild/pants/pull/9198#pullrequestreview-371104078
I'm inclined to think that getting a notion of platform defined in the bazel protobufs and accepted by those maintainers (google, right?) is a lot of overhead
but I'm curious if anyone has tried to do this before
a
there’s a weekly remote execution api working group where these sorts of extensions are proposed
👍 1
trying to find it now idk what to search + headache but i’ll try to find it later
sorry
we’ve proposed changes before based on implementation decisions in scoot
a
We’ve made some changes in the past, people are generally pretty reasonable
There’s already a notion of platform standardised somewhere, so I think it would “just” be a matter of re-using it…
h
I think that would amount to a one-line addition here: https://github.com/bazelbuild/remote-apis/blob/1b16ed76965afa8bb229ba22b90745c546416443/build/bazel/remote/execution/v2/remote_execution.proto#L792 , but I also think that just wrapping this ourselves (at least initially) will take less time than any kind of cycle of getting a patch merged upstream, even if they're prompt about it
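For concreteness, that kind of one-line addition might look something like this (purely illustrative - the containing message, field name, and field number are all guesses, not what's actually at that line of the proto):

```protobuf
message ActionResult {
  // ... existing fields elided ...

  // Hypothetical addition: record which platform the action actually
  // executed on, re-using the Platform message already defined in
  // remote_execution.proto.
  Platform execution_platform = 11;
}
```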
a
Sounds like doing both in parallel would be pretty reasonable 🙂
That said, we should maybe think about versioning, if we’re going to change the on-disk format…
h
hm, that's a good point. have we ever changed the on-disk format before?
I don't know a whole lot about the internals of lmdb
a
We haven’t, as far as I know
In LMDB, within a file, databases have names, so we could switch the name to be the name of the format
☝️ 1
a
i like that idea
a
In the `ShardedLmdb` constructor, you’re looking at the `Some("content")` and `Some("leases")` arguments
👍 2
Those I think are database names
h
ah yeah we're using a specific git checkout of the rust lmdb bindings. is there a reason for that?
a
until the rust ecosystem is full of mission-critical high-quality open source crates, someone has to make the crates mission-critical. some of these projects have unclear maintenance, and it wasn’t clear whether we’d need to make further changes down the line, so we didn’t try to close the loop yet. i think it’s probably correct to call it tech debt, but there’s no immediate reason to fix it yet
h
so, hm, if I just changed that name, then once the change hit master, pants would ignore the database name that currently existed and always try to create a new one
so we'd invalidate any existing db and nothing would clean it up
but I don't think that would break anything?
a
as in, is that what the code does right now, or are you asking whether that is what would happen?
h
I think that's what would happen if I did this naively
or, hm, what if I created a new `ShardedLmdb` object on `CommandRunner`, just for platforms?
a
there are also recurrent concerns in #general and from users within twitter that our lmdb usage doesn’t gc file descriptors as much as it could, and that we hit against limits during normal use. if we’re considering cleaning up other parts of the impl, i’d also consider taking a look at that
creating a new ShardedLmdb on a CommandRunner doesn’t sound bad actually, but i would usually expect to add the table and then access it through the `store` crate instead of making a new one in the process_execution crate itself
h
`Store` is meant to store file contents, it looks like. is it adaptable to store the new `Platform` type?
right now when we construct the `cache.rs` `CommandRunner` we pass in a `ShardedLmdb` to the `new` method
fyi @aloof-angle-91616 trying to upgrade the `lmdb` crate used in `sharded_lmdb` to the latest version of https://crates.io/crates/lmdb results in some API changes around cloning the `env`
a
hm ok
and thank you for mentioning we just pass in a sharded db to the new method, that should be fine to do here as well, forget what i said
API changes that you’ve figured out or that are causing compile errors?
h
specifically https://github.com/pantsbuild/pants/blob/master/src/rust/engine/sharded_lmdb/src/lib.rs#L255 <- there's no longer an `EnvironmentCopyFlags` type
and i'm not 100% sure what this code is meant to do to begin with
I think I'll just create another lmdb instance called "processes" or something, and pass it in in `context.rs`
I'm a little concerned that this isn't the cleanest way to do this, and we're only doing it because we don't have a good lmdb schema upgrade path
a
i remember the sharded lmdb code being pretty difficult to understand in a lot of places, especially the boolean lease parameter
i don’t understand the concern you’re expressing
h
is it a problem to create arbitrarily many instances of `ShardedLmdb`?
performance-wise or otherwise?
it looks like every instance of ShardedLmdb maps to a new directory in `~/.cache/pants/lmdb_store`
I'm not sure how much overhead this implies
a
we are already not cleaning up file descriptors and people are seeing issues during normal pants operation as a result
if you’re adding a new feature, i generally don’t worry about performance until there’s a need identified to make it faster
h
hm, I just realized - there's no way around handling this as a schema upgrade
a
if you’re saying we’re considering not creating a new db because of possible performance concerns, i don’t think that’s an issue and i think the separation of logic by creating a new db is what we would want
h
if we're going to start tracking Platform in lmdb, what do we do for entries stored before this commit, that just don't have this information?
a
i’m extremely partial to just killing every entry from older schemas
h
if this change necessarily breaks every existing pants lmdb store out there, then there's no reason I think not to just store the new `Platform` parameter in the existing `ShardedLmdb`
I wonder what the user experience for that is gonna look like though
`store` and `load` in `cache.rs` are just gonna fail sometimes if they read an older entry or can't properly deserialize one from bytes
a
does it make sense to have a platform parameter in every shardedlmdb instance?
which is why making a new shardedlmdb sounds very reasonable
h
well, in every shardedlmdb instance that currently stores a bazel `ExecuteResponse`
hm, I guess I could write the `lookup` function there to try to deserialize the new version of the type - with a `Platform` on it - from the raw `Bytes` we store in lmdb, and just return None if it can't, same as if there was just no cache entry for this yet
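a lenient lookup along those lines might look roughly like this (a sketch with made-up names and a toy encoding, not the actual `sharded_lmdb` code):

```rust
// Hypothetical sketch (not the real pants code): a lenient `lookup`
// that tries to decode the newer entry format - a leading platform tag
// byte followed by the serialized response - and treats anything it
// can't decode as a plain cache miss instead of an error.

#[derive(Debug, PartialEq)]
pub enum Platform {
    Linux,
    Darwin,
}

#[derive(Debug, PartialEq)]
pub struct CachedResult {
    pub platform: Platform,
    pub payload: Vec<u8>,
}

/// Returns None for empty, old-format, or otherwise undecodable
/// entries, which callers then handle exactly like a cache miss.
pub fn lookup(bytes: &[u8]) -> Option<CachedResult> {
    let (&tag, payload) = bytes.split_first()?;
    let platform = match tag {
        0 => Platform::Linux,
        1 => Platform::Darwin,
        // Unknown tag: probably written under the old schema.
        _ => return None,
    };
    Some(CachedResult {
        platform,
        payload: payload.to_vec(),
    })
}
```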
so then the user experience would be, having to re-run every old EPR that was made before they upgraded to the version of pants with this change
but I think that might happen anyway?
if you upgrade pants in general doesn't that invalidate lmdb?
a
Some unstructured thoughts:
1. The reason we’re on a forked version of lmdb is because the maintainer has been ignoring pull requests for about two years. We’re on exactly master + https://github.com/danburkert/lmdb-rs/pull/43
2. No, we don’t invalidate things on pants releases - one of the nice things about the process execution API is that it doesn’t matter what version of pants you used, if it’s running the same process, we’ll hit the same cache.
3. Deleting the old LMDB and just using a new one with a new schema would be reasonable. But we haven’t built any mechanism to do this, and sometimes people go backwards in time in terms of pants releases, as well as forwards. The nearest equivalent would be to pick a new name for the file and ignore the old one, or to pick a new database name. We can leave the deleting to users, but also, the process execution lmdb should be trivially small for everyone right now, as it’s only really used for cloc (except by us pants developers)
4. It occurs to me that we actually may want platform in the cache key rather than value. Or both. I’m not sure whether Platform is already mixed into the cache key, in terms of a platform property. But if I want to ask for “a pex built on Linux” I can’t really ask for that if it’s not in the key…
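Sketching point 4 for concreteness (hypothetical code, not anything that exists in pants today):

```rust
// Hypothetical sketch: mix the platform into the *key*, not just the
// value, so a consumer can ask for "this action's result, as executed
// on Linux" directly.

#[derive(Clone, Copy)]
pub enum Platform {
    Linux = 0,
    Darwin = 1,
}

/// Append one platform byte to the 32-byte action fingerprint,
/// yielding a 33-byte LMDB key.
pub fn platform_key(action_fingerprint: &[u8; 32], platform: Platform) -> [u8; 33] {
    let mut key = [0u8; 33];
    key[..32].copy_from_slice(action_fingerprint);
    key[32] = platform as u8;
    key
}
```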
h
@average-vr-56795 if we're going to start introducing some kind of notion of schema version, it might be good to do that in a single commit that also renames the lmdb filename, so we never have to deal with trying to read incompatible data from the pre-schema-version version of the store
a
One of the joys of protobuf’s design is that if we’re using a protobuf as a wrapper, that is a form of schema versioning, as long as we’re happy with its constraints…
But if we’d like another mechanism for versioning a schema, I agree that rolling it into this PR sounds good 🙂
h
yeah, I don't think it's a good idea to always count on protobuf versioning; and in any case I don't think there's any way to implement the Platform change we need without some kind of schema-breaking change
even if we use our own protobuf for this, we'll still break things once
a
Yeah 🙂 Sounds good
h
did that suggestion of creating a global constant integer `SCHEMA_VERSION` in the `sharded_lmdb` code and hashing that integer into something that can be XOR'd with the 32 bytes of the existing `Fingerprint` seem reasonable?
a
I’d be tempted to do it as a prefix byte to the 32 bytes, rather than an xor - makes it easier for us to remove stale entries. But the general approach sounds reasonable, yeah 🙂
h
hm, that's easier to implement too
is there any lmdb-related reason why we can't just add one or two bytes, have a 33 or 34 byte key?
a
No reason at all 🙂
Treat LMDB as just a mapping from arbitrary byte-array to arbitrary byte-array :)
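e.g., treated that way, the prefix-byte scheme is just (a minimal sketch - the names are made up, not the real pants code):

```rust
// Sketch of the prefix-byte idea: bump SCHEMA_VERSION whenever the
// on-disk format changes, and entries written under an older schema
// simply never match. Stale prefixes are also easy to scan for and
// delete later.

pub const SCHEMA_VERSION: u8 = 1;

/// Prefix the schema version byte onto the 32-byte fingerprint,
/// producing the 33-byte key actually handed to LMDB.
pub fn versioned_key(fingerprint: &[u8; 32]) -> [u8; 33] {
    let mut key = [0u8; 33];
    key[0] = SCHEMA_VERSION;
    key[1..].copy_from_slice(fingerprint);
    key
}
```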
h
awesome