# development
w
it looks like when we produce `Tree`s for the `output_directories` of `ActionResult`s, we store them in the `files` database (we have `files` and `directories` labeled databases currently). this mostly works for local access (since they are looked up in the same place), but it fails to upload in `ensure_remote_has_recursive` for a remote store, because we don’t recognize the blob as a `Tree` (it’s labeled as a file)
it seems like there are a few options here, but regardless of implementation, we will likely be using more `Tree`s and fewer `Directory`s over time (and maybe eventually deprecating `Directory`s entirely…?)
so, two options i see:
1. replacing the `directories` database with a `protos` database, which would store an outer envelope proto containing oneof `Directory`, `Tree`, `Command`, `Action` (all the protos we store in the `files` database currently)
2. creating a `trees` database in addition to the `directories` and `files` databases, and calling it a day
i started implementing option 1, and i think it likely makes more sense than continuing to grow the number of databases we have. it would also potentially allow `ensure_remote_has_recursive` to recursively upload those other types and remove some manual code around that
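A minimal sketch of what option 1 could look like, written in Python purely for illustration. The names `EntryKind`, `StoreEntry`, and `ProtosDb` are hypothetical and not taken from the actual store code, and an in-memory dict stands in for the real database. The point is only that the kind recorded at store time comes back with the bytes, so an uploader would no longer mistake a `Tree` for a file.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class EntryKind(Enum):
    """Hypothetical tags mirroring the oneof members named above."""
    DIRECTORY = "directory"
    TREE = "tree"
    COMMAND = "command"
    ACTION = "action"


@dataclass(frozen=True)
class StoreEntry:
    """Hypothetical envelope: the kind is stored alongside the serialized proto."""
    kind: EntryKind
    payload: bytes


class ProtosDb:
    """In-memory stand-in for the proposed `protos` database (assumption: keyed by SHA-256)."""

    def __init__(self):
        self._entries = {}

    def store(self, kind, payload):
        # The caller knows the type when storing, so it is preserved here.
        fingerprint = hashlib.sha256(payload).hexdigest()
        self._entries[fingerprint] = StoreEntry(kind, payload)
        return fingerprint

    def load(self, fingerprint):
        # Lookup needs only the fingerprint; the kind is read back from the value.
        return self._entries[fingerprint]


db = ProtosDb()
tree_digest = db.store(EntryKind.TREE, b"serialized Tree bytes")
entry = db.load(tree_digest)
```

With value tagging like this, a recursive upload could branch on `entry.kind` after loading, rather than guessing from which database the blob came.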
c
Sounds like 1 is the correct approach, and 2 the easy one ;)
😅 1
w
cc @fast-nail-55400, @average-vr-56795, @enough-analyst-54434
a
With 1, are you thinking of tagging what the proto type is somehow?
We originally separated them for two reasons:
1. To avoid needing to validate them every time we deserialise them
2. Because they have different garbage collection characteristics
So I'm curious how we'd achieve 1 - we could validate more often, we could tag that they've been validated (either in the LMDB entry or somewhere else)...
w
Yea, I was thinking that the envelope would be `message StoreTypes { oneof field { ... } }`. My thinking was that anytime we store a blob we know its type, so we can preserve that.
Knowing the type is a prereq for any solution though? Can't choose to put it in a hypothetical `trees` database without knowing its type already.
But I do think that continuing to keep the `files` database separate and untagged probably makes sense.
a
Yeah, either oneof, or... IIRC @hundreds-breakfast-49010 added a schema mixin to the key so we could just add a bitset to the key
w
yea, true. key tagging vs value tagging is a bit different: you have to know the type when looking it up.
but honestly, i think that maybe `ensure_remote_has_recursive` doesn’t need to have the API it does currently, where it takes only `Digest`s… i’m pretty sure a caller will already know the types of the digests (because it got them out of an `Action` from a particular field, etc), which would avoid the need to tag them for that purpose, and we’d just tag for validation
and at that point: yea, key tagging would work.
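A rough sketch of the key-tagging variant combined with a typed upload API, again in Python for illustration only. `TypedDigest`, `KeyTaggedDb`, and `ensure_remote_has` are hypothetical names, the remote is modeled as a plain dict, and recursion into children is left out. It shows the trade-off mentioned above: with the kind in the key, the caller must supply the type at lookup time.

```python
import hashlib
from enum import Enum
from typing import NamedTuple


class EntryKind(Enum):
    FILE = 0
    DIRECTORY = 1
    TREE = 2


class TypedDigest(NamedTuple):
    """Hypothetical digest that carries the kind the caller already knows."""
    kind: EntryKind
    fingerprint: str


class KeyTaggedDb:
    """In-memory stand-in for a database whose key includes the kind."""

    def __init__(self):
        self._blobs = {}

    def store(self, kind, payload):
        digest = TypedDigest(kind, hashlib.sha256(payload).hexdigest())
        self._blobs[digest] = payload
        return digest

    def load(self, digest):
        # The same fingerprint under a different kind is a miss.
        return self._blobs.get(digest)


def ensure_remote_has(local, remote, digests):
    """Upload each typed digest missing from `remote`; returns what was uploaded."""
    uploaded = []
    for digest in digests:
        payload = local.load(digest)
        if payload is None:
            raise KeyError(f"{digest.kind.name} {digest.fingerprint} not found locally")
        if digest not in remote:
            remote[digest] = payload
            uploaded.append(digest)
    return uploaded


local = KeyTaggedDb()
remote = {}
tree = local.store(EntryKind.TREE, b"serialized Tree bytes")
first = ensure_remote_has(local, remote, [tree])
second = ensure_remote_has(local, remote, [tree])
wrong_kind = local.load(TypedDigest(EntryKind.FILE, tree.fingerprint))
```

Here the second call uploads nothing because the remote already has the blob, and looking the `Tree` up as a `FILE` misses, which is the "you have to know the type when looking it up" property of key tagging.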