<#18048 Size split the local store> Issue created ...
# github-notifications
#18048 Size split the local store

Issue created by stuhood

Currently, all blobs (no matter their size) are stored in LMDB. But that is not necessarily optimal for larger blobs.

• LMDB's API is not `async` friendly, and so streaming large files into and out of LMDB is challenging (leading to issues like #17065).
• In cases where files would be hardlinked via #17878, they must first be materialized back into large files, which represents a temporary second copy of the file.
• While capturing large files, we don't have the option of doing so destructively by moving a file into the store.

* * *

To resolve this, the `local::Store` could be split at a size threshold, such that small files continued to be stored in LMDB, but large files were stored directly as content addressed files (in storage similar to our existing immutable inputs storage).

This would involve (in no particular order):

• introducing a store directory for large files, with a layout similar/identical to immutable inputs
• adding a size threshold in `Store` to switch between small/large strategies.
• adjusting `Store::store_file` (and the corresponding `local::Store` method) to take a file path
  • In the case of an immutable / destructive capture of a large file, the file could be digested in place, and then moved (or copied, if on another filesystem) into the store.
• implementing `Store::materialize_file` by directly hardlinking where possible.

pantsbuild/pants
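
To make the proposal concrete, here is a rough sketch of the three moving parts: routing by size, destructive capture via move-or-copy, and materialization via hardlink. It is not the actual `Store` API. Every name in it (`LARGE_FILE_THRESHOLD`, `Backend`, `backend_for`, `large_file_path`, `store_large_file`, and the free-function `materialize_file`) is hypothetical, as are the threshold value and the two-character sharded directory layout. Digesting is assumed to have happened already, so the functions take a hex fingerprint (at least two ASCII characters) as input.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical threshold: the issue does not propose a specific value.
const LARGE_FILE_THRESHOLD: u64 = 512 * 1024;

/// Which backend a blob of a given size would be routed to.
#[derive(Debug, PartialEq)]
enum Backend {
    Lmdb,
    LargeFileDir,
}

fn backend_for(len: u64) -> Backend {
    if len >= LARGE_FILE_THRESHOLD {
        Backend::LargeFileDir
    } else {
        Backend::Lmdb
    }
}

/// Content-addressed location: `<root>/<first two hex chars>/<full hex>`.
/// This layout is an assumption, not the real immutable inputs layout.
fn large_file_path(root: &Path, fingerprint_hex: &str) -> PathBuf {
    root.join(&fingerprint_hex[..2]).join(fingerprint_hex)
}

/// Destructively capture `src` (already digested in place) into the store.
/// Tries a rename first, and falls back to copy-then-delete if the rename
/// fails, for example because `src` is on another filesystem.
fn store_large_file(root: &Path, fingerprint_hex: &str, src: &Path) -> io::Result<PathBuf> {
    let dest = large_file_path(root, fingerprint_hex);
    if dest.exists() {
        // Same content is already stored: just consume the source.
        fs::remove_file(src)?;
        return Ok(dest);
    }
    if let Some(parent) = dest.parent() {
        fs::create_dir_all(parent)?;
    }
    if fs::rename(src, &dest).is_err() {
        fs::copy(src, &dest)?;
        fs::remove_file(src)?;
    }
    Ok(dest)
}

/// Materialize a stored large file at `dest`, hardlinking where possible
/// and copying otherwise (e.g. when `dest` is on another filesystem).
fn materialize_file(root: &Path, fingerprint_hex: &str, dest: &Path) -> io::Result<()> {
    let src = large_file_path(root, fingerprint_hex);
    if fs::hard_link(&src, dest).is_err() {
        fs::copy(&src, dest)?;
    }
    Ok(())
}
```

A real implementation would also need to deal with things this sketch ignores, such as making the copy fallback atomic (for example by copying to a temporary name and renaming), protecting hardlinked files from mutation, and garbage collection of the large-file directory.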