b

broad-processor-92400

02/15/2023, 9:36 PM
My local
lmdb_store
directory is currently at 46GiB, which is a fairly hefty proportion of my full disk. AIUI, pantsd is meant to be doing some GC on it, but, if so, there's a lot of non-garbage. Can I introspect the GC process? Is there a way to force a (harder) GC?
c

curved-television-6568

02/15/2023, 9:38 PM
rm -rf ~/.cache/pants
?
w

witty-crayon-22786

02/15/2023, 9:38 PM
.pants.d/pants.log
should record a GC every 4 hours, iirc
b

broad-processor-92400

02/15/2023, 9:39 PM
Haha, maybe a slightly less forceful GC than
rm
😅 I'll have a look at the log, thanks
😁 1
could/should be a
tokio
task at this point
b

broad-processor-92400

02/15/2023, 9:41 PM
I see various lines like
08:22:09.39 [INFO] Garbage collecting store. target_size=28,800,000,000
08:22:11.56 [INFO] Done garbage collecting store
I'll increase the log level and see what else comes out of it
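For anyone who wants to pull these entries out of the log programmatically, here is a minimal Python sketch. It assumes the line format is exactly what is quoted above; the name `gc_target_sizes` is made up for illustration and is not part of Pants.

```python
import re

# Matches lines like:
#   08:22:09.39 [INFO] Garbage collecting store. target_size=28,800,000,000
# The format is assumed from the two log lines quoted above.
GC_START = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) \[INFO\] "
    r"Garbage collecting store\. target_size=(?P<size>[\d,]+)"
)


def gc_target_sizes(log_text: str) -> list[tuple[str, int]]:
    """Return (timestamp, target size in bytes) for each GC start line."""
    results = []
    for line in log_text.splitlines():
        match = GC_START.match(line.strip())
        if match:
            # Strip the thousands separators before converting to int.
            size = int(match["size"].replace(",", ""))
            results.append((match["time"], size))
    return results
```

Feeding it the contents of `.pants.d/pants.log` should give one entry per GC run, which makes it easy to check how often the GC actually fires.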
w

witty-crayon-22786

02/15/2023, 9:41 PM
that’s the relevant one
b

broad-processor-92400

02/15/2023, 9:42 PM
Ooh, looks like I could potentially also use the separate fs_util tool to force one (and with
ShrinkBehavior::Compact
which sounds more aggressive than
ShrinkBehavior::Fast
🤔 )
w

witty-crayon-22786

02/15/2023, 9:42 PM
yeeep
Compact is not natively supported though… it requires closing/recreating the database
would definitely welcome a patch that figured out how/where to do that safely
b

broad-processor-92400

02/15/2023, 9:44 PM
ah, yeah, I see it copying to separate files
anyways, thanks for the tips, should be enough for me to work out if our repo is just that huge, or if there's something else going on
w

witty-crayon-22786

02/15/2023, 9:45 PM
LMDB has served its purpose fairly well, but it’s not my favorite. not being async compatible, not having a compacting GC, occasional corruption, needing to shard, etc
b

bitter-ability-32190

02/15/2023, 9:46 PM
Don't worry, @broad-processor-92400 soon your large files will just exist on disk 😤
🤞 2
w

witty-crayon-22786

02/15/2023, 9:46 PM
true. that will extend the lifetime of LMDB a bit =)
yea, assuming that the reason you haven’t seen things GC’d is in fact the Compact vs Fast distinction
b

bitter-ability-32190

02/15/2023, 9:47 PM
Although the "large" files won't be GC'd I think. So maybe worse for you 😅
w

witty-crayon-22786

02/15/2023, 9:48 PM
they’ll need to be … haven’t added that feedback to the PR yet =x
b

bitter-ability-32190

02/15/2023, 9:51 PM
Ruh roh
f

fast-nail-55400

02/15/2023, 10:01 PM
just run
tmpwatch
on the large file directory assuming atime is updated
w

witty-crayon-22786

02/15/2023, 10:01 PM
yea… should be possible to do it in the exact same loop as the LMDB store… just create timestamps from atimes
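A rough Python sketch of what an atime-driven pass over a directory of large files could look like. This is not how Pants implements it: the function name `plan_atime_gc` and the flat directory layout are assumptions, and it only plans deletions without checking whether a file is still in use.

```python
from pathlib import Path


def plan_atime_gc(directory: str, target_size_bytes: int) -> list[Path]:
    """Pick least recently accessed files to delete until the directory's
    total size is at or below target_size_bytes. Nothing is deleted here."""
    entries = []
    for path in Path(directory).iterdir():
        # Only regular files; symlinks are skipped rather than followed.
        if path.is_file() and not path.is_symlink():
            st = path.stat()
            entries.append((st.st_atime, st.st_size, path))

    total = sum(size for _, size, _ in entries)
    to_delete = []
    # Oldest access time first.
    for _, size, path in sorted(entries, key=lambda e: e[0]):
        if total <= target_size_bytes:
            break
        to_delete.append(path)
        total -= size
    return to_delete
```

Returning a plan rather than deleting directly leaves room for a caller to filter out anything that must be kept before calling `unlink()` on the rest.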
b

bitter-ability-32190

02/15/2023, 10:01 PM
I'm more worried about collecting while the file is still symlinked 😳
w

witty-crayon-22786

02/15/2023, 10:02 PM
GC handles that
by not collecting things that are reachable from memory
b

bitter-ability-32190

02/15/2023, 10:02 PM
Yaaaay... I think
b

broad-processor-92400

02/15/2023, 11:40 PM
Getting very distant from my original question but... Are atimes reliable enough for this purpose? I was under the impression that some file systems don't support them, and/or can be configured to be pretty relaxed about updating them
w

witty-crayon-22786

02/15/2023, 11:40 PM
if not, can “touch” to bump the mtime instead
👍 1
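The "touch" idea could look something like this in Python. It is a sketch only: `mark_used` is a hypothetical helper name, and it relies on `os.utime` setting both timestamps to the current time when no explicit times are passed.

```python
import os


def mark_used(path: str) -> None:
    """Set the atime and mtime of an existing file to now, like `touch`."""
    # With no `times` argument, os.utime uses the current time for both.
    os.utime(path)
```

A GC that sorts by mtime would then see a recently marked file as fresh even on a file system mounted with relaxed atime updates.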
b

broad-processor-92400

02/16/2023, 3:24 AM
Even with
-ltrace
, it seems there's no additional logging beyond the
INFO
above, ah well. The fact that https://github.com/pantsbuild/pants/blob/3304f13aecd534f5581b35104ad77bea41809b5d/src/rust/engine/fs/store/src/lib.rs#L1099-L1106 didn't trigger (i.e. the GC successfully reduced the reported size below 28.8GB) would suggest that indeed this might be a fragmentation issue, since 46GB is a little larger than 28.8GB. I'll try a
ShrinkBehavior::Compact
GC to confirm.
yeah, the same GC via
fs_util gc --target-size-bytes 28800000000
resulted in the directory being the expected size
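To compare the directory size against the GC target before and after a run, a small Python sketch like the following could be used. The helper name `directory_size_bytes` is made up, and the store path in the usage note is an assumption based on the `~/.cache/pants` location mentioned earlier in the thread.

```python
import os


def directory_size_bytes(root: str) -> int:
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total
```

For example, `directory_size_bytes(os.path.expanduser("~/.cache/pants/lmdb_store")) / 1e9` gives the size in GB to set against the 28.8 GB target. Note that this counts apparent file sizes, so it may differ from what `du` reports if any of the files are sparse.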