what do people think about the possibility of a `p...
# development
h
what do people think about the possibility of a
pants prune
goal?
👍🏽 1
h
what does it do?
h
or
pants clean
or some other goal that tries to manually determine what artifacts on disk pants has left around, and clean them?
h
I’d like it. Even though pantsd is supposed to do garbage collection, Asher has pointed out it doesn’t work very well. Pants takes a ton of storage if you don’t manually prune, and it’s very non-obvious imo to users that they can/should delete things
Can we prune
lmdb_store
intelligently, like only get rid of unused/stale things?
f
and as prior art, many other tools have a manual prune mode
👍 1
h
docker for one
h
I think I like the name
prune
more than
clean
f
yup,
docker system prune
and
git gc
of course
h
I don't know how hard it would be to manually figure out what we can safely delete in lmdb_store
h
Especially if we could have options to completely get rid of everything vs. to only remove stale things, etc
f
you’d need to define some set of “roots”
and then prune anything that is not reachable from those roots
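A minimal sketch of that roots-and-reachability idea, assuming a hypothetical in-memory store that maps each key to the keys it references. This is not how lmdb_store is actually laid out, and `find_reachable` and `prune` are made-up names for illustration:

```python
from collections import deque


def find_reachable(roots, references):
    """Return every key reachable from `roots` by following `references`."""
    seen = set()
    queue = deque(roots)
    while queue:
        key = queue.popleft()
        # Skip keys already visited and roots that are not in the store at all.
        if key in seen or key not in references:
            continue
        seen.add(key)
        queue.extend(references[key])
    return seen


def prune(store, roots):
    """Delete every entry of `store` not reachable from `roots`; return the removed keys."""
    live = find_reachable(roots, store)
    dead = sorted(set(store) - live)
    for key in dead:
        del store[key]
    return dead


# Hypothetical store: each key lists the keys it depends on.
store = {
    "a": ["b"],
    "b": [],
    "c": ["b"],  # not a root, and only referenced by "d"
    "d": ["c"],  # not a root, referenced by nothing
}
removed = prune(store, roots=["a"])
```

The hard part in practice is choosing the roots (for example, recently used build results); the sweep itself is simple once they are defined.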
h
part of the problem is that pants stores its on-disk artifacts in a couple of different places, which I'd like to unify in any case
h
I think the Pantsd garbage collection is supposed to already prune intelligently? But I have no idea how it works. Stu would know
h
off the top of my head,
.cache/pants
and
<build_root>/.pants.d
, not all of which is actually pantsd-related code anyway
I may be missing some things
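As a rough sketch of the "figure out what is on disk" step, something like the following could report how much space those two locations take. It assumes `.cache/pants` lives under the user's home directory, and `dir_size_bytes` and `report` are hypothetical helper names, not anything Pants provides:

```python
import os
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def report(build_root, home=None):
    """Map each candidate directory that exists to its size in bytes."""
    home = Path(home) if home else Path.home()
    candidates = [
        home / ".cache" / "pants",       # assumed location of the global cache
        Path(build_root) / ".pants.d",   # per-repo working directory
    ]
    return {str(p): dir_size_bytes(p) for p in candidates if p.is_dir()}


if __name__ == "__main__":
    for path, size in report(".").items():
        print(f"{size / 1e6:10.1f} MB  {path}")
```

This only reads the filesystem, so it is safe to run; it says nothing about which of those bytes are still needed.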
j
We call our shell script that does this
build-support/nuke
🍄 ☁️
Unfortunately it uses
pants
to find the proper directories to get rid of, so if you run it twice, it rebuilds
pants
chroot before deleting it. 🤣
h
curious what your shell script does if you can share it @jolly-midnight-72759
j
There are better ways to do this I'm sure, but we didn't want to parse
pants.toml
until we deployed v2.0.0. If this gets built into
pants
proper, that would be even better.
(oh sharing a code snippet in a thread is not a happy slack place)
f
on calling things “nuke”, the toolchain internal repo CI config has a shell function called
nuke_if_too_big
to prune Pants caches
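The actual shell function isn't shown in this thread, so the following is only a guess at the behavior the name suggests, written in Python: delete a cache directory outright once it exceeds a size limit. The signature and the threshold parameter are assumptions:

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def nuke_if_too_big(path, max_bytes):
    """Remove `path` entirely if it holds more than `max_bytes`; return True if removed."""
    if not os.path.isdir(path):
        return False
    if dir_size_bytes(path) <= max_bytes:
        return False
    shutil.rmtree(path)
    return True
```

In CI this kind of all-or-nothing check is cheap to run before restoring or saving a cache, at the cost of a cold build whenever the limit is hit.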
j
four letters, a hard sound, and a cromulent definition makes it apropos
h
clearly this is something that multiple people are running into and building their own tools to solve
h
We document
nuke_if_too_big
in our “Using Pants in CI” page too
h
the log stuff I'm about to add is going to make the files we store in
.pants.d/run-tracker
bigger, which is why I'm thinking about this
👍 1
p
I think asking the user to run that is bad UX... it should just happen automatically, based on some sane defaults. just like git gc.
it is fine to have a way to manually run it... but it should happen automatically by default.
👍 1
h
agreed. but it makes sense to build a manual feature before we try to make it automatic, and in the meantime it can solve problems users are actually encountering
🤖 1
👍🏽 1
p
I agree. but I don't think it is a priority right now. pants already writes a lot of stuff without cleaning it up, so a directory with a bunch of log files won't be anything new or the biggest space hog.
h
yeah, I don't think the log stuff I'm about to add will significantly increase the amount of on-disk space pants uses. but it will increase it a little bit
and in general it is already a problem that pants' disk usage can get huge, we've all run into this
👍 1
that is a problem worth solving or at least mitigating to make the post-2.0.0 user experience better
h
and in general it is already a problem that pants’ disk usage can get huge, we’ve all run into this
+1. And I don’t think it’s a good experience for a user to realize how hungry Pants is and have no mechanism to clean it up, short of knowing it’s safe to
rm -rf
. +1 that automatic is better, but manual is a fine first step.
p
I doubt those logs can get anywhere near the tens of gigs... unless debug logging is turned on by default (which it shouldn't be)
h
logs are not the problem, they just caused me to think about the actual problem, which is everything we store in
.cache/pants/
particularly lmdb and rust artifacts
👍 1