h

hundreds-breakfast-49010

10/22/2020, 5:51 PM
what do people think about the possibility of a
pants prune
goal?
👍🏽 1
h

hundreds-father-404

10/22/2020, 5:51 PM
what does it do?
h

hundreds-breakfast-49010

10/22/2020, 5:51 PM
or
pants clean
or some other goal that tries to manually determine what artifacts on disk pants has left around, and clean them?
h

hundreds-father-404

10/22/2020, 5:52 PM
I’d like it. Even though pantsd is supposed to do garbage collection, Asher has pointed out it doesn’t work very well. Pants takes a ton of storage if you don’t manually prune, and it’s very non-obvious imo to users that they can/should delete things
Can we prune
lmdb_store
intelligently, like only get rid of unused/stale things?
f

fast-nail-55400

10/22/2020, 5:53 PM
and as prior art, many other tools have a manual prune mode
👍 1
h

hundreds-breakfast-49010

10/22/2020, 5:53 PM
docker for one
h

hundreds-father-404

10/22/2020, 5:53 PM
I think I like the name
prune
more than
clean
f

fast-nail-55400

10/22/2020, 5:53 PM
yup,
docker system prune
and
git gc
of course
h

hundreds-breakfast-49010

10/22/2020, 5:53 PM
I don't know how hard it would be to manually figure out what we can safely delete in lmdb_store
h

hundreds-father-404

10/22/2020, 5:53 PM
Especially if we could have options to completely get rid of everything vs. to only remove stale things, etc
f

fast-nail-55400

10/22/2020, 5:54 PM
you’d need to define some set of “roots”
and then prune anything that is not reachable from those roots
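
The roots-and-reachability idea above could be sketched roughly as follows. This is a minimal illustration only, not how Pants or `lmdb_store` works: the `store` dict, the `edges` mapping, and the function names `reachable` and `prune` are all hypothetical stand-ins for whatever the real on-disk store and its reference graph look like.

```python
from typing import Dict, Iterable, Set


def reachable(roots: Iterable[str], edges: Dict[str, Set[str]]) -> Set[str]:
    """Return every key reachable from the roots by following edges."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        # Keys with no outgoing references simply contribute nothing here.
        stack.extend(edges.get(key, ()))
    return seen


def prune(
    store: Dict[str, bytes],
    roots: Iterable[str],
    edges: Dict[str, Set[str]],
) -> Set[str]:
    """Delete every entry not reachable from the roots; return the deleted keys."""
    live = reachable(roots, edges)
    dead = set(store) - live
    for key in dead:
        del store[key]
    return dead
```

With a hypothetical store holding keys `a`, `b`, `c`, `d`, where `a` references `b` and `c` references `d`, calling `prune(store, ["a"], edges)` would keep `a` and `b` and drop `c` and `d`. The hard part in practice is choosing the roots, which this sketch takes as given.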
h

hundreds-breakfast-49010

10/22/2020, 5:54 PM
part of the problem is that pants stores its on-disk artifacts in a couple of different places, which I'd like to unify in any case
h

hundreds-father-404

10/22/2020, 5:54 PM
I think the Pantsd garbage collection is supposed to already prune intelligently? But I have no idea how it works. Stu would know
h

hundreds-breakfast-49010

10/22/2020, 5:54 PM
off the top of my head,
.cache/pants
and
<build_root>/.pants.d
, not all of which is actually pantsd-related code anyway
I may be missing some things
j

jolly-midnight-72759

10/22/2020, 6:05 PM
We call our shell script that does this
build-support/nuke
🍄 ☁️
Unfortunately it uses
pants
to find the proper directories to get rid of, so if you run it twice, it rebuilds the
pants
chroot before deleting it. 🤣
h

hundreds-breakfast-49010

10/22/2020, 6:06 PM
curious what your shell script does if you can share it @jolly-midnight-72759
j

jolly-midnight-72759

10/22/2020, 6:08 PM
There are better ways to do this I'm sure, but we didn't want to parse
pants.toml
until we deployed v2.0.0. If this gets built into
pants
proper, that would be even better.
(oh sharing a code snippet in a thread is not a happy slack place)
f

fast-nail-55400

10/22/2020, 6:10 PM
on calling things “nuke”, the toolchain internal repo CI config has a shell function called
nuke_if_too_big
to prune Pants caches
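
For readers who have not seen that function, the general "delete the cache if it exceeds a size limit" approach might look something like the sketch below. This is a Python approximation written for illustration, not the actual shell function from the toolchain repo; the helper `dir_size_bytes` and the byte-based limit parameter are assumptions.

```python
import shutil
from pathlib import Path


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files under path, skipping symlinks."""
    return sum(
        p.stat().st_size
        for p in path.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def nuke_if_too_big(path: Path, limit_bytes: int) -> bool:
    """Remove the whole directory if it is larger than limit_bytes.

    Returns True if the directory was deleted, False otherwise.
    """
    if not path.is_dir():
        return False
    if dir_size_bytes(path) <= limit_bytes:
        return False
    shutil.rmtree(path)
    return True
```

A CI job could call something like `nuke_if_too_big(Path.home() / ".cache" / "pants", 10 * 1024**3)` before restoring caches, assuming the cache lives under the home directory. Note that this is all-or-nothing: it throws away the entire cache rather than only stale entries.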
j

jolly-midnight-72759

10/22/2020, 6:12 PM
four letters, a hard sound, and a cromulent definition makes it apropos
h

hundreds-breakfast-49010

10/22/2020, 6:17 PM
clearly this is something that multiple people are running into and building their own tools to solve
h

hundreds-father-404

10/22/2020, 6:17 PM
We document
nuke_if_too_big
in our “Using Pants in CI” page too
h

hundreds-breakfast-49010

10/22/2020, 6:18 PM
the log stuff I'm about to add is going to make the files we store in
.pants.d/run-tracker
bigger, which is why I'm thinking about this
👍 1
p

polite-garden-50641

10/22/2020, 6:19 PM
I think asking the user to run that is bad UX... it should just happen automatically, based on some sane defaults. just like git gc.
it is fine to have a way to manually run it... but it should happen automatically by default.
👍 1
h

hundreds-breakfast-49010

10/22/2020, 6:21 PM
agreed. but it makes sense to build a manual feature before we try to make it automatic, and in the meantime it can solve problems users are actually encountering
🤖 1
👍🏽 1
p

polite-garden-50641

10/22/2020, 6:24 PM
I agree. but I don't think it is a priority right now. pants already writes a lot of stuff without cleaning it up, so a directory with a bunch of log files won't be anything new or the biggest space hog.
h

hundreds-breakfast-49010

10/22/2020, 6:25 PM
yeah, I don't think the log stuff I'm about to add will significantly increase the amount of on-disk space pants uses. but it will increase it a little bit
and in general it is already a problem that pants' disk usage can get huge, we've all run into this
👍 1
that is a problem worth solving or at least mitigating to make the post-2.0.0 user experience better
h

hundreds-father-404

10/22/2020, 6:28 PM
and in general it is already a problem that pants’ disk usage can get huge, we’ve all run into this
+1. And I don’t think it’s a good experience for a user to realize how hungry Pants is and have no mechanism to clean it up unless they happen to know it’s safe to
rm -rf
. +1 that automatic is better, but manual is a fine first step.
p

polite-garden-50641

10/22/2020, 6:29 PM
I doubt those logs can get anywhere near the tens of gigs... unless debug logging is turned on by default (which it shouldn't be)
h

hundreds-breakfast-49010

10/22/2020, 6:30 PM
logs are not the problem, they just caused me to think about the actual problem, which is everything we store in
.cache/pants/
particularly lmdb and rust artifacts
👍 1