Hello :wave: I have a monorepo setup with `pants`...
# general
r
Hello 👋 I have a monorepo setup with `pants` on v2.16 (yes, I know it's deprecated), but we're encountering a problem where running `lint` for all targets writes >100GB to disk. That's crazy! It's not only annoying to clean up our local `pants` cache so frequently, but this recently hit the disk limit allocated for our CI tasks. What can I do to manage this? Thank you
b
Sorry for the trouble! Can you provide a bit more info:
• Is it 100GB of pants cache? Can you confirm which parts specifically are large? E.g. `du -h -d2 ~/.cache/pants` might highlight whether it's process caches (`lmdb_store`) or named/append-only caches for a specific tool (`named_caches/...`).
• Which linters are you using?
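In case `du` isn't available wherever this runs, here is a rough Python sketch of the same two-level breakdown. The function name `dir_sizes` is made up for illustration, and it sums apparent file sizes rather than allocated disk blocks, so the numbers won't match `du` exactly (hard-linked files are also counted once per link).

```python
import os
from pathlib import Path


def dir_sizes(root, depth=2):
    """Return {relative_dir: total_bytes} for directories up to `depth` levels below root.

    Hypothetical helper, roughly comparable to `du -d<depth>`, but based on
    apparent file sizes (st_size) instead of disk blocks.
    """
    root = Path(root)
    totals = {}
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        here = Path(dirpath)
        size = 0
        for name in filenames:
            try:
                size += os.lstat(here / name).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
        parts = here.relative_to(root).parts
        # Credit this directory's files to the root and to each ancestor
        # that is at most `depth` levels below the root.
        for i in range(min(len(parts), depth) + 1):
            key = str(Path(*parts[:i])) if i else "."
            totals[key] = totals.get(key, 0) + size
    return totals


if __name__ == "__main__":
    cache = Path.home() / ".cache" / "pants"  # path taken from the du command above
    for name, size in sorted(dir_sizes(cache).items(), key=lambda kv: kv[1]):
        print(f"{size / 1e9:8.2f} GB  {name}")
```

The largest entries print last, so the bottom of the output shows whether `lmdb_store` or something under `named_caches` dominates.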
r
I completely wiped my local pants cache, killed `pantsd`, and only ran `pants lint ::`, and this was written to the cache:
14G	.cache/pants/named_caches/pex_root
 14G	.cache/pants/named_caches
4.4M	.cache/pants/lmdb_store/cache
1.1G	.cache/pants/lmdb_store/files
5.3M	.cache/pants/lmdb_store/directories
1.2G	.cache/pants/lmdb_store
 15G	.cache/pants
15G is obviously much less than 100G (that number came from the same command running in a Docker container in CI), but I'll see if I can figure out how to replicate what happened in CI
b
Huh, 14G of `pex_root` cache is a lot. Do you use large libraries like PyTorch and/or TensorFlow?
r
Yes
c
In CI I regularly see `pex_root` sizes like `14G` or `30G`. And I don't have a great handle on the variability
r
When this runs in our CI, it writes ~100GB to disk
6.6G	dist/cache/pants/lmdb_store/files
2.3M	dist/cache/pants/lmdb_store/directories
2.3M	dist/cache/pants/lmdb_store/cache
6.6G	dist/cache/pants/lmdb_store
88G	dist/cache/pants/named_caches/pex_root
88G	dist/cache/pants/named_caches
94G	dist/cache/pants
94G	dist/cache
88G of `pex_root`! Is there a reason why more might be written to `pex_root` on some machines (a Linux Docker image running in CI) vs. others (developers' macOS arm64 machines)?
b
(very delayed, thanks for waiting) I think Linux compiled code/wheels are generally bigger, e.g. https://pypi.org/project/numpy/#files
• `numpy-2.1.2-cp312-cp312-macosx_14_0_x86_64.whl`: 6.6 MB
• `numpy-2.1.2-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl`: 16.0 MB
I think `pex_root` can end up with the .whl files and then various unzipped copies of them too, for all the different subsets of dependencies
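If you want to check how much of `pex_root` is repeated content, a minimal sketch along these lines might help. It assumes nothing about how `pex_root` is laid out internally; it just hashes every regular file and counts the bytes whose content has already been seen. `duplicate_bytes` is a hypothetical helper name, hard links are counted once, symlinks are skipped, and hashing tens of GB will be slow.

```python
import hashlib
import os
import stat


def duplicate_bytes(root):
    """Return (total_bytes, duplicate_bytes) for regular files under root.

    Hypothetical diagnostic: a file counts as a duplicate when another file
    with identical content (same SHA-256) was already seen during the walk.
    """
    seen_inodes = set()
    seen_digests = set()
    total = dup = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
                if not stat.S_ISREG(st.st_mode):
                    continue  # skip symlinks, sockets, etc.
                inode = (st.st_dev, st.st_ino)
                if inode in seen_inodes:
                    continue  # hard link to a file that was already counted
                digest = hashlib.sha256()
                with open(path, "rb") as f:
                    # Read in 1 MiB chunks so large wheels don't fill memory.
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
            except OSError:
                continue  # unreadable or vanished file; ignore it
            seen_inodes.add(inode)
            total += st.st_size
            key = digest.digest()
            if key in seen_digests:
                dup += st.st_size
            else:
                seen_digests.add(key)
    return total, dup
```

Calling `duplicate_bytes(os.path.expanduser("~/.cache/pants/named_caches/pex_root"))` returns the two byte counts; a large second number would be consistent with the "many unzipped copies" guess, though it wouldn't prove why the copies are there.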