Hi all! Is it normal for packages to be duplicated...
# general
b
Hi all! Is it normal for packages to be duplicated inside `.cache/pants/named_caches/pex_root/installed_wheels/`? It seems I have 5 identical copies of torch, each of size 3.5G (`du -hs` tells me the size of the `installed_wheels` directory is 21G). When installing the same requirements in a virtual environment, the total size of the venv is 7G, with torch being 3.5G. This is after nuking the `.cache/pants/` directory and running `pants test ::`.
If I remove the 4 tests which transitively import torch, then nuke the cache and run `pants test ::` again, I only have one torch directory under `.cache/pants/named_caches/pex_root/installed_wheels/` (but other packages are duplicated). So my current hypothesis is that Pants creates a copy of torch (or any other library) for every test importing it transitively, but I'm guessing it's a problem with my configuration.
e
@boundless-ambulance-11161 one suggestion here is to eschew screenshots and paste text: screenshots are not friendly for support. So, you don't have multiple copies per se; you have 1 copy and then a bunch of `.xyz`-suffixed copies that were not cleaned up. The cleanup not happening is most likely due to kills. Did you suffer OOM-kills at any point?
Also, what version of Pants is this @boundless-ambulance-11161?
b
Thanks for the answer. We're using `pants_version = "2.15.0"`.
We did have OOM-kills in the CI pipeline. They happen during the `Cache@2` task, while it caches the Pants named cache. I don't think I've had an OOM-kill on my local build since I nuked the cache. When should those `.xyz` directories normally be cleaned up?
e
With no OOM-kill, they are cleaned up in-line, in the same `pex` invocation that creates them.
So the problem to fix here is the cause of the OOM-kills. With the OOM-kills gone, you won't have these leftovers. For now, you need to clean them away by hand. The `pex` tool has no `gc` or `clean-cache` command.
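For the by-hand cleanup, a minimal dry-run sketch in Python might help with spotting the leftovers first. It rests on assumptions not confirmed in this thread: that a leftover is a directory named like its finished sibling plus a `.`-separated suffix, and that such directories sit at most two levels below `installed_wheels`. The function name `find_suffixed_leftovers` and the `max_depth` parameter are made up for illustration. It only lists candidates and deletes nothing.

```python
import os


def find_suffixed_leftovers(root, max_depth=2):
    """List directories named '<name>.<suffix>' whose sibling '<name>' also exists.

    ASSUMPTION: leftovers follow this naming pattern and live within
    max_depth levels below root. Verify against your own cache first.
    """
    root = os.path.abspath(root)
    leftovers = []
    for parent, dirnames, _ in os.walk(root):
        rel = os.path.relpath(parent, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        siblings = set(dirnames)
        for name in sorted(dirnames):
            stem, dot, _suffix = name.rpartition(".")
            if dot and stem in siblings:
                leftovers.append(os.path.join(parent, name))
        if depth + 1 >= max_depth:
            # Do not descend into the installed wheel contents themselves.
            dirnames[:] = []
        else:
            # Do not descend into directories already flagged as leftovers.
            dirnames[:] = [
                d for d in dirnames if os.path.join(parent, d) not in leftovers
            ]
    return sorted(leftovers)


if __name__ == "__main__":
    cache = os.path.expanduser(
        "~/.cache/pants/named_caches/pex_root/installed_wheels"
    )
    for path in find_suffixed_leftovers(cache):
        print(path)
```

Review the printed paths before removing anything, and only remove them while no `pex` or Pants processes are running.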
b
Thanks, that's very helpful! Could those remaining `.xyz` directories be due to me modifying some code while `pants test ::` is running (which triggers the creation of the pex)? It seems to restart the process. (The OOM could explain the extra files in CI, but this could explain the extra files on my local build.)
e
I'm not sure. If Pants uses `kill -9` directly or indirectly when restarting, then yes, but this does pretty much require a `kill -9` from some source, OOM-Killer or otherwise.
The underlying mechanism is multiple `pex` processes racing to do the one-time install of the wheel in that un-suffixed chroot dir. Once one process wins, that version is never installed again, and all racing processes clean up their suffixed dirs before exiting. It's only if the racing processes are aborted that the cleanup goes undone.
Pants is obviously a heavy user of multiple concurrent `pex` processes.
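The race described above can be illustrated with a small Python sketch. This is not Pex's actual code, just a generic "build in a suffixed sibling directory, then atomically rename" pattern under assumed names (`install_once`, `populate`, and the random suffix are hypothetical). It relies on POSIX behaviour, where renaming a directory onto an existing non-empty directory fails.

```python
import os
import shutil
import uuid


def install_once(target_dir, populate):
    """Populate target_dir at most once, even when several processes race.

    Returns True if this call created target_dir, False otherwise.
    """
    if os.path.isdir(target_dir):
        return False  # the one-time install already happened

    # Each racer works in its own suffixed sibling directory.
    work_dir = f"{target_dir}.{uuid.uuid4().hex}"
    os.makedirs(work_dir)
    try:
        populate(work_dir)
        try:
            # The first racer to rename wins; later renames onto the
            # now non-empty target raise OSError on POSIX.
            os.rename(work_dir, target_dir)
            return True
        except OSError:
            return False  # lost the race
    finally:
        # A normal exit removes the suffixed directory (a no-op for the
        # winner). A SIGKILL never reaches this line, so the suffixed
        # directory is left behind.
        shutil.rmtree(work_dir, ignore_errors=True)
```

The `finally` block is the part that matters for this thread: any exit that Python can observe cleans up, but `kill -9` (from the OOM killer or otherwise) cannot be caught, so the work directory survives.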
b
If I can bother you with a somewhat related question (decreasing the size of the cache): my understanding is that the files in `downloads`, `pip_cache`, `installed_wheel_zips` and `installed_wheels` correspond to my 3rd party dependencies at different stages of the build process.
2.1G	/home/vsts/.cache/pants/named_caches/pex_root/installed_wheel_zips
2.2G	/home/vsts/.cache/pants/named_caches/pex_root/downloads
2.2G	/home/vsts/.cache/pants/named_caches/pex_root/pip_cache
7.9G	/home/vsts/.cache/pants/named_caches/pex_root/installed_wheels
Do we really need to keep them all in the cache? After all, if I have the installed wheel in `installed_wheels`, it doesn't seem I still need to keep the files in `downloads`? And if the installed wheel is invalidated because I need a new version of a library, that should also invalidate the corresponding directory in `downloads`. Can I safely remove some of those directories manually to decrease the size of my cache? Or should they normally be removed automatically, but were not because of the OOM error?
e
Those directories are not removed automatically. You would have to remove them manually, but that's not super-wise either unless you know there are no `pex` processes running while you issue the `rm`s. Taking 1 example though: `downloads` stores sdists, and those might be used to build wheels for new interpreters. So you know you use just 1 interpreter and the downloaded sdist can be chucked, but Pex doesn't know that. You might run it again in 5 minutes with `--python python3.11`, for example.
And so on. Obviously a cache trades space for speed, so there you go. You want tradeoff knobs that choose differently than the default.
b
Ok, that's a lot of food for thought. I think that was all the help I needed on this issue, thanks a lot for your help John!