witty-agent-59418
08/07/2024, 4:45 PM
actions/init-pants), as the total size is 10GB+
A lot of that is down to some heavy dependencies on our part (pytorch, opencv, etc), but inspecting the directories shows a lot of the size comes from pip and downloads, both of which seem to be caches themselves.
The pip directory is full of examples like 592M /home/runner/.cache/pants/named_caches/pex_root/pip/24.0/pip_cache/http-v2/5/f/8/1/6, while downloads is similarly opaque, with a lot of /<SHA> directories, also containing package caches.
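For anyone wanting to reproduce this kind of breakdown, here is a minimal Python sketch that totals the size of each immediate subdirectory of a cache root. The `pex_root` location is taken from the path quoted above. The function names are made up for illustration and are not part of Pants or Pex.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum file sizes under root; symlinks are skipped, not followed."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def summarize(root: Path) -> dict[str, int]:
    """Map each immediate subdirectory of root to its total size in bytes."""
    return {
        child.name: dir_size_bytes(child)
        for child in sorted(root.iterdir())
        if child.is_dir() and not child.is_symlink()
    }


if __name__ == "__main__":
    # Path as seen on the GitHub runner in this thread; adjust for other setups.
    pex_root = Path.home() / ".cache/pants/named_caches/pex_root"
    if pex_root.is_dir():
        for name, size in sorted(summarize(pex_root).items(), key=lambda kv: -kv[1]):
            print(f"{size / 1e6:10.1f} MB  {name}")
```

Hard-linked files are counted once per link here, so if any of the sub-caches share files via hard links, the real on-disk total may be lower than the sum this prints.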
My understanding of the named cache is that if the named-caches-hash changes (e.g. we alter a lockfile in our repo), then the old cache becomes invalid, and the GH cache action will prefix-match the stale cache and presumably reuse part of it to build the new cache.
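The prefix-matching behaviour described here can be modelled roughly as below. This is a simplified sketch of how GitHub's cache action is documented to choose an entry (exact key first, then prefix matches, newest wins); it ignores branch scoping and cache versions. The `CacheEntry` type, `pick_cache` function and key strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CacheEntry:
    key: str
    created_at: int  # larger means more recently created


def pick_cache(
    entries: Sequence[CacheEntry],
    key: str,
    restore_keys: Sequence[str] = (),
) -> Optional[CacheEntry]:
    """Return the entry a restore would use, or None on a full miss."""
    # 1. An exact match on the primary key always wins.
    for entry in entries:
        if entry.key == key:
            return entry
    # 2. Otherwise try the key itself, then each restore key, as a prefix.
    for prefix in (key, *restore_keys):
        matches = [e for e in entries if e.key.startswith(prefix)]
        if matches:
            # Among several prefix matches, the newest entry is chosen.
            return max(matches, key=lambda e: e.created_at)
    return None
```

If this model is right, a prefix hit restores the stale cache in full, and whatever the run adds is saved on top under the new key, which would be one way the cache keeps growing across lockfile changes.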
But the cache seems to include 3 versions of dependencies, in installed_wheels, pip, and downloads. Suspiciously, the sizes of the last two are very similar in our cache. Is there duplication here, and can some of these sub-caches be removed to reduce the overall cache size?
wide-midnight-78598
08/07/2024, 6:35 PM
prune or clean option for caches - a while back I was trying to figure out, practically speaking, what is prunable from there. In general, pex is probably one of the more resilient tools to cache management.
I had a hypothesis (as you've suggested here) that we could probably wipe out a whole swathe of files and still be okay. What I don't recall is how the cache backtracking occurs in pex.
e.g. Does it assume "download" (oh, it's there) -> "install" (oh it's there) -> use
Or is it backtracking on the cache: use (oops, not there) -> "install" (oops, not there) -> "download"
wide-midnight-78598
08/07/2024, 6:37 PM
witty-agent-59418
08/07/2024, 9:09 PM
downloads and pip at the end of the CI run, which dropped it to ~5.6GB - low enough to persist to the GHA cache 🎉
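A pruning step along these lines could be sketched in Python as follows. It assumes both directories sit directly under `pex_root`. The pip path quoted earlier in the thread suggests that for pip, but for downloads it is only an assumption. `prune_pex_root` is a hypothetical helper, not a Pants or Pex feature.

```python
import shutil
from pathlib import Path

# Sub-caches to drop before the cache is saved (names from this thread).
PRUNE_SUBDIRS = ("downloads", "pip")


def prune_pex_root(pex_root: Path, subdirs=PRUNE_SUBDIRS) -> list[Path]:
    """Delete the named subdirectories of pex_root; return the ones removed."""
    removed = []
    for name in subdirs:
        target = pex_root / name
        # Only remove real directories; leave symlinks and missing paths alone.
        if target.is_dir() and not target.is_symlink():
            shutil.rmtree(target)
            removed.append(target)
    return removed


# Example call for a final CI step (location assumed from the thread):
#   prune_pex_root(Path.home() / ".cache/pants/named_caches/pex_root")
```

Everything else under the root, including installed_wheels, is left untouched, so it is worth comparing timings on the next run to see whether anything gets rebuilt.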
Nothing obvious has broken, but this isn't exercising all the paths.
witty-agent-59418
08/07/2024, 9:11 PM
witty-agent-59418
08/07/2024, 9:38 PM
wide-midnight-78598
08/07/2024, 9:46 PM
Looks like this is causing a rebuild of some pexes during each run, so something is relying on those paths
If you have example goals or code or repos that would cause it, that would be useful - because a rebuild might be a bug 🤷
witty-agent-59418
08/08/2024, 3:21 PM