Does anyone have any best practices for applying C...
# general
r
Does anyone have any best practices for applying CircleCI’s caching capabilities to the Pants $HOME/.cache/pants/ cache directories? 🧵
CircleCI supports…
- save_cache:
    key: v1-cache-key-a
    paths:
      - /some/path
      - /some/other/path

- save_cache:
    key: v1-cache-key-b-{{ checksum "/path/to/a/file" }}
    paths:
      - /some/other/path/to/cache
and
- restore_cache:
    keys:
      - v1-cache-key-a

- restore_cache:
    keys:
      - v1-cache-key-b- # in the absence of the checksum on the cache key, the most recently cached item matching the start of the key name will be restored
I guess the challenge is that the checksum template function works on a single file only. Is there a good single file in each of the three recommended cache directories (/pants/setup, /pants/named_caches, /pants/lmdb_store) to ‘anchor’ on?
Or would I be better off running a recursive checksum against each of these directories as a prior step to save_cache and passing that through?
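For what it's worth, here is a minimal sketch of that second option, assuming a small Python helper is acceptable in the CI image. It folds every file under a directory into one SHA-256 digest and writes it to a single file, which could then be handed to the checksum template function. The names digest_tree and write_digest and the output file location are hypothetical, not anything Pants or CircleCI provides.

```python
import hashlib
import sys
from pathlib import Path


def digest_tree(root: Path) -> str:
    """Return one SHA-256 hex digest covering every regular file under root.

    Hypothetical helper: relative paths and file contents both feed the
    digest, and files are visited in sorted order so the result is stable.
    """
    outer = hashlib.sha256()
    if not root.is_dir():
        # A missing directory hashes like an empty one.
        return outer.hexdigest()
    files = sorted(
        (p for p in root.rglob("*") if p.is_file() and not p.is_symlink()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        inner = hashlib.sha256()
        with path.open("rb") as fh:
            # Read in 1 MiB chunks so large cache files are not loaded whole.
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                inner.update(chunk)
        outer.update(path.relative_to(root).as_posix().encode("utf-8"))
        outer.update(b"\0")
        outer.update(inner.digest())
    return outer.hexdigest()


def write_digest(root: Path, out_file: Path) -> str:
    """Write the tree digest to out_file so a single file can anchor the key."""
    digest = digest_tree(root)
    out_file.write_text(digest + "\n")
    return digest


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders): python digest.py <directory> <output-file>
    print(write_digest(Path(sys.argv[1]), Path(sys.argv[2])))
```

One caveat with this approach: a digest of the cache directory itself is only known once the cache has been populated, so it cannot be computed before restore_cache on a fresh machine. The restore step would have to rely on the prefix-matching behaviour shown above, or the digest would have to be taken over inputs that exist in the checkout instead.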
p
r
Yep, familiar with that page. There's great information on it, but I'm looking for some specific guidance from anyone else who has used Pants in CircleCI.
e
Well... we did use Pants with Circle, and to be honest I don't remember the mechanism we used for the checksum. (Well, I do, but it dates back to the early Pants days: we basically did the recursive checksum of files/dependencies as you suggested. Pants does a better job than that; it's just hard to retrieve its information in a way that helps with Circle.) But we were trying to derive cache information based on the source tree more than from the Pants caches themselves; a recursive checksum might be useful there. The problem we had was that our requirements were so huge (too many scientific libraries; our containers are 3+ GB once you add torch and scikit-learn et al.) that the time to transfer the cache was more or less swamping the benefits of having the cache in the first place. Hopefully you'll have better luck!
To wit:
❯ du -h --max-depth=1 ~/.cache/pants/named_caches
11G	/home/vputz/.cache/pants/named_caches/pex_root
11G	/home/vputz/.cache/pants/named_caches
11G of cache just takes a while to transfer; when we were already transferring the pip cache, it didn't seem worth it. So what we have isn't great, but I only have so much time to devote to getting a build system set up. We may end up just setting up our own CI system so we can have gigantic resident caches to get around this.
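If you want to sanity-check the same trade-off for your own caches, a rough sketch like the one below might help. It sums file sizes under a directory and divides by a throughput you have measured yourself. The function names are hypothetical, the throughput is whatever you observe in your own jobs (not a CircleCI figure), and the estimate ignores archive and compression time. Note that it sums apparent file sizes, so it can differ a bit from what du reports.

```python
import os
from pathlib import Path


def tree_size_bytes(root: Path) -> int:
    """Sum the apparent sizes of all regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = Path(dirpath) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def transfer_seconds(size_bytes: int, mib_per_second: float) -> float:
    """Estimate transfer time for size_bytes at an assumed, self-measured throughput."""
    return size_bytes / (mib_per_second * 1024 * 1024)
```

Comparing transfer_seconds(tree_size_bytes(cache_dir), measured_rate) with the time a cold build spends repopulating that directory gives a first guess at whether saving and restoring it pays off.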