Any example of GitHub Action that caches data (<ht...
# general
n
Any example of GitHub Action that caches data (https://www.pantsbuild.org/docs/using-pants-in-ci#directories-to-cache) between runs? The example repository (https://github.com/pantsbuild/example-python/blob/main/.github/workflows/pants.yaml) is not helpful here 🙂
...I'm mainly wondering what the key/restore-keys of the cache action should look like.
To speed up CI, I tried at least to test only changed files using
./pants --changed-since=origin/master lint
...but for some reason I'm getting "fatal: bad revision 'origin/master...HEAD'". Any idea what is going wrong here? See https://github.com/robofit/arcor2/pull/264.
Hmm, on localhost
./pants --changed-since=origin/master lint
works fine.
Ok, so the issue was that actions/checkout@v2 by default fetches only the last commit. It was necessary to add "fetch-depth: 0". So now it would be great to have caching as well! :-)
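For reference, the fix described above could look roughly like this in a workflow file. This is a minimal sketch: the workflow name, job name, trigger, and runner are assumptions made for illustration and are not taken from the linked repository.

```yaml
# Hypothetical workflow; only the checkout options and the pants command come from the thread.
name: Pants
on: [pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          # By default only the last commit is fetched; 0 fetches the full history,
          # so that origin/master exists for --changed-since to compare against.
          fetch-depth: 0
      - name: Lint changed files only
        run: ./pants --changed-since=origin/master lint
```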
h
Ohh that’s a good fix. I don’t have any experience with GH Actions personally. Would you be interested in submitting a pull request for the example repo to set fetch depth? That’d be really helpful for other users to know. We use the standard open source workflow: fork the repo, push to your fork, then open a pull request against upstream.
h
Yeah, --changed-since= really depends on what git state is available. So good find there.
Re caching between runs, I don't know much about GitHub Actions, but assuming it's similar in principle to other CI systems, then the directory whose contents you want to cache is ~/.cache/pants/, and particularly ~/.cache/pants/lmdb_store.
h
@narrow-activity-17405 I added your insight as a tooltip to https://www.pantsbuild.org/docs/using-pants-in-ci#approach-1-only-run-over-changes-files. Could you please spot check if that’s accurate? I think it would also be helpful for us to add some sample config files for the main CI providers to that page. Let us know if you were able to get caching working with GitHub Actions.
n
I think that the tooltip is ok. The thing with GitHub caching is that it can handle multiple caches: the suitable cache is selected based on a user-specified key. This can be the OS/Python version, a hash of some file (e.g. a pip lock file), whatever. As far as I understand how Pants works, I think it should be sufficient to set the key to the runner OS + maybe some static value... This way, the key will always match and there will be just one cache (per used OS) that will be used for PRs from any branch.
👍 1
👀 1
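The key scheme described above (runner OS plus a static value) might be sketched as follows. The workflow and job names, the "pants-v1" suffix, and the choice to cache the whole ~/.cache/pants/ directory are assumptions for illustration; only the directory itself and the idea of an OS-based key come from the thread.

```yaml
# Hypothetical workflow; names and the static key suffix are illustrative.
name: Pants
on: [pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - name: Cache Pants data
        uses: actions/cache@v2
        with:
          # Directory suggested in the thread; lmdb_store lives inside it.
          path: ~/.cache/pants/
          # One cache per runner OS, shared by PRs from any branch.
          # Change the static suffix to start over with a fresh cache.
          key: ${{ runner.os }}-pants-v1
      - name: Lint changed files only
        run: ./pants --changed-since=origin/master lint
```

One caveat with a fully static key: when the key matches exactly, actions/cache restores the cache but does not save it again at the end of the job, so the stored contents stay as they were on the first save. A common way around this is to add a changing component to key (for example ${{ github.sha }}) and set restore-keys to the shared prefix, such as ${{ runner.os }}-pants-, so the most recent earlier cache is restored and a new one is saved.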
It seems that the caching works, but sometimes I'm getting these warnings:
14:16:54.57 [WARN] /home/runner/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0b2_py36/lib/python3.6/site-packages/pex/third_party/__init__.py:382: ResourceWarning: unclosed file <_io.BufferedReader name='/home/runner/.cache/pants/setup/bootstrap-Linux-x86_64/2.0.0b2_py36/lib/python3.6/site-packages/pex/bootstrap.py'>
  shutil.copyfileobj(resource_stream(module, src_entry), fp)
...for instance here: https://github.com/robofit/arcor2/runs/1199223604?check_suite_focus=true. In this case, the cache was not found (maybe it was nuked before?).
h
Ah, that’s a Pex warning that was recently addressed. Sorry about the noise. I believe it’s fixed if you upgrade to 2.0.0b3, which we released over the weekend.
n
Ok, fine. I thought that it had something to do with caching, as I haven't seen this on my machine.