# general
w
Hi team, a question on the pex/Pants cache. We would like to use caching in our Jenkins CI pipeline. Currently, we configure
PEX_ROOT
,
PANTS_LOCAL_STORE_DIR
and
PANTS_NAMED_CACHES_DIR
in our CI. With those, we can see the pex cache being reused across builds on a single branch, i.e., after the first build, subsequent builds on that branch reuse the pex cache. The question is: given that we configure
PEX_ROOT
,
PANTS_LOCAL_STORE_DIR
and
PANTS_NAMED_CACHES_DIR
to point to the same location for all builds, shouldn't the cache be shared across all builds and branches? For example, if one build on a branch resolves ‘python-default’ from build-support/lockfile.txt, can that resolve be shared?
h
Yes, the cache is shareable across different Git branches etc. You can convince yourself of that by running a test locally, then creating a new branch (w/ no changes) and re-running the same test.
By the way, I would not expect you to need to set
PEX_ROOT
. Pants sets that up for you based on
PANTS_NAMED_CACHES_DIR
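A minimal sketch of the setup described above, assuming a CI wrapper written in Python. The shared root `/var/cache/pants-ci` and the helper name `shared_cache_env` are hypothetical; the subdirectory names mirror the Pants defaults under `~/.cache/pants`, but any stable location outside the per-branch workspace should do.

```python
import os
from pathlib import Path


def shared_cache_env(shared_root: str) -> dict:
    """Build an environment that points Pants at one shared cache location.

    `shared_root` is a hypothetical directory that every build on the CI
    agent can read and write, regardless of which branch is checked out.
    """
    root = Path(shared_root)
    env = dict(os.environ)
    env["PANTS_LOCAL_STORE_DIR"] = str(root / "lmdb_store")
    env["PANTS_NAMED_CACHES_DIR"] = str(root / "named_caches")
    # Per the advice above, PEX_ROOT is left unset so that Pants can
    # derive it from PANTS_NAMED_CACHES_DIR.
    env.pop("PEX_ROOT", None)
    return env


env = shared_cache_env("/var/cache/pants-ci")
print(env["PANTS_LOCAL_STORE_DIR"])
print(env["PANTS_NAMED_CACHES_DIR"])
```

The resulting mapping could then be passed to something like `subprocess.run(["./pants", "check", "::"], env=env)` from each branch's workspace.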
w
Oh, thanks for that. By the way, in CI, should we use the Pants daemon or not?
h
Are you running
./pants
multiple times? If so, it is usually a good idea, since it improves performance. All the daemon does is keep the results from prior runs memoized for subsequent runs.
w
Yes. We do use ./pants multiple times so I guess I will keep the daemon.
Regarding "You can convince yourself of that by locally running a test, and then creating a new branch (w/ no changes) and re-run the same test":
I am wondering whether the location of the files matters. For example, locally, if I create another branch (w/o changes) in a different directory, will the cache be used? I ask because in CI, branches are checked out into different directories. @hundreds-father-404
h
What do you mean by different places? The only things that matter are the relative path from your build root (where
./pants
is located) and that the file content is the same.
w
For example, in Jenkins, each branch is checked out into its own directory: /tmp/jenkins_tmp/workspace/branch-1, /tmp/jenkins_tmp/workspace/branch-2, etc.
In this case, would the cache be used across branch-1 and branch-2?
h
Got it. That should be fine. @ancient-vegetable-10556 actually just gave a talk at PyCon about how this works! tl;dr is that we use relative paths and "snapshot" the files so we have a lightweight way to reference them, e.g. to insert the files into a sandbox (temporary directory).
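To illustrate the idea of keying on relative paths and file content, here is a rough sketch. It is not Pants's actual snapshot implementation; `snapshot_digest` and `make_checkout` are hypothetical helpers, and SHA-256 is an arbitrary choice. It only shows why two checkouts at different absolute paths can map to the same digest.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot_digest(build_root: Path) -> str:
    """Hash every file under build_root by its relative path and content."""
    files = [p for p in build_root.rglob("*") if p.is_file()]
    hasher = hashlib.sha256()
    # Sort by relative path so the digest does not depend on listing order.
    for path in sorted(files, key=lambda p: p.relative_to(build_root).as_posix()):
        relative = path.relative_to(build_root).as_posix()
        hasher.update(relative.encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()


def make_checkout(parent: Path, name: str) -> Path:
    """Create a tiny stand-in for a branch checkout (hypothetical files)."""
    root = parent / name
    (root / "src").mkdir(parents=True)
    (root / "pants.toml").write_text("[GLOBAL]\n")
    (root / "src" / "app.py").write_text("print('hello')\n")
    return root


with tempfile.TemporaryDirectory() as tmp:
    branch_1 = make_checkout(Path(tmp), "branch-1")
    branch_2 = make_checkout(Path(tmp), "branch-2")
    # Same relative paths and same content, different absolute locations.
    same_digest = snapshot_digest(branch_1) == snapshot_digest(branch_2)

    # Changing file content in one checkout changes its digest.
    (branch_2 / "src" / "app.py").write_text("print('changed')\n")
    changed_differs = snapshot_digest(branch_1) != snapshot_digest(branch_2)

print(same_digest, changed_differs)
```

Because the absolute prefix never enters the hash, `branch-1` and `branch-2` are indistinguishable until their contents diverge.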
w
I just did the following experiment locally.
Background: a git commit with all configs/files (e.g. pants.toml, BUILD files) required by Pants.
Experiment 1:
1. Check out the commit to /tmp.
2. Download the pants binary into /tmp.
3. Run ./pants check :: (it builds all the pex caches, such as mypy.pex, and resolves the ‘python-default’ dependencies).
Experiment 2:
1. Check out the same commit to /tmp1.
2. Download the pants binary into /tmp1.
3. Run ./pants check :: again. It still builds all the pex caches, such as mypy.pex, and resolves the ‘python-default’ dependencies again.
Question: in this case, can experiment 2 use the cache from experiment 1? Or am I doing something wrong?
h
Hmm, it should be reusing the cache. It is global so that it can be shared.
To double check: when you re-run the same command in tmp1, is the cache used? How about re-running in tmp2?
w
Yes. If I re-run the same command in tmp or tmp1, the cache is reused.
So I cannot get the cache shared between different locations; somehow each location is using a ‘local’ cache.
I am trying to use -ldebug to see which cache is used in each place, but the log only mentions that the cache is hit; it does not say where the cache is.
h
are you setting
PANTS_LOCAL_STORE_DIR
and
PANTS_NAMED_CACHES_DIR
or using the default?
w
Using the default. I don't set them locally.
In the meantime, I am wondering what is used to construct the key into the global cache?
@hundreds-father-404 I think I have figured out the issue.
Locally: 1. It seems to take some time (I am not sure how long) for the cache to be picked up.
In CI: perhaps 1 as well, but the main issue is that different branches use Python interpreters located at different paths (even though the interpreters are the same). In that case, Pants rebuilds because it sees a different Python interpreter.
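One way to check the interpreter theory above is to print what each build actually sees and compare the output between branches. This is a small diagnostic sketch using only the standard library; `interpreter_fingerprint` is a hypothetical helper and makes no claim about how Pants itself identifies interpreters.

```python
import os
import platform
import sys


def interpreter_fingerprint() -> dict:
    """Collect details that distinguish one interpreter install from another."""
    return {
        # Path the interpreter was launched from (may be a symlink or a venv).
        "executable": sys.executable,
        # The same path with symlinks resolved.
        "resolved": os.path.realpath(sys.executable),
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
    }


fingerprint = interpreter_fingerprint()
for key, value in fingerprint.items():
    print(f"{key}: {value}")
```

Running this with the interpreter each CI job uses would show whether two branches report the same version but different `executable` or `resolved` paths.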