# general
a
Hey! Question re caches and EC2 AMIs. We've created an AMI with a "warmed up" Pants project, specifically for one target, i.e. we ran
pants run my_target
, and then created the AMI with the relevant caches. However, when we create a new machine from that AMI and run the same
pants run my_target
it still spends a lot of time building the requirements. Does anyone know how to improve this? (We have huge requirements, and this stage can take up to 20 minutes.)
g
What kind of target is affected? Is it running exactly the same code, or has the code changed? It might be worth running with
-ldebug
and seeing the exact Process it runs and what that Process captures.
a
It is running the exact same target. In our use case, the code might change but the requirements do not. However, it currently rebuilds the requirements even if we run the exact same target without code changes. Re
-ldebug
, we know that what takes time is creating the PEX with multiple (48 in our case) pip requirements. Is there anything else we should look for in the logs?
g
By kind of target, I meant whether it's a pex_binary, a python_source, or something else. They build differently and are cached differently.
python_source
will build dependencies separately from the local code, whereas the
pex_binary
does it in a single step, so any code change invalidates the whole build. When you run with
-ldebug
it'll say which process it spawns, along with the arguments and digests. If those change, the cache is invalidated as well. So if those are constant but the cache isn't used, it's a bug, and if they change, it's an expectation mismatch.
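To make the "if those change, it invalidates the cache" point concrete, here is a minimal sketch of a cache key derived from a process description. It is not Pants's actual implementation: `process_cache_key`, the `build-requirements` command, and the digest strings are all hypothetical and only illustrate the general idea that any change to the arguments or input digests yields a different key.

```python
import hashlib
import json


def process_cache_key(argv, env, input_digest):
    """Hash a process description into a single hex fingerprint.

    Hypothetical helper for illustration only; not a Pants API.
    """
    payload = json.dumps(
        {"argv": list(argv), "env": dict(env), "input_digest": input_digest},
        sort_keys=True,  # make the key independent of dict ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical command line and digests, standing in for what the debug log shows.
argv = ["build-requirements", "--lockfile", "lock.json"]
env = {"LANG": "C"}

baseline = process_cache_key(argv, env, "digest-of-lockfile-v1")
rerun = process_cache_key(argv, env, "digest-of-lockfile-v1")
changed_args = process_cache_key(argv + ["--verbose"], env, "digest-of-lockfile-v1")
changed_inputs = process_cache_key(argv, env, "digest-of-lockfile-v2")

print("identical rerun hits the cache:", baseline == rerun)
print("changed argument misses:", baseline != changed_args)
print("changed input digest misses:", baseline != changed_inputs)
```

Comparing the process description logged on the machine that built the AMI with the one logged on the new machine follows the same logic: identical descriptions should produce a cache hit, and any difference explains a rebuild.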
a
Understood! It's
python_source
. Re cache invalidation: I have not noticed any changes yet. Are there env vars that would also invalidate the cache, e.g. PATH?
g
Hmm, that depends on the goal and target. I think by default it scrubs most of the environment, but I'm not sure everything is scrubbed. Some Pants settings will definitely invalidate the cache, and those can also be set via env vars...
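As a rough illustration of why environment scrubbing matters for cache hits, the sketch below fingerprints only an allowlisted subset of the environment. The allowlist, `scrub_env`, and `env_fingerprint` are hypothetical names, and which variables Pants really passes through is not something this snippet claims to know. It only shows that a variable such as PATH affects the key if, and only if, it survives the scrub.

```python
import hashlib
import json

# Hypothetical allowlist; which variables a real tool keeps is an assumption here.
ALLOWED_ENV_VARS = frozenset({"LANG"})


def scrub_env(env, allowed=ALLOWED_ENV_VARS):
    """Keep only the allowlisted variables."""
    return {name: value for name, value in env.items() if name in allowed}


def env_fingerprint(env, allowed=ALLOWED_ENV_VARS):
    """Hash the scrubbed environment into a hex fingerprint."""
    scrubbed = scrub_env(env, allowed)
    return hashlib.sha256(
        json.dumps(scrubbed, sort_keys=True).encode("utf-8")
    ).hexdigest()


# Hypothetical environments: the machine that built the AMI vs. a fresh one.
ami_env = {"PATH": "/usr/bin", "LANG": "C"}
new_machine_env = {"PATH": "/opt/bin:/usr/bin", "LANG": "C"}

# With PATH scrubbed, both machines produce the same fingerprint.
same_when_scrubbed = env_fingerprint(ami_env) == env_fingerprint(new_machine_env)

# If PATH leaks through, the fingerprints differ and the cache would miss.
leaky = frozenset({"PATH", "LANG"})
same_when_leaked = env_fingerprint(ami_env, leaky) == env_fingerprint(new_machine_env, leaky)

print("scrubbed PATH, same key:", same_when_scrubbed)
print("leaked PATH, same key:", same_when_leaked)
```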