Hello team :wave: I have a `python_distribution` ...
# general
r
Hello team 👋 I have a `python_distribution` that includes a large number of `resource` targets - about 31,000 YAML files. This package takes a staggering 80 minutes to run `package`. If I remove the `resource` targets, then `package` runs instantly. 31,000 YAML files is a lot of files, but since these are all `resource` targets, `pants` should not actually need to read any of them, it should only need to list them (`pants list` on these targets runs instantly), add them to the `.whl` and list them in the manifest. What is it doing that takes so long?
w
Do they end up in the sandbox? My guess off the cuff would have been materializing them into the sandbox
r
The 31,000 YAML files are included in the final `.whl` so I have no doubt that they do
Confirmed yes, the 31,000 YAML files are copied to the sandbox, but it should not take very long to copy these files. There's a lot of files, but each one is very small. The final `.whl` is 31MB
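One rough way to sanity-check the "copying should be fast" claim is to time a plain filesystem copy of a similar number of small files. This is only a sketch: the function name `time_small_file_copy` and the ~1 KB file size are assumptions (the size is guessed from 31MB spread over 31,000 files, ignoring wheel compression), and it measures `shutil.copytree`, not how Pants materializes a sandbox.

```python
import shutil
import tempfile
import time
from pathlib import Path


def time_small_file_copy(num_files: int = 31_000, size_bytes: int = 1_000) -> float:
    """Return the seconds taken to copy `num_files` small files between temp dirs.

    Hypothetical helper: this times an ordinary directory copy only, so it is a
    baseline for comparison, not a measurement of Pants itself.
    """
    # Filler content made of YAML comment lines (6 bytes each).
    payload = b"# pad\n" * max(1, size_bytes // 6)
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        src_dir = Path(src)
        for i in range(num_files):
            (src_dir / f"file_{i}.yaml").write_bytes(payload)

        # Time only the copy, not the creation of the source files.
        start = time.perf_counter()
        shutil.copytree(src_dir, Path(dst) / "sandbox")
        return time.perf_counter() - start
```

If `time_small_file_copy()` comes back in seconds rather than minutes, that would support the idea that raw file copying alone does not explain an 80 minute run.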
Is it possible to use `pants` to output the `setup.py` file without building the `.whl`?
It seems like resolving what files need to go into the `.whl` is what takes so long. Once the sandbox has been created, running `./__run.sh` takes 7s
h
Can you share a redacted version of this repo so we can benchmark on it directly?
Or, it should be easy to generate a fake repo that reproduces the problem
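A minimal sketch of what such a fake repo generator could look like, assuming a layout that is not from the original repo: the function name `generate_fake_repo`, the `data/group_NNN/` directories, the distribution name, and the `BUILD` contents are all hypothetical. The `BUILD` file assumes a Pants 2.x release with `resources`, `python_distribution`, and `python_artifact`; check the target and field names against the version in use. A `pants.toml` is deliberately left out because the Pants version is not stated in the thread.

```python
from pathlib import Path

# Assumed BUILD contents (hypothetical names). Older Pants releases used
# `setup_py(...)` where this uses `python_artifact(...)`.
BUILD_TEMPLATE = '''\
resources(name="yaml", sources=["data/**/*.yaml"])

python_distribution(
    name="dist",
    dependencies=[":yaml"],
    provides=python_artifact(name="fake-yaml-dist", version="0.0.1"),
)
'''


def generate_fake_repo(root: Path, num_files: int = 31_000, files_per_dir: int = 1_000) -> int:
    """Write `num_files` tiny YAML files plus a BUILD file under `root`.

    Returns the number of YAML files written.
    """
    data_dir = root / "data"
    for i in range(num_files):
        subdir = data_dir / f"group_{i // files_per_dir:03d}"
        if i % files_per_dir == 0:
            # Create each group directory once, when its first file is written.
            subdir.mkdir(parents=True, exist_ok=True)
        (subdir / f"item_{i}.yaml").write_text(f"id: {i}\nname: item_{i}\n")

    (root / "BUILD").write_text(BUILD_TEMPLATE)
    return num_files
```

Calling `generate_fake_repo(Path("fake_repo"))`, adding a `pants.toml` for the Pants version under test, and then timing `package` on the distribution target should show whether the slowdown reproduces outside the original repo.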