quaint-telephone-89068
11/06/2022, 4:20 AM
find and xargs, so this doesn't include sandbox setup time. Also, the batches ran sequentially, so the wall time of the current one-file-per-process strategy in Pants is lower than those 99 seconds, thanks to parallel execution. But it's still much slower and more CPU-expensive than it needs to be.
E.g., we know we have users with larger repos for whom full-repo dep inference (e.g., in a call to ./pants peek) takes several minutes.
So it seems reasonably clear that batching the dependency parsing is a big perf win.
(I'm referring to Python here, but it seems likely that similar benefits would obtain for the JVM at least.)
This discussion is to, er, discuss some options for doing so.
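To make the idea concrete, here is a rough sketch of what batched import parsing could look like: one process receives many file paths and parses them all, so interpreter startup is paid once per batch rather than once per file. This is not Pants' actual parser script; the names imports_in_source and parse_batch are made up for illustration, and the sketch only handles absolute imports.

```python
from __future__ import annotations

import ast
import json
import sys
from pathlib import Path


def imports_in_source(source: str) -> set[str]:
    """Return the module names imported by one Python source string."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from pkg import name`; relative imports are skipped
            # in this sketch.
            found.update(f"{node.module}.{alias.name}" for alias in node.names)
    return found


def parse_batch(paths: list[str]) -> dict[str, list[str]]:
    """Parse every file in `paths` within a single process."""
    return {p: sorted(imports_in_source(Path(p).read_text())) for p in paths}


if __name__ == "__main__":
    # Hypothetical CLI: all file paths arrive as arguments to one process.
    json.dump(parse_batch(sys.argv[1:]), sys.stdout, indent=2)
```

Saved under a hypothetical name like parse_batch.py, this could be driven with something like find . -name '*.py' -print0 | xargs -0 python parse_batch.py, where xargs decides how many files go into each invocation.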
pantsbuild/pants