numerous-pharmacist-91083
05/31/2024, 12:54 AM
pants generate-lockfiles and pants check are very slow if any dependencies have changed. For generate-lockfiles we get lots of messages like:
17:48:30.91 [INFO] Long running tasks:
60.07s Generate lockfile for python-default
And those runs can take upwards of 10 minutes. For check, the slowness is in building a set of dependencies for a mypy run for each target to be type checked.
I think the slowness is largely because we use PyTorch (things slowed way down when we added our first PyTorch dependency). I think I read that properly resolving versions sometimes requires downloading whole wheels, which in our case means downloading multiple giant PyTorch wheels.
Regardless of the cause, I'm wondering if there's a way to speed these things up. In particular, I know you can use different resolves. If we put all our expensive-to-resolve things like PyTorch into one resolve and everything else into another, I think we can use dependencies from each resolve in the same code (right?). I'm hoping that modifying dependencies for the non-PyTorch resolve would then be fast again, since most of our dependency changes aren't PyTorch.
Questions:
1. Would the multiple-resolve thing work?
2. How do you set it up so a single library or executable can use dependencies from both resolves easily?
3. Is there a better way to speed this kind of thing up?
wide-midnight-78598
05/31/2024, 1:04 AM
numerous-pharmacist-91083
05/31/2024, 1:18 AM
numerous-pharmacist-91083
05/31/2024, 1:18 AM
wide-midnight-78598
05/31/2024, 1:27 AM
wide-midnight-78598
05/31/2024, 1:27 AM
numerous-pharmacist-91083
05/31/2024, 1:28 AM
"specifically some parts involving pulling down PyTorch"
So in 2.20 in order to find the right version of PyTorch it pulls multiple wheels but in 2.21 it knows how to do the dependency resolution without doing that?? That sounds great!
numerous-pharmacist-91083
05/31/2024, 1:29 AM
wide-midnight-78598
05/31/2024, 1:31 AM
broad-processor-92400
05/31/2024, 4:47 AM
"So in 2.20 in order to find the right version of PyTorch it pulls multiple wheels but in 2.21 it knows how to do the dependency resolution without doing that?? That sounds great!"
I think dependency resolution (i.e. generate-lockfiles) hasn't changed, but for anything actually using the dependencies, like pants check, it was previously unzipping and re-zipping the PyTorch wheels, and can now just use the raw wheels as downloaded.
numerous-pharmacist-91083
05/31/2024, 10:06 PM
generate-lockfiles could be faster, but pants check was indeed very slow, and it sounds like this will help. Building pex files was also quite slow. Do you know if it'll help with that as well?
I'm in the middle of a large refactor, but when done I'll try to bump the Pants version and let you know how it goes. Thanks.
numerous-pharmacist-91083
06/02/2024, 9:33 PM
14:08:41.43 [INFO] Completed: Generate lockfile for isort
14:12:04.87 [INFO] Completed: Generate lockfile for default
14:12:04.94 [INFO] Wrote lockfile for the resolve `default` to 3rdparty/python/default.lock
14:12:04.94 [INFO] Wrote lockfile for the resolve `isort` to 3rdparty/python/isort.lock
Note that it took over 3 minutes to generate the main lockfile.
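As a rough check on that figure, the gap between the two "Completed" lines in the log above can be computed directly. This is only a sketch: the helper name elapsed_seconds is made up for illustration, and it assumes both lockfile jobs started at about the same time, so the gap is at best a lower bound on how long the default lockfile took.

```python
from datetime import datetime

# Format of the timestamps at the start of each Pants log line (HH:MM:SS.ff).
FMT = "%H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day log timestamps (hypothetical helper)."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Timestamps copied from the isort and default "Completed" lines above.
# Assumption: both jobs started together, so this is a lower bound only.
gap = elapsed_seconds("14:08:41.43", "14:12:04.87")
print(f"{gap:.2f}s ({int(gap // 60)}m {gap % 60:.2f}s)")
```

The gap works out to a little under three and a half minutes, which is consistent with "over 3 minutes".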
Building a pex with PyTorch in it took 28 seconds. I think that's quite a bit better than before, but it still feels a little pokey. Running test, lint, and check on my repo after upgrading dependencies now takes about 15 minutes and, from the logs, most of that seems to be building the pex files for the tests, for mypy, etc. That is, logs like
Building 11 requirements for requirements.pex from the 3rdparty/python/default.lock resolve: Pillow, h5py, matplotlib, numpy, pytest, seaborn>0.12.2, setuptools, torch==2.2.1, torchmetrics, torchvisio... (23 characters truncated)
seem to dominate the build time.
wide-midnight-78598
06/02/2024, 10:42 PM
numerous-pharmacist-91083
06/03/2024, 12:25 AM
numerous-pharmacist-91083
06/03/2024, 12:25 AM
curved-manchester-66006
07/15/2024, 5:49 PM