hi all we just started trying to add pytorch spacey sentence Pants #general

victorious-zebra-49010

11/30/2023, 3:10 AM

hi all, we just started trying to add pytorch, spacy, sentence-transformers, and a host of other very large ml/ai-related 3rd party packages to our requirements recently (single resolver). pants has slowed considerably in a number of areas:
• pants generate-lockfiles - 4 minutes runtime upon any new requirement added
• slower PEX builds overall, affecting things like pants test on pre-existing targets unrelated to the ml libs
• the pants cache itself has been getting very large
  ◦ my ~/.cache/pants directory on my laptop is like 15GB
  ◦ the one in CI is like 12GB
• CI is basically taking double the time even on cache hits:
  ◦ the cache takes ~5 minutes to download and extract and ~1 minute to write to again, which ends up happening twice in our pipeline

the pyproject.toml:

[tool.poetry]
name = "main"
version = "0.1.0"
description = ""

[tool.poetry.dependencies]
python = "^3.11,<3.13"
urllib3 = "^2.0.6"
pandas = "2.1.3"
boto3  = "1.33.2"
loguru = "0.7.2"
psycopg2-binary = "2.9.9"
pydantic = "2.5.2"
s3transfer = "0.8.1"
setuptools = "68.2.2"
tqdm = "4.66.1"
pre-commit = "3.5.0"
python-dotenv = "1.0.0"
sqlalchemy = "2.0.23"
datadog = "0.47.0"
fastapi = "0.104.1"
gunicorn = "21.2.0"
uvicorn = "0.24.0.post1"
starlette = "0.27.0"
typing-extensions = "4.8.0"
furl = "2.1.3"
ddtrace = "2.3.1"
python-json-logger = "2.0.7"
pgvector = "0.2.4"
sentence-transformers = "2.2.2"
spacy-legacy = "3.0.12"
tokenizers = "0.15.0"
transformers = "4.35.2"
tiktoken = "0.5.1"
openai = "0.28.1"
langchain = "0.0.343"
numpy = "1.26.2"
unicodedata2= "15.1.0"
anthropic = "0.7.7"
toolz = "0.11.1"

[tool.poetry.group.dev.dependencies]
pytest = "7.4.3"
pytest-mock = "3.12.0"
gitpython = "3.1.40"
ipdb = "0.13.13"
httpx = "0.25.2"
docker = "6.1.3"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

[tool.isort]
profile = "black"
line_length = 88

[tool.black]
line-length = 88

some file sizes:

➜ du -h -d 1 ~/.cache/pants/named_caches/pex_root/
3.7G	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//installed_wheels
 18M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//bootstraps
1.7M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//interpreters
 32K	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//user_code
164K	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//built_wheels
1.9M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//unzipped_pexes
745M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//venvs
4.7M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//pip-20.3.4-patched.pex
4.7M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//pip-23.1.2.pex
2.6G	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//pip_cache
4.7M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//pip-23.0.1.pex
2.6G	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//downloads
433M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//installed_wheel_zips
2.8M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//isolated
1.7M	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root//bootstrap_zips
 10G	/Users/andrea.hutchinson/.cache/pants/named_caches/pex_root/
➜ du -h -d 1 ~/.pex
850M	/Users/andrea.hutchinson/.pex/installed_wheels
 13M	/Users/andrea.hutchinson/.pex/bootstraps
 24K	/Users/andrea.hutchinson/.pex/interpreters
6.4M	/Users/andrea.hutchinson/.pex/user_code
1.9M	/Users/andrea.hutchinson/.pex/unzipped_pexes
1.9M	/Users/andrea.hutchinson/.pex/isolated
873M	/Users/andrea.hutchinson/.pex
➜ du -sh dist/export/python/virtualenvs/main_resolver
989M	dist/export/python/virtualenvs/main_resolver

any recommendations on what can be done to speed up our workflow? does it make sense to bust caches? i see that https://pantsbuild.slack.com/archives/C046T6T9U/p1686754599977459 feels related
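to rerun the size breakdown above from a script (for example as a CI step, to see which part of the cache dominates), here is a minimal Python sketch. The function names are made up for illustration, and the default path is just the pex_root location from the du output above. It sums apparent file sizes, so hard-linked files are counted once per link and the totals may not match du exactly.

```python
import os
from pathlib import Path


def dir_size(path: Path) -> int:
    """Sum the sizes in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def largest_subdirs(base: Path) -> list[tuple[str, int]]:
    """Return (name, bytes) for each immediate subdirectory, largest first."""
    entries = [
        (child.name, dir_size(child))
        for child in base.iterdir()
        if child.is_dir() and not child.is_symlink()
    ]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    # Assumed location, taken from the du output in the message above.
    pex_root = Path.home() / ".cache" / "pants" / "named_caches" / "pex_root"
    if pex_root.is_dir():
        for name, size in largest_subdirs(pex_root):
            print(f"{size / 2**30:6.2f} GiB  {name}")
```

in the numbers posted above, installed_wheels, pip_cache and downloads account for most of the 10G.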

broad-processor-92400

11/30/2023, 3:38 AM

It's not a solution that helps right now, but there's active work going on that'll hopefully reduce the time and/or give more control:
• https://github.com/pantsbuild/pants/issues/20227
• https://github.com/pantsbuild/pex/issues/2292

@gorgeous-winter-99296 is involved with a lot of it, so might have some tips for the short-term

gorgeous-winter-99296

11/30/2023, 10:57 AM

Unfortunately nothing to add in this case! Definitely playing around with pex layouts etc can help, but they all have different tradeoffs. Using python_source for running in-repo as opposed to pex_binary is generally a huge time-saver. If you use download.pytorch.org for locks, consider setting up an index (or add to mine): https://tgolsson.github.io/torch-index/ -- the torch index is served directly from s3 and is very slow on a miss.
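One way to read the python_source vs pex_binary suggestion above, as a minimal BUILD-file sketch. The directory, target names and main.py are hypothetical. layout and execution_mode are pex_binary fields in recent Pants 2.x releases, but check the docs for your version, and treat the values here as something to experiment with, not a recommendation.

```python
# BUILD (hypothetical directory src/app; all names are illustrative only)

# Covers main.py and the other sources in this directory.
python_sources(name="lib")

# Packaged binary, only needed when you want a deployable artifact.
# "packed" stores each dependency as its own zip instead of one large zipapp,
# so unchanged third-party wheels can be reused across rebuilds.
# "venv" makes the PEX unpack itself into a virtualenv on first run.
pex_binary(
    name="app",
    entry_point="main.py",
    layout="packed",
    execution_mode="venv",
)
```

With a layout like this, pants run src/app/main.py runs the source file directly for local iteration without packaging the pex_binary, and pants package src/app:app builds the PEX only when the artifact is needed.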

victorious-zebra-49010

11/30/2023, 8:53 PM

sounds good y'all, i'll keep tabs on those issues. yeah we don't use our own index
