In docker it doesn't seem to be possible to not in...
# general
r
In Docker, it doesn't seem to be possible to avoid installing every library in a resolve when I only use a subset?
Installing lock.txt for the resolve 'default'
I've tried with
./pants --no-python-resolve-all-constraints package monorepo/my_lib:cli
But it is still installing the whole resolve (as per the log line above ^^). I'm on version
2.10.0rc4
Otherwise, to make full use of Docker layer caching, is it possible to have a step that installs the resolves?
# Be very selective about what we copy for docker caching
COPY pants pants.toml ./
# Bootstrap pants
RUN ./pants --version

# Copy dependency stuff that doesn't often change
COPY BUILD pyproject.toml lock.txt ./
RUN ./pants install-lock --resolve=default # <- something like this

# COPY src code that changes often
COPY monorepo/ ./monorepo/
RUN ./pants --no-python-resolve-all-constraints package monorepo/my_lib:cli
Just updated to the
2.10
docs and I've seen that I was using the incorrect option 😅 (
--no-python-resolve-all-constraints
is only valid when using constraints files). But I didn't find any similar option for resolves, and I'm not sure one exists 🤔
c
Constraints are not proper lockfiles, as they may be incomplete, and thus the options for the two differ slightly. I think the closest option for resolves is https://www.pantsbuild.org/v2.10/docs/reference-python#section-run-against-entire-lockfile. And from that, it suggests that it does not install the whole resolve when packaging. Could it be that what you notice as “the whole resolve” being “installed” is the building of the resolve's repository PEX, after which it “extracts” the relevant requirements as required for subsequent goals?
So if you have a resolve that’s giving you a lot of overhead since you don’t need as much, perhaps adding another smaller resolve for those parts is worth while?
r
I think the closest option for resolves are https://www.pantsbuild.org/v2.10/docs/reference-python#section-run-against-entire-lockfile
I don't think so. From my understanding, this option installs the whole resolve and uses all of it for the goal you are running (run, test, repl), whereas the default installs the whole resolve and uses only a subset for the goal.
> So if you have a resolve that’s giving you a lot of overhead since you don’t need as much, perhaps adding another smaller resolve for those parts is worth while?
A separate resolve will also add complexity, since we would need to duplicate dependencies between resolves and duplicate
python_sources
and
python_tests
everywhere, which can become a much bigger problem than the resolve overhead ^^
c
and duplicate
python_sources
and
python_tests
everywhere
OK, maybe target parametrization (which will be introduced in 2.11.0) will help reduce that complexity: https://www.pantsbuild.org/v2.11/docs/targets#parametrizing-targets
🤩 1
Besides that, I’ll let @hundreds-father-404 fill in the details here, as I don’t yet have much experience with the resolve feature of Pants.
r
For info, it takes 420 seconds to install the resolve dependencies, and 220s to extract the subset used by the target. Interested to know if there is some way to improve these numbers.
❗ 1
h
Wow, 220s to extract the subset? That is not expected. FYI @witty-crayon-22786 @happy-kitchen-89482, who have been doing some performance work related to that. Yeah, I don't recommend a separate dedicated resolve; even with
parametrize
, that is a whole lot of complexity. See the section "repository.pex" on https://github.com/pantsbuild/pants/pull/14740. tl;dr: it would be possible to not do this whole subsetting thing - it will be more code for us to maintain, but we can deal with that if it makes a meaningful impact in your repo. We try to bias towards a good user experience and then work backwards from there. However, you'd need to be using Pex lockfiles from 2.11; requirements.txt-style locks have to first install the whole lock, then subset. Note that Pex lockfiles overall should have better perf, too: https://github.com/pantsbuild/pants/pull/14771
🙏 1
r
Wow 220s to extract the subset? That is not expected. fyi @witty-crayon-22786 @happy-kitchen-89482 who have been doing some performance work related to that
The full logs, in case they are helpful (running on a p2x instance on AWS, inside Docker, with Python 3.8 and Pants 2.10.0rc4):
415.46s       Installing lock.txt for the resolve `default`
15:17:59.81 [INFO] Completed: Installing lock.txt for the resolve `default`
15:17:59.82 [INFO] Starting: Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0... (302 characters truncated)
15:19:07.05 [INFO] Long running tasks:
  67.23s        Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
15:19:37.08 [INFO] Long running tasks:
  97.26s        Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
15:20:07.12 [INFO] Long running tasks:
  127.30s       Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
15:20:37.14 [INFO] Long running tasks:
  157.32s       Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
15:21:07.17 [INFO] Long running tasks:
  187.35s       Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
15:21:37.21 [INFO] Long running tasks:
  217.39s       Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
15:22:00.12 [INFO] Completed: Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0... (302 characters truncated)
15:22:01.85 [INFO] Wrote dist/farmwisecv.libs.models/cli.pex
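For reference, the wall-clock cost of the extraction step can be read straight off the timestamps in the log above. This is only a small sketch: the helper name `elapsed_seconds` is made up for illustration, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two HH:MM:SS.ff log timestamps (assumes the same day)."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# Timestamps copied from the log lines above.
extract_started = "15:17:59.82"    # "Starting: Extracting 16 requirements ..."
extract_completed = "15:22:00.12"  # "Completed: Extracting 16 requirements ..."

extract_seconds = elapsed_seconds(extract_started, extract_completed)
print(f"extraction step: {extract_seconds:.2f}s")
```

In this particular run the step spans about 240 seconds between its `Starting` and `Completed` lines.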
w
yea, that’s really surprising. the “extracting” step doesn’t hit the network, and should effectively just be hardlinking a bunch of files. since you’re able to repro fairly easily, would you mind setting
--pex-verbosity=3
to try and get some more timing information from PEX?
👍 1
r
Tail of the log with --pex-verbosity=3; I'll try to generate a log.txt file if that is not enough
w


pex: Zipping PEX file.
pex: Zipping PEX file.: 221244.8ms
that is exactly what it sounds like: https://github.com/pantsbuild/pex/blob/a8c681a8e5beeb703f8516c5a46695c3d2705f4d/pex/pex_builder.py#L739-L752
👀 1
are you able to share the full and subsetted requirements lists on a PEX ticket, or in a DM? being able to reproduce this is probably the next step
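To illustrate why a "Zipping PEX file" step can dominate for multi-gigabyte dependency sets, here is a rough, standalone sketch that uses only the Python standard library. It is not PEX's code. It assumes the zipapp is written with deflate compression, and the payload is a made-up stand-in for real wheels, so the absolute numbers mean nothing; only the compress-versus-store contrast is the point.

```python
import io
import time
import zipfile
from typing import Dict, Tuple


def build_zip(payload: Dict[str, bytes], compression: int) -> Tuple[bytes, float]:
    """Write payload into an in-memory zip; return (archive bytes, seconds taken)."""
    buffer = io.BytesIO()
    start = time.perf_counter()
    with zipfile.ZipFile(buffer, mode="w", compression=compression) as archive:
        for name, data in payload.items():
            archive.writestr(name, data)
    return buffer.getvalue(), time.perf_counter() - start


# Hypothetical stand-in for a large dependency tree: repetitive, compressible data.
payload = {f"pkg/module_{i}.bin": b"weights-" * 4096 for i in range(64)}

deflated, deflate_seconds = build_zip(payload, zipfile.ZIP_DEFLATED)
stored, store_seconds = build_zip(payload, zipfile.ZIP_STORED)

print(f"deflated: {len(deflated)} bytes in {deflate_seconds:.4f}s")
print(f"stored:   {len(stored)} bytes in {store_seconds:.4f}s")
```

Every byte of every dependency passes through the compressor each time the archive is rebuilt, so the cost grows with the total size of the PEX rather than with the size of the change.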
r
Here are the full requirements:
the subset-ted requirements are:
157.32s       Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11
w
great, thanks!
so, one more thing that would help with the reproduction (sorry!) would be the output of the build with
-ldebug
(but without
--pex-verbosity
), as that would give us the exact PEX invokes
r
18:17:13.30 [DEBUG] Completed: pants.backend.python.util_rules.python_sources.prepare_python_sources
18:17:13.30 [DEBUG] Completed: Hit: Local cache lookup: Building local_dists.pex
18:17:13.30 [DEBUG] Completed: Scheduling: Building local_dists.pex
18:17:13.30 [DEBUG] Completed: pants.backend.python.util_rules.pex.build_pex
18:17:13.30 [DEBUG] Completed: pants.backend.python.util_rules.python_sources.strip_python_sources
18:17:13.31 [DEBUG] Completed: Hit: Local cache lookup: Installing lock.txt for the resolve `default`
18:17:13.31 [DEBUG] Completed: Scheduling: Installing lock.txt for the resolve `default`
18:17:13.31 [DEBUG] Completed: pants.backend.python.util_rules.pex.build_pex
18:17:13.31 [DEBUG] Completed: pants.backend.python.util_rules.pex_from_targets.create_pex_from_targets
18:17:13.31 [DEBUG] Running Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11 under semaphore with concurrency id: 5, and concurrency: 8
18:17:15.55 [DEBUG] spawned local process as Some(257851) for Process { argv: ["/home/obendidi/.pyenv/versions/3.10.1/bin/python", "./pex", "--tmpdir", ".tmp", "--jobs", "8", 
    "--python-path", "....", "--output-file", "farmwisecv.libs.models/cli.pex", "--no-emit-warnings", "--manylinux", "manylinux2014", "--venv", "prepend", "--requirements-pex", "local_dists.pex", "--pex-repository", "default_lockfile.pex", "--interpreter-constraint", "CPython==3.8.*", "--entry-point", "farmwisecv.libs.models.cli", "--sources-directory=source_files", "albumentations<2.0.0,>=1.1.0", "boto3==1.20.24", "botocore==1.23.24", "docstring-parser<0.14.0,>=0.13", "jpeg4py<0.2.0,>=0.1.4", "jsonargparse[signatures]<5.0.0,>=4.3.1", "numpy<2.0.0,>=1.22.3", "nvidia-dali-cuda110>=1.11.0", "opencv-python<5.0.0,>=4.5.5", "pytorch-lightning<2.0.0,>=1.5.10", "requests<3.0.0,>=2.27.1", "s3fs==2022.2.0", "setuptools==59.5.0", "torch<2.0.0,>=1.10.2", "torchvision<0.12.0,>=0.11.3", "wandb<0.13.0,>=0.12.11", "--layout", "zipapp"], env: {"CPPFLAGS": "", "LANG": "en_GB.UTF-8", "LDFLAGS": "", "PATH": ".......", "PEX_IGNORE_RCFILES": "true", "PEX_PYTHON_PATH": ".....", "PEX_ROOT": ".cache/pex_root"}, working_directory: None, input_digests: InputDigests { complete: Digest { hash: Fingerprint<e6edfd3aa32fed2307355fbc864ff791a18c7c520acab0906a18a5956ccb2385>, size_bytes: 432 }, nailgun: Digest { hash: Fingerprint<e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855>, size_bytes: 0 }, input_files: Digest { hash: Fingerprint<e6edfd3aa32fed2307355fbc864ff791a18c7c520acab0906a18a5956ccb2385>, size_bytes: 432 }, immutable_inputs: {}, use_nailgun: [] }, output_files: {RelativePath("farmwisecv.libs.models/cli.pex")}, output_directories: {}, timeout: None, execution_slot_variable: None, concurrency_available: 16, description: "Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0.2.0,>=0.1.4, jsonargparse[signatures]<5.0.0,>=4.3.1, numpy<2.0.0,>=1.22.3, nvidia-dali-cuda110>=1.11.0, opencv-python<5.0.0,>=4.5.5, 
pytorch-lightning<2.0.0,>=1.5.10, requests<3.0.0,>=2.27.1, s3fs==2022.2.0, setuptools==59.5.0, torch<2.0.0,>=1.10.2, torchvision<0.12.0,>=0.11.3, wandb<0.13.0,>=0.12.11", level: Info, append_only_caches: {CacheName("pex_root"): RelativePath(".cache/pex_root")}, jdk_home: None, platform_constraint: Some(Linux_x86_64), cache_scope: Successful }
18:19:51.71 [INFO] Completed: Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.13, jpeg4py<0... (302 characters truncated)
18:19:51.71 [DEBUG] Completed: Scheduling: Extracting 16 requirements to build farmwisecv.libs.models/cli.pex from default_lockfile.pex: albumentations<2.0.0,>=1.1.0, boto3==1.20.24, botocore==1.23.24, docstring-parser<0.14.0,>=0.1... (314 characters truncated)
18:19:51.71 [DEBUG] Completed: pants.backend.python.util_rules.pex.build_pex
18:19:51.71 [DEBUG] Completed: pants.backend.python.goals.package_pex_binary.package_pex_binary
18:19:53.03 [INFO] Wrote dist/farmwisecv.libs.models/cli.pex
18:19:53.03 [DEBUG] Completed: `package` goal
18:19:53.03 [DEBUG] computed 1 nodes in 161.238908 seconds. there are 2326 total nodes.
w
@rapid-crayon-8232: John pointed out on the ticket that this is apparently about the amount of time that might be expected to compress/recompress that much data. but he also pointed out that there is a more efficient “layout” supported for large applications. in this case, you should see whether you can get better performance with the
packed
layout: https://www.pantsbuild.org/docs/reference-pex_binary#codelayoutcode
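As a sketch of what that could look like in a BUILD file: the target name matches the `monorepo/my_lib:cli` address used earlier in the thread, but the file location and `entry_point` value are assumptions for illustration only.

```python
# monorepo/my_lib/BUILD (hypothetical; adjust name and entry_point to the real target)
pex_binary(
    name="cli",
    entry_point="cli.py",  # assumed entry point, not taken from the thread
    layout="packed",       # the default is "zipapp", as seen in the debug log
)
```

With the packed layout the build output should be a directory rather than a single `.pex` file, so any Dockerfile `COPY` step that expects one file would likely need adjusting.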
r
makes sense, the pex is indeed a few gigs big, I'll try out the packed layout. Thanks for the help 😊
h
Stu, I continue to think we should revisit the default
execution_mode
. In the past, John suggested we make it a required field to encourage users to think about what they want. That's good for
package
, annoying if you only want it for
run
w
yea, changing the default in one of the upcoming releases could make sense. not sure how to do that without a bunch of disruption, but probably worthwhile.
we could potentially trigger a warning when a built PEX is a particular size, and have people silence the warning by setting it explicitly?
👍 1