# general
a
I'm building a pex binary that excludes some dependencies, but they're still showing up in the artifact.
pex_binary (
    name="predictor",
    inherit_path="fallback",
    layout="zipapp",
    dependencies=[
        "src/dz/recsys/on_lambda.py",
        "3rdparty/python:psycopg2-binary",
        "3rdparty/python#scikit-learn",
        "!!3rdparty/python#boto3",
        "!!3rdparty/python#aws-lambda-powertools",
        "!!3rdparty/python#boto3-stubs",
        "!!3rdparty/python:jmespath",
        "!!3rdparty/python:pydantic",
        "!!3rdparty/python:urllib3",
        "!!3rdparty/python#psycopg2",
    ]
)
❯ pants dependencies --transitive src/dz/recsys:predictor
3rdparty/python#cloudpathlib
3rdparty/python#numpy
3rdparty/python#pandas
3rdparty/python#pandas-stubs
3rdparty/python#scikit-learn
3rdparty/python#sentry-sdk
3rdparty/python/pyproject.toml
3rdparty/python:psycopg2-binary
lockfiles/python-default.lock:python-default
src/...
But if I package the target and inspect the contents of .deps, I'm still seeing boto3 in the packaged file:
❯ du -h -d2
2.4M	./.bootstrap/pex
2.4M	./.bootstrap
1.1M	./.deps/boto3-1.26.103-py3-none-any.whl
76M	./.deps/botocore-1.29.103-py3-none-any.whl
312K	./.deps/certifi-2022.12.7-py3-none-any.whl
184K	./.deps/cloudpathlib-0.10.0-py3-none-any.whl
112K	./.deps/jmespath-1.0.1-py3-none-any.whl
1.3M	./.deps/joblib-1.2.0-py3-none-any.whl
59M	./.deps/numpy-1.22.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
43M	./.deps/pandas-1.5.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
1.2M	./.deps/pandas_stubs-1.4.3.220710-py3-none-any.whl
7.2M	./.deps/psycopg2_binary-2.9.5-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
524K	./.deps/python_dateutil-2.8.2-py2.py3-none-any.whl
2.7M	./.deps/pytz-2023.3-py2.py3-none-any.whl
336K	./.deps/s3transfer-0.6.0-py3-none-any.whl
32M	./.deps/scikit_learn-1.2.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
123M	./.deps/scipy-1.7.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
820K	./.deps/sentry_sdk-1.18.0-py2.py3-none-any.whl
60K	./.deps/six-1.16.0-py2.py3-none-any.whl
72K	./.deps/threadpoolctl-3.1.0-py3-none-any.whl
528K	./.deps/urllib3-1.26.15-py2.py3-none-any.whl
346M	./.deps
4.0K	./__pex__
428K	./src/customers
44K	./src/data
84K	./src/dz
8.0K	./src/learning
16K	./src/meta
580K	./src
440M	.
jmespath, boto3, and urllib3 are still present in the pex despite being explicitly excluded. Any idea how I can diagnose this?
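One way to narrow this down is to ask each distribution in the unpacked PEX which requirements it declares. The sketch below is only an illustration: `find_requirers` is a hypothetical helper, and it assumes each directory under `.deps` holds a `*.dist-info/METADATA` file with standard `Requires-Dist` headers. It does not evaluate environment markers or extras, so a hit only means the requirement is declared, not that it was active in the resolve.

```python
import re
from email.parser import Parser
from pathlib import Path


def normalize(name):
    """Normalize a project name so 'Boto3' and 'boto3' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requires_dist):
    """Extract the project name from a Requires-Dist value."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requires_dist)
    return normalize(match.group(1)) if match else ""


def find_requirers(deps_dir, target):
    """Return (distribution name, requirement) pairs that declare `target`."""
    target = normalize(target)
    hits = []
    # Assumed layout: <deps_dir>/<wheel name>/<name>-<version>.dist-info/METADATA
    for metadata_path in sorted(Path(deps_dir).glob("*/*.dist-info/METADATA")):
        metadata = Parser().parsestr(metadata_path.read_text(encoding="utf-8"))
        for requirement in metadata.get_all("Requires-Dist") or []:
            if requirement_name(requirement) == target:
                hits.append((metadata["Name"], requirement))
    return hits
```

Calling `find_requirers(".deps", "boto3")` from the root of the unpacked PEX would list every bundled distribution that declares boto3 as a requirement, which points at whatever is pulling it in transitively.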
e
I'm almost positive excludes only work for 1st party deps and direct 3rd party deps. Pants has no knowledge of the 3rd party dependency graph here and farms that out to Pex / Pip. So I think this needs https://github.com/pantsbuild/pex/issues/2097
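A toy model of why that happens, with made-up package names and a hypothetical `resolve` function standing in for what Pex / Pip do (this is not their actual code): the exclude only removes a requirement from the roots handed to the resolver, and the resolver then pulls the same distribution back in through another dependency.

```python
def resolve(roots, requires):
    """Return every distribution reachable from the root requirements."""
    resolved, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name not in resolved:
            resolved.add(name)
            stack.extend(requires.get(name, ()))
    return resolved


# Hypothetical 3rd party graph: app-lib itself depends on cloud-sdk.
requires = {"app-lib": ["cloud-sdk"], "cloud-sdk": ["http-lib"]}
roots = {"app-lib", "cloud-sdk"}
excluded = {"cloud-sdk"}

# The exclude only filters the direct requirements, not the resolved closure.
resolved = resolve(roots - excluded, requires)
```

Here `cloud-sdk` is still in `resolved`, because `app-lib` requires it and the exclude never reaches that edge of the graph.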
a
Makes sense. What's the difficulty rating on that issue? I'm blocked by this, so I could try and make myself useful.
e
The `--ignore-errors` route should not require any new fundamental code. The existing `Distribution` class provides enough to do a graph walk to implement transitive excludes, which is the more involved case. Non-transitive excludes should be a good bit more straightforward. I do think both modes should ship with the feature though.
The big things to keep in mind with Pex:
+ It supports 2.7 and 3.5-3.11
+ No 3rdparty deps allowed (it vendors some but adding more is a semi big deal)
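A rough sketch of the two exclude modes as a graph walk. Everything here is an assumption for illustration: the graph is a plain dict rather than Pex's `Distribution` objects, `reachable` and `exclude` are hypothetical names, and "transitive" is read as "drop the excluded distribution plus anything reachable only through it". It sticks to the standard library and to syntax that also parses on Python 2.7, in line with the constraints above.

```python
def reachable(roots, requires, blocked=frozenset()):
    """Walk the graph from roots, never entering a blocked distribution."""
    seen = set()
    stack = [name for name in roots if name not in blocked]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(dep for dep in requires.get(name, ()) if dep not in blocked)
    return seen


def exclude(roots, requires, excluded, transitive):
    """Return the distributions to keep after applying the excludes."""
    excluded = frozenset(excluded)
    if transitive:
        # Anything reachable only through an excluded distribution drops out.
        return reachable(roots, requires, blocked=excluded)
    # Non-transitive: resolve as usual, then remove just the named distributions.
    return reachable(roots, requires) - excluded


# Hypothetical graph: http-lib is needed by both app and cloud-sdk.
requires = {"app": ["cloud-sdk", "http-lib"], "cloud-sdk": ["http-lib", "signer"]}
non_transitive = exclude(["app"], requires, ["cloud-sdk"], transitive=False)
transitive = exclude(["app"], requires, ["cloud-sdk"], transitive=True)
```

In this example the non-transitive mode keeps `signer` even though nothing left in the PEX needs it, while the transitive mode drops it but keeps `http-lib`, since `app` still requires that directly.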
a
Okay... Let me take a look and see if I can get the non-transitive case to work. If so, I'll open a draft PR and we can figure out what's involved for the transitive case. If not, nobody's time is wasted but mine.
e
Sounds great. Thanks for taking a whack at this.
a
np, I have selfish motivations 😂