Hi everyone, I'm facing some issue with pants buil...
# general
a
Hi everyone, I'm facing an issue with Pants building the lambda. It was working fine previously but suddenly got stuck and hasn't worked since. I have tried restarting pantsd, clearing the Pants cache, and restarting my computer as well. Attached are the screenshot, debug logs, and Pex verbose logs. cc @faint-businessperson-86903 Would appreciate any help in debugging this!
h
Hello! When you say "working previously", did something change like upgrading to a new Pants version or changing the code? Do you remember when this last worked?
a
@hundreds-father-404 Nothing changed with the Pants version or configuration. It suddenly stopped working, seemingly at random, and only on my setup. My teammates were able to get it working with the same code changes.
h
How long does it normally take to build the lambda?
a
It usually takes around 20s or less.
🤔 1
h
with the same code changes
If you go back to before the change, does it work quickly? I see you're on macOS. Are you using an M1?
a
Nothing changed even after reverting the changes. Looks like it's something to do with my setup. I use Intel i7.
h
Hmmm, fishy. I was hoping that you were on an M1, because often there are no prebuilt wheels for macOS ARM, only sdists, so builds take way longer than they do for coworkers on Intel. But that's not it.
e
Can you try this? `rm -rf dist/`, run the `./pants package ...` command, go get coffee, and after 10 minutes or so check the contents of `dist/`. There was a pure display bug in Rust code IIRC that was recently fixed. It may be that package runs but the display looks like it hangs.
👍 1
It would be helpful to know what version of Pants you're using as well. I couldn't divine that from the logs.
➕ 1
a
@enough-analyst-54434 I'm using pants version 2.7.0. I will let you know the contents of dist/ shortly
e
Ok. 2.7.0 was an anomaly where we were using a new resolve mechanism that was reverted. I don't think that's at play here, but upgrading to 2.7.2 would get you past that anomaly: https://pypi.org/project/pantsbuild.pants/2.7.2/
And 2.7.1rc1 had a cherry-pick of the commit I referenced above as well, so 2.7.2 is a good bet.
a
I did `rm -rf dist/`, ran the `./pants package ...` command, and checked after a few minutes. The dist directory is still empty! `no such file or directory: dist`
I will try 2.7.2 now
Upgraded Pants to 2.7.2. Still no luck! It's interesting that the dist/ directory doesn't get created either, and there are no errors. It just gets stuck at building the lambda.
e
Hrm.
a
I'm guessing it's due to the size of the lambda. I was able to build a smaller one. Any suggestions on how to handle a large lambda?
e
Well, but you said others using the same repo could build the lambda, right?
👍 1
Does your machine have less RAM than others? Has your machine maybe been running longer than others? The only thing I can imagine besides some form of deadlock in Pants that only happens for you and only for this lambda is perhaps page cache thrash if you have most of your memory used up on the machine. That could allow things to proceed without failing, but very slowly.
a
@enough-analyst-54434 Let me check with my team mate on RAM and other details and get back to you.
f
@enough-analyst-54434 Is there an easy way to measure the page cache thrash, using something like inotify-tools or some other method?
a
@enough-analyst-54434 Apparently I have less RAM than @faint-businessperson-86903, who was able to build the lambda successfully on his Linux machine. Is there a better way to build a large lambda like this?
This is the process that hangs: /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/Resources/Python.app/Contents/MacOS/Python /Users/suriya/.cache/pants/named_caches/pex_root/unzipped_pexes/592827b562c4aa09c1e9af6b3ed7b797eef1a4dc --python-path /Users/suriya/.poetry/bin/usr/local/opt/libpq/bin/Users/suriya/.pyenv/shims/usr/local/bin/usr/bin/bin/usr/sbin/sbin/usr/local/munki:/Users/suriya/.pulumi/bin --tmpdir .tmp --output-file src.pdl.internal_products.sales_dash/server-lambda.zip --manylinux=manylinux2014 --resolve-local-platforms --no-pypi --index=https://pypi.org/simple/ --resolver-version pip-2020-resolver --platform linux_x86_64-cp-38-cp38 --no-emit-warnings --jobs 6 --manylinux manylinux2014 --sources-directory=source_files PyYAML<6.0.0,>=5.4.1 arn<0.2.0,>=0.1.5 aws-psycopg2<2.0.0,>=1.2.1 boltons<21.0.0,>=20.2.1 dataclasses<0.9.0,>=0.8; python_version < "3.7" django-crispy-forms<2.0.0,>=1.13.0 django-taggit<2.0.0,>=1.1 django<4.0.0,>=3.1.1 elasticsearch==7.12.0 jsonschema<4.0.0,>=3.0 marshmallow-dataclass<9.0.0,>=8.2.0 orjson<4.0.0,>=3.3.1 phonenumbers<9.0.0,>=8.12 prettytable<3.0.0,>=2.4.0 psycopg2-binary<3.0.0,>=2 python-box<6.0.0,>=5.4.1 redis-py-cluster<3.0.0,>=2.0.0 requests<3.0.0,>=2.24 sendgrid<7.0.0,>=6.9.1 sentry-sdk<2.0.0,>=1.5.0 serverless-wsgi<3.0.0,>=2.0.2 simple-salesforce<2.0.0,>=1.10.1 slackclient<3.0.0,>=2.9 smart-open[s3]<6.0.0,>=5.2.1 stripe<3.0.0,>=2.60.0 urllib3<2.0.0,>=1.25.10 whitenoise<6.0.0,>=5.3.0 yarl<2.0.0,>=1.7.2
And if we exclude 2 specific packages (sendgrid and sentry-sdk), the lambda gets built successfully for me. But unfortunately we need these 2 pkgs for our application.
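The exclusion experiment above (dropping sendgrid and sentry-sdk) can be done systematically. Below is a minimal sketch in Python, assuming a hypothetical `build_ok` callback that runs the Pex command for a given requirement list and returns False on an error or a hang. Neither `minimal_failing_subset` nor `build_ok` comes from Pants or Pex.

```python
from typing import Callable, List, Sequence


def minimal_failing_subset(
    requirements: Sequence[str],
    build_ok: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Greedily shrink a failing requirement list to a smaller one that still fails.

    Assumes the full list fails. `build_ok` is a hypothetical callback: it should
    try to build with the given requirements and return True only on success.
    """
    failing = list(requirements)
    for req in list(requirements):
        candidate = [r for r in failing if r != req]
        # If the build still fails without `req`, then `req` is not needed to
        # reproduce the problem, so drop it from the failing set.
        if candidate and not build_ok(candidate):
            failing = candidate
    return failing


if __name__ == "__main__":
    # Stand-in for a real build: pretend any list containing sendgrid fails.
    def fake_build_ok(reqs: Sequence[str]) -> bool:
        return not any(r.startswith("sendgrid") for r in reqs)

    reqs = [
        "requests<3.0.0,>=2.24",
        "sendgrid<7.0.0,>=6.9.1",
        "sentry-sdk<2.0.0,>=1.5.0",
    ]
    print(minimal_failing_subset(reqs, fake_build_ok))
```

Each call to `build_ok` is a full resolve, so in practice the callback would want a timeout, and the result is only as trustworthy as the assumption that a failing list keeps failing when more requirements are added.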
e
I'm at a bit of a loss, this is not a big resolve:
$ python -mvenv pex.venv
$ pex.venv/bin/pip -q install -U pip
$ pex.venv/bin/pip install pex
Collecting pex
  Using cached pex-2.1.56-py2.py3-none-any.whl (2.6 MB)
Installing collected packages: pex
Successfully installed pex-2.1.56
$ rm -rf ~/.pex && /usr/bin/time --verbose pex.venv/bin/pex --tmpdir .tmp --output-file src.pdl.internal_products.sales_dash/server-lambda.zip --manylinux=manylinux2014 --resolve-local-platforms --no-pypi --index=https://pypi.org/simple/ --resolver-version pip-2020-resolver --platform linux_x86_64-cp-38-cp38 --no-emit-warnings --jobs 6 --manylinux manylinux2014 --sources-directory=source_files "PyYAML<6.0.0,>=5.4.1" "arn<0.2.0,>=0.1.5" "aws-psycopg2<2.0.0,>=1.2.1" "boltons<21.0.0,>=20.2.1" 'dataclasses<0.9.0,>=0.8; python_version < "3.7"' "django-crispy-forms<2.0.0,>=1.13.0" "django-taggit<2.0.0,>=1.1" "django<4.0.0,>=3.1.1" "elasticsearch==7.12.0" "jsonschema<4.0.0,>=3.0" "marshmallow-dataclass<9.0.0,>=8.2.0" "orjson<4.0.0,>=3.3.1" "phonenumbers<9.0.0,>=8.12" "prettytable<3.0.0,>=2.4.0" "psycopg2-binary<3.0.0,>=2" "python-box<6.0.0,>=5.4.1" "redis-py-cluster<3.0.0,>=2.0.0" "requests<3.0.0,>=2.24" "sendgrid<7.0.0,>=6.9.1" "sentry-sdk<2.0.0,>=1.5.0" "serverless-wsgi<3.0.0,>=2.0.2" "simple-salesforce<2.0.0,>=1.10.1" "slackclient<3.0.0,>=2.9" "smart-open[s3]<6.0.0,>=5.2.1" "stripe<3.0.0,>=2.60.0" "urllib3<2.0.0,>=1.25.10" "whitenoise<6.0.0,>=5.3.0" "yarl<2.0.0,>=1.7.2"
	Command being timed: "pex.venv/bin/pex --tmpdir .tmp --output-file src.pdl.internal_products.sales_dash/server-lambda.zip --manylinux=manylinux2014 --resolve-local-platforms --no-pypi --index=https://pypi.org/simple/ --resolver-version pip-2020-resolver --platform linux_x86_64-cp-38-cp38 --no-emit-warnings --jobs 6 --manylinux manylinux2014 --sources-directory=source_files PyYAML<6.0.0,>=5.4.1 arn<0.2.0,>=0.1.5 aws-psycopg2<2.0.0,>=1.2.1 boltons<21.0.0,>=20.2.1 dataclasses<0.9.0,>=0.8; python_version < "3.7" django-crispy-forms<2.0.0,>=1.13.0 django-taggit<2.0.0,>=1.1 django<4.0.0,>=3.1.1 elasticsearch==7.12.0 jsonschema<4.0.0,>=3.0 marshmallow-dataclass<9.0.0,>=8.2.0 orjson<4.0.0,>=3.3.1 phonenumbers<9.0.0,>=8.12 prettytable<3.0.0,>=2.4.0 psycopg2-binary<3.0.0,>=2 python-box<6.0.0,>=5.4.1 redis-py-cluster<3.0.0,>=2.0.0 requests<3.0.0,>=2.24 sendgrid<7.0.0,>=6.9.1 sentry-sdk<2.0.0,>=1.5.0 serverless-wsgi<3.0.0,>=2.0.2 simple-salesforce<2.0.0,>=1.10.1 slackclient<3.0.0,>=2.9 smart-open[s3]<6.0.0,>=5.2.1 stripe<3.0.0,>=2.60.0 urllib3<2.0.0,>=1.25.10 whitenoise<6.0.0,>=5.3.0 yarl<2.0.0,>=1.7.2"
	User time (seconds): 58.44
	System time (seconds): 6.15
	Percent of CPU this job got: 127%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:50.51
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 154516
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 1
	Minor (reclaiming a frame) page faults: 1687687
	Voluntary context switches: 23697
	Involuntary context switches: 1932
	Swaps: 0
	File system inputs: 0
	File system outputs: 1064592
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
$ ls -lh src.pdl.internal_products.sales_dash/server-lambda.zip 
-rwxr-xr-x 1 jsirois jsirois 53M Nov 29 18:05 src.pdl.internal_products.sales_dash/server-lambda.zip
@faint-businessperson-86903 and @agreeable-shampoo-91351 the more information you can dump about the differences in your machines, the better. That will help in this async debugging.
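For comparing machines, the `/usr/bin/time --verbose` report above can be reduced to a few numbers. Here is a rough Python sketch, assuming the GNU time output format shown in this thread; `parse_time_verbose` and `elapsed_seconds` are hypothetical helper names, and the sample text is copied from the run above.

```python
from typing import Dict


def parse_time_verbose(text: str) -> Dict[str, str]:
    """Map each 'Label: value' line of GNU time --verbose output to its value."""
    stats: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the last ': ' so labels that contain colons stay intact.
        label, sep, value = line.strip().rpartition(": ")
        if sep:
            stats[label] = value
    return stats


def elapsed_seconds(clock: str) -> float:
    """Convert 'h:mm:ss' or 'm:ss' (seconds may be fractional) to seconds."""
    total = 0.0
    for part in clock.split(":"):
        total = total * 60 + float(part)
    return total


# Three lines taken from the timing report earlier in the thread.
SAMPLE = """\
\tElapsed (wall clock) time (h:mm:ss or m:ss): 0:50.51
\tMaximum resident set size (kbytes): 154516
\tExit status: 0
"""

if __name__ == "__main__":
    stats = parse_time_verbose(SAMPLE)
    wall = elapsed_seconds(stats["Elapsed (wall clock) time (h:mm:ss or m:ss)"])
    rss_mib = int(stats["Maximum resident set size (kbytes)"]) / 1024
    print(f"wall={wall:.2f}s max_rss={rss_mib:.0f}MiB exit={stats['Exit status']}")
```

Note that `--verbose` is a GNU time flag, so this applies to the Linux run; the time command that ships with macOS reports different fields.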
a
@enough-analyst-54434 I tried the steps you specified with venv and pex and it hangs too !
Attaching more information on my system and memory,
Screen Shot 2021-11-30 at 4.54.40 PM.png
e
16GB should be more than enough. @agreeable-shampoo-91351 you're going to have to dig.
a
I use pyenv to manage Python versions; not sure if that could be a problem.
e
It shouldn't be.
Maybe try switching --resolver-version in the Pex command to pip-legacy-resolver. The 2020 resolver is known to be slower, especially in certain back-tracking scenarios.
And also add -vvvvvvvvv (9 v's) to the Pex command. That will provide detailed logging and hopefully reveal where the hang is. If you could run the Pex command, then provide the full output after progress has stopped plus a few minutes more, that could be revealing.
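To tell a hang from a slow resolve, the verbose run can be wrapped in a deadline so that whatever was logged before the cutoff is kept. This is a minimal sketch using only the Python standard library; `run_with_deadline` is a hypothetical helper, and the stand-in command would be replaced by the Pex invocation you are already running.

```python
import subprocess
import sys
from typing import List, Optional, Tuple


def run_with_deadline(cmd: List[str], timeout_s: float) -> Tuple[Optional[int], str]:
    """Run cmd and return (exit code, combined stdout/stderr).

    The exit code is None if the command was killed for exceeding timeout_s;
    in that case the text is whatever output was captured before the kill.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # verbose logging usually goes to stderr
            text=True,
            timeout=timeout_s,
        )
        return proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        partial = exc.output or ""
        # Partial output may arrive as bytes even when text=True was requested.
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return None, partial


if __name__ == "__main__":
    # Stand-in command; swap in the real Pex command line plus -vvvvvvvvv.
    code, output = run_with_deadline([sys.executable, "-c", "print('hello')"], timeout_s=10)
    print(code, output.strip())
```

The last lines of the captured output are the interesting part: they show which step was running when progress stopped.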
a
@enough-analyst-54434 I tried with the 9 v's option and it looks like I get an error with the version
ex: Patching environment markers for DistributionTarget(platform=Platform(platform='linux_x86_64', impl='cp', version='38', abi='cp38')) with {'implementation_name': 'cpython', 'os_name': 'posix', 'platform_machine': 'x86_64', 'platform_system': 'Linux', 'sys_platform': 'linux', 'platform_python_implementation': 'CPython', 'python_version': '3.8'}
ERROR: Could not find a version that satisfies the requirement starkbank-ecdsa>=2.0.1 (from sendgrid<7.0.0,>=6.9.1) (from versions: none)
ERROR: No matching distribution found for starkbank-ecdsa>=2.0.1 (from sendgrid<7.0.0,>=6.9.1)
pid 31832 -> /Users/suriya/.pex/venvs/f4fe22a38ad1716580d169deb527663c98f3b86d/6ef51b88542fadfcd7979bfa8b96e10c12355019/pex --disable-pip-version-check --no-python-version-warning --exists-action a --use-deprecated legacy-resolver --isolated -vvv --cache-dir /Users/suriya/.pex --log /Users/suriya/.tmp/tmpeqarzwf5/pip.log download --dest /Users/suriya/.tmp/tmp9mvc9p9k/linux_x86_64-cp-38-cp38 --platform manylinux2014_x86_64 --platform linux_x86_64 --implementation cp --python-version 38 --abi cp38 --only-binary all PyYAML<6.0.0,>=5.4.1 arn<0.2.0,>=0.1.5 aws-psycopg2<2.0.0,>=1.2.1 boltons<21.0.0,>=20.2.1 dataclasses<0.9.0,>=0.8; python_version < "3.7" django-crispy-forms<2.0.0,>=1.13.0 django-taggit<2.0.0,>=1.1 django<4.0.0,>=3.1.1 elasticsearch==7.12.0 jsonschema<4.0.0,>=3.0 marshmallow-dataclass<9.0.0,>=8.2.0 orjson<4.0.0,>=3.3.1 phonenumbers<9.0.0,>=8.12 prettytable<3.0.0,>=2.4.0 psycopg2-binary<3.0.0,>=2 python-box<6.0.0,>=5.4.1 redis-py-cluster<3.0.0,>=2.0.0 requests<3.0.0,>=2.24 sendgrid<7.0.0,>=6.9.1 sentry-sdk<2.0.0,>=1.5.0 serverless-wsgi<3.0.0,>=2.0.2 simple-salesforce<2.0.0,>=1.10.1 slackclient<3.0.0,>=2.9 smart-open[s3]<6.0.0,>=5.2.1 stripe<3.0.0,>=2.60.0 urllib3<2.0.0,>=1.25.10 whitenoise<6.0.0,>=5.3.0 yarl<2.0.0,>=1.7.2 --index-url https://pypi.org/simple/ --retries 5 --timeout 15 exited with 1 and STDERR: None
Command exited with non-zero status 1
And this is with the legacy resolver. With 2020 resolver it just hangs without any errors
@enough-analyst-54434 Looks like the same command runs without that version error on a Linux machine. Do you know why we get the version error only on macOS?
Attaching verbose logs from linux and macos
e
Looking at the logs here presently, but, because of https://www.pantsbuild.org/docs/awslambda-python "Running from macOS and failing to build?", you'd expect certain lambdas to build on Linux but not on Mac. If just one required distribution is only available as an sdist, you won't be able to build that lambda on Mac.
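The Linux-versus-macOS difference follows from how a cross-platform resolve works: when building for a foreign platform such as linux_x86_64-cp-38-cp38, only prebuilt wheels can be used, so a dependency published solely as an sdist has no usable version. Here is a simplified Python sketch of that check. The file names are made up for illustration, the helper names are hypothetical, and the tag matching is a rough approximation of the real wheel compatibility rules, not what pip or Pex actually run.

```python
from typing import Iterable, Optional, Set, Tuple

Tags = Tuple[Set[str], Set[str], Set[str]]


def wheel_tags(filename: str) -> Optional[Tags]:
    """Return (python, abi, platform) tag sets for a wheel, or None for anything else."""
    if not filename.endswith(".whl"):
        return None  # e.g. an sdist such as name-1.0.tar.gz
    # Wheel names look like: name-version[-build]-python-abi-platform.whl
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        return None
    python, abi, platform = parts[-3:]
    # Each tag may be a dot-separated set, e.g. "py2.py3".
    return set(python.split(".")), set(abi.split(".")), set(platform.split("."))


def has_usable_wheel(
    filenames: Iterable[str],
    pythons: Set[str],
    abis: Set[str],
    platforms: Set[str],
) -> bool:
    """True if any file is a wheel whose tags overlap the target's accepted tags."""
    for name in filenames:
        tags = wheel_tags(name)
        if tags is None:
            continue
        python, abi, platform = tags
        if python & pythons and abi & abis and platform & platforms:
            return True
    return False


# Assumed, non-exhaustive tag sets for CPython 3.8 on manylinux2014 x86_64.
PYTHONS = {"cp38", "py38", "py3"}
ABIS = {"cp38", "abi3", "none"}
PLATFORMS = {"manylinux2014_x86_64", "linux_x86_64", "any"}

if __name__ == "__main__":
    sdist_only = ["example_pkg-2.0.1.tar.gz"]  # hypothetical release files
    with_wheel = sdist_only + ["example_pkg-2.0.1-py3-none-any.whl"]
    print(has_usable_wheel(sdist_only, PYTHONS, ABIS, PLATFORMS))
    print(has_usable_wheel(with_wheel, PYTHONS, ABIS, PLATFORMS))
```

On a Linux machine that matches the target platform the sdist can be built locally, which would explain why the same lambda builds there.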
Aha, this is identical to https://github.com/pantsbuild/pex/issues/1530 and https://github.com/pantsbuild/pex/discussions/1529. Thanks for filing those. It's often better for working out a problem and having a nice searchable record of the solution.
👍 1