# general
a
Hello friends, I have a niche problem. I am shipping a pex to AWS Lambda, and I am including some dependencies, notably
pandas
in a lambda layer. I've reproduced my issue locally by running the aws lambda docker images with my layer and pex unzipped to
/opt/
and
/var/task
respectively. I have the following code in my pex:
# src/dz/anomaly_flagger/handler.py
import pandas as pd

def handle(event, context):
    df = pd.DataFrame()
    print('yo', df.shape)
    print(df)
    print('whoa')
If I run the handler directly, everything executes:
START RequestId: 454cf516-811d-460f-93c2-01e5b63159d4 Version: $LATEST
yo (0, 0)
Empty DataFrame
Columns: []
Index: []
whoa
END RequestId: 454cf516-811d-460f-93c2-01e5b63159d4
BUT if I run the handler bootstrapped with pex (ie invoke
__pex__.src.dz.anomaly_flagger/handler.py
), then printing the dataframe causes pex to start an interactive console.
START RequestId: 831783bd-dc6a-43d0-8a41-a07594a55244 Version: $LATEST
yo (0, 0)
yo (0, 0)
>>> Python 3.9.16 (main, Dec 24 2022, 07:02:54)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

now exiting InteractiveConsole...
21 Apr 2023 09:40:51,423 [WARNING] (rapid) First fatal error stored in appctx: Runtime.ExitError
21 Apr 2023 09:40:51,423 [WARNING] (rapid) Process 14(bootstrap) exited: Runtime exited without providing a reason
END RequestId: 831783bd-dc6a-43d0-8a41-a07594a55244
I've got a trace which I'll include in 🧵 but I'm unsure how to interpret what's happening. My guess is that the bootstrapped process is failing for some weird library-related reason, and the console is a fallback behaviour, but I can't understand why printing a dataframe would cause a hiccup.
b
A few questions: does this behaviour require the two separate install locations? How are you installing/unzipping the PEXes? Does it reproduce outside of the lambda container/runtime? Can you share a reproduction?
a
I can share my local container image repro, or a repro for AWS Lambda - the former is easier to twiddle with.
1. "does this behaviour require the two separate install locations": Can you clarify the question? The two install locations are a restriction of the environment I'm running under, yes, if that's the question. Essentially I can package up to 250MB of deps in total; most of them end up in /opt/ as a pre-built bundle and then a few bits are under /var/task. What's interesting to me is that pandas has resolved, because I can print(df.shape). Moreover, I can print a pandas.Series, I just can't print a pandas.DataFrame.
2. I'm not unzipping the pex myself, I'm depending on the fact that it's a zip, and AWS are unzipping it into a lambda vm at invocation time. Locally, in my container, I'm just running
unzip
.
3. I've not tried running the code outside of the container yet, I'll have a go.
Okay, I've uploaded a tar.gz here containing the layer.zip, and the pex file. Extract the tarball, and
docker build .
to create a reproduction container. To run the container without pex (prints an empty dataframe)
$ docker run -p 8080:8080 -it $IMAGE src.dz.anomaly_flagger.handler.handle
$ curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{"payload":"hello world!"}'
To run with pex, just append dunder pex to the module, ie
$ docker run -p 8080:8080 -it $IMAGE __pex__.src.dz.anomaly_flagger.handler.handle
$ curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{"payload":"hello world!"}'
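For anyone who would rather script the invocation than use curl, here is a rough Python equivalent of the POST above. It is only a sketch: the helper names `build_invocation` and `invoke` are made up for illustration, and it assumes the container from the `docker run` command is already listening on port 8080.

```python
import json
import urllib.request

# Same local endpoint that the curl command above posts to.
INVOKE_URL = "http://localhost:8080/2015-03-31/functions/function/invocations"


def build_invocation(payload, url=INVOKE_URL):
    """Build (but do not send) the POST request carrying the test event."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke(payload, url=INVOKE_URL, timeout=30):
    """Send the event to the running container and return the raw response text."""
    request = build_invocation(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage, with the container running:
#   print(invoke({"payload": "hello world!"}))
```

Splitting request construction from sending keeps the request easy to inspect without a running container.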
b
Can you clarify the question?
Sorry, I was meaning: can we isolate the behaviour? e.g. ignore the 250MB restriction for local reproduction: maybe the cause is some interaction between
/var/task
and
/opt
, and it'd be helpful to know that.
Okay, I've uploaded a tar.gz here containing the layer.zip, and the pex file.
Unfortunately it looks like that's been deleted
a
How irksome. Here's another. https://ufile.io/7tdc5k6a Should be hosted for "a maximum of 30 days", which is sort of non-committal. I'll try to create a repro without docker at some point, but it'll be a little funky, I think.
e
Sorry for the late notice here. @average-breakfast-91545 responding to the OP, did you see:
pex: Dropping awslambdaric.lambda_runtime_exception
pex: Dropping awslambdaric.lambda_runtime_marshaller
pex: Dropping awslambdaric.lambda_runtime_client
pex: Dropping awslambdaric.bootstrap
pex: Dropping awslambdaric.__main__
Basically it's almost certainly the case that you don't want the PEX to be its usual hermetic self - scrubbing all deps not stdlib and not in the PEX - which is what is happening in those log lines. You want to let the ambient
sys.path
leak into the PEX via `--inherit-path {fallback|prefer}` (at build time) or `PEX_INHERIT_PATH={fallback,prefer}` at runtime: https://pex.readthedocs.io/en/v2.1.134/api/vars.html#PEX_INHERIT_PATH
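A toy model of what those modes mean for import order may help. As the Pex docs describe it, `fallback` puts the inherited entries after the packaged dependencies and `prefer` puts them before. The function below is a hypothetical illustration of that ordering only, not Pex's actual implementation, and it ignores the stdlib entries that are always kept.

```python
def compose_sys_path(pex_entries, ambient_entries, inherit_path="false"):
    """Model the search order for each inherit-path mode (illustrative only)."""
    if inherit_path == "false":
        # Hermetic: ambient entries are scrubbed, only PEX contents remain.
        return list(pex_entries)
    if inherit_path == "fallback":
        # Ambient entries are searched only after the packaged dependencies.
        return list(pex_entries) + list(ambient_entries)
    if inherit_path == "prefer":
        # Ambient entries win over the packaged dependencies.
        return list(ambient_entries) + list(pex_entries)
    raise ValueError(f"unknown inherit_path mode: {inherit_path!r}")


# Example entries taken from the paths mentioned in this thread.
pex = ["/var/task/.deps/some_dep.whl"]
ambient = ["/opt/python", "/var/runtime"]
for mode in ("false", "fallback", "prefer"):
    print(mode, compose_sys_path(pex, ambient, mode))
```

In hermetic mode `/var/runtime` drops out entirely, which would fit the "Dropping awslambdaric..." log lines.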
a
No bother @enough-analyst-54434 - the help is appreciated. I did see that, but I'm unable to make sense of it. I've tried with PEX_INHERIT_PATH at runtime, that does yield slightly different behaviour, but the same end result:
pex: PYTHONPATH contains:
pex:   *
pex:     /var/task/.deps/pandas_stubs-1.4.3.220710-py3-none-any.whl
pex:     /var/task
pex:   * /opt/python/lib/python3.9/site-packages
pex:     /opt/python
pex:     /var/runtime
pex:   * /var/lang/lib/python39.zip
pex:     /var/lang/lib/python3.9
pex:     /var/lang/lib/python3.9/lib-dynload
pex:     /var/lang/lib/python3.9/site-packages
pex:     /var/task/.bootstrap
pex:   * - paths that do not exist or will be imported via zipimport
>>> Python 3.9.16 (main, Dec 24 2022, 07:02:54)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

now exiting InteractiveConsole...
24 Apr 2023 19:38:46,629 [WARNING] (rapid) First fatal error stored in appctx: Runtime.ExitError
24 Apr 2023 19:38:46,629 [WARNING] (rapid) Process 14(bootstrap) exited: Runtime exited without providing a reason
END RequestId: fb565395-6742-4c79-acb2-d5a43e648c96
Complete gist for both pex'd and non-pex'd runs here. What's interesting is that the first print statement
print("yo", df.shape)
executes twice, once on line 154 of the output, once on line 313. The pex file is being invoked by a script called bootstrap.py. My guess is that there's some conflict between the way pex and the lambda bootstrap are configuring the environment.
e
@average-breakfast-91545 have you tried the compatibility knobs for
pex_binary
? Namely
execution_mode="venv"
(https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode) and, if available in your Pants version
venv_site_packages_copies=True
(https://www.pantsbuild.org/docs/reference-pex_binary#codevenv_site_packages_copiescode)?
a
Let me go do some reading, and I'll do some knob twiddling tomorrow 🙂 Thanks again.
e
Ok, great - please report back. The venv mode will get rid of all the scrubbing and sys.path composition weirdnesses.
a
There's just a lot of niche things happening here - it's a lambda, with a pex file, with external deps in a layer, that I've built from source, so it's kinda hard to know where to start 😂
e
๐Ÿ‘ 1
a
If I build in venv execution mode, then I receive RuntimeImportErrors for packages in the layer, ie, those that are outside of the pex under /opt/python. This is true whether I set hermetic_scripts=False, or site_packages_copies=True, and with PEX_INHERIT_PATH=fallback or prefer.
e
@average-breakfast-91545 I'm being lazy here, but I'm also not an AWS Lambda user except to debug every few months. What exactly is the Lambda Layer base image I should use to repro? And, for completeness, the version of pandas you're using for the handler code you show above?
I guess I need more than the base image since you install pandas manually. I'd need as much of the whole kaboodle as you can provide.
a
It's a custom layer. I can make it public if you like? In the tarball I uploaded, it's present as
layer.zip
. The pandas version is 2.0, I think, but I can build another with a specific version.
Simplest repro is to use the tarball, build an image and test locally. That gives the same behaviour. I could neaten up and make a repository, it's just a little unwieldy to move the 70mb layer around.
e
Ah, ok - I missed the tarball link above.
a
Let me know if there's a better way to ship a repro. 70mb isn't too huge, but I don't really want to create a public s3 bucket: it twinges my conscience
e
It certainly is making me pause. The UI claims "shenanigans.tar.gz" and the download is not a tarball; I get a "shenanigans_tar_gz.exe"
a
The tarball is - perhaps inadvisably - shenanigans.tar.gz. I just downloaded it and did not receive an executable
so that's weird
e
$ file shenanigans_tar_gz.exe
shenanigans_tar_gz.exe: PE32 executable (GUI) Intel 80386, for MS Windows
Let me see what I can do to trick the site into understanding not to be "helpful".
a
I wouldn't run it either 😂
e
$ file shenanigans_tar_gz.exe
shenanigans_tar_gz.exe: PE32 executable (GUI) Intel 80386, for MS Windows
a
On extracting the tarball,
docker build .
then you can run the image
docker run -p8080:8080 -e PEX_INHERIT_PATH=fallback -e PEX_VERBOSE=10 -it ba38059ed45c963a16434227329a24374d161e9be38132a8a75f2a1946be9 __pex__.src.dz.anomaly_flagger.handler.handle
The lambda will wait for invocation, you can test it by POSTing to that port, eg.
curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{"payload":"hello world!"}'
Change the handler from
__pex__.
to
src.
and the function runs, otherwise, it tries to run a couple of times (which is weird in its own right) then dumps to an interactive console.
e
Here you go:
$ cat Dockerfile
FROM public.ecr.aws/lambda/python:3.9
RUN yum install unzip -y
ADD layer.zip /opt/
ADD flagger.pex /var/task/
RUN cd /opt && unzip layer.zip

# Let the lambda runtime see the PEX file as a `sys.path` entry.
ENV PYTHONPATH=/var/task/flagger.pex

# Let the PEX runtime see the /opt/python `sys.path` entry.
ENV PEX_EXTRA_SYS_PATH=/opt/python
That works.
a
bruh
e
So bi-directional reveal and you don't unzip the PEX.
a
That was speedy!
e
Most debugging is when it's not remote. Thanks for the tarball!
a
Okay, so the setup will be slightly different in AWS cos I don't have a docker image, but I'll try setting both those env vars and see if that fixes it.
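When checking whether those env vars took effect in real Lambda, a throwaway diagnostic handler can save some guessing. The one below is a sketch, not part of the original repro: `describe_environment` is a made-up helper, and it just reports the path-related variables discussed here and whether a module can be located without importing it.

```python
import importlib.util
import os
import sys


def describe_environment(module_names=("pandas",)):
    """Collect the path-related settings from this thread into a plain dict."""
    return {
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
        "PEX_EXTRA_SYS_PATH": os.environ.get("PEX_EXTRA_SYS_PATH"),
        "PEX_INHERIT_PATH": os.environ.get("PEX_INHERIT_PATH"),
        "sys_path": list(sys.path),
        # find_spec locates a top-level module without executing it.
        "importable": {
            name: importlib.util.find_spec(name) is not None
            for name in module_names
        },
    }


def handle(event, context):
    """Hypothetical Lambda handler that prints and returns the diagnostics."""
    info = describe_environment()
    print(info)
    return info
```

Running it once as `src...handle` and once as `__pex__.src...handle` would show how the two bootstraps differ in `sys.path`.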
e
Sounds good, let me know.
N.B.: I did not supply
-e PEX_INHERIT_PATH=fallback
- un-needed with that Dockerfile setup.
a
Okay, progress! When the pex is shipped to Lambda, it'll be unzipped whether I like it or not, but if I set pythonpath to
/var/task
with an unzipped pex, the code runs. I don't have a mental model for why the PYTHONPATH var is necessary here. Any clues for a bewildered and grateful user?
e
With the PEX as a single file, as I had it, PYTHONPATH adds that single file to the sys.path of the aws lambda bootstrap process. Since Python knows how to zipimport, that's good enough to get aws lambda to be able to
import __pex__.*
which is handled by an import hook installed by
__pex__/__init__.py
.
I suspect that with the unzipping aws lambda does, it's unzipping the PEX into a directory it has already placed on the
sys.path
and so no need to repeat yourself.
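The zipimport part of that explanation is easy to see in isolation. This is a small standalone sketch, unrelated to Pex's own import hook: it writes a one-module zip to a temp directory, puts the archive itself on `sys.path`, and imports from it. The module name `zipped_demo` and the helper `import_from_zip` are invented for the demo.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def import_from_zip(module_name, source):
    """Write `source` into a fresh zip, add the zip to sys.path, import it."""
    tmp_dir = tempfile.mkdtemp()
    archive = os.path.join(tmp_dir, "bundle.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(f"{module_name}.py", source)
    # The archive itself, not an unzipped directory, goes on sys.path.
    sys.path.insert(0, archive)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


module = import_from_zip("zipped_demo", "GREETING = 'hello from a zip'\n")
print(module.GREETING)
print(module.__file__)  # a path inside bundle.zip
```

Setting `PYTHONPATH=/var/task/flagger.pex` plays the same role as the `sys.path.insert` call here, only it happens before the interpreter starts.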
a
That's what I'd expect and yet, without the pythonpath, the issue persists.
e
That will take a curious mind to run down. It depends how many unanswered questions you can live with. I can live with that one unanswered for now personally.
a
I'm happy to live with mystery for the moment! I might come back to this once we start to look at building a usable plugin for layers and the serverless framework. Many thanks, John.
e
Sounds good. You're welcome.
@average-breakfast-91545 if you're interested in relaying your solution, Miikka looks to be in the same boat: https://github.com/pantsbuild/pants/discussions/18841
๐Ÿ‘ 1
a
Is it fair to say that the python_lambda_function is unofficially deprecated? The mood round these parts seems to be "just use zipapp". I'm drafting a reply to that discussion.
e
That's about right. I'm the Pex maintainer and Pex 3.x will drop support for the APIs
lambdex
uses and
python_lambda_function
is powered by lambdex. It may be the case that Pants ports
python_lambda_function
from lambdex to straight PEX though; so the output may change but target stay the same? I have no clue. I can just say Lambdex is a dead man walking.
a
I have every intention of helping you kill it 🙂
e
Thank you!