average-breakfast-91545
04/21/2023, 9:43 AM
pandas in a lambda layer. I've reproduced my issue locally by running the AWS Lambda Docker images with my layer and pex unzipped to /opt/ and /var/task respectively.
I have the following code in my pex:
# src/dz/anomaly_flagger/handler.py
import pandas as pd

def handle(event, context):
    df = pd.DataFrame()
    print('yo', df.shape)
    print(df)
    print('whoa')
If I run the handler directly, everything executes:
START RequestId: 454cf516-811d-460f-93c2-01e5b63159d4 Version: $LATEST
yo (0, 0)
Empty DataFrame
Columns: []
Index: []
whoa
END RequestId: 454cf516-811d-460f-93c2-01e5b63159d4
But if I run the handler bootstrapped with pex (i.e. invoke __pex__.src.dz.anomaly_flagger.handler.handle), then printing the dataframe causes pex to start an interactive console.
START RequestId: 831783bd-dc6a-43d0-8a41-a07594a55244 Version: $LATEST
yo (0, 0)
yo (0, 0)
>>> Python 3.9.16 (main, Dec 24 2022, 07:02:54)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
now exiting InteractiveConsole...
21 Apr 2023 09:40:51,423 [WARNING] (rapid) First fatal error stored in appctx: Runtime.ExitError
21 Apr 2023 09:40:51,423 [WARNING] (rapid) Process 14(bootstrap) exited: Runtime exited without providing a reason
END RequestId: 831783bd-dc6a-43d0-8a41-a07594a55244
I've got a trace which I'll include in 🧵 but I'm unsure how to interpret what's happening. My guess is that the bootstrapped process is failing for some weird library-related reason, and the console is a fallback behaviour, but I can't understand why printing a dataframe would cause a hiccup.
broad-processor-92400
04/21/2023, 11:59 AM
average-breakfast-91545
04/21/2023, 12:05 PM
unzip.
3. I've not tried running the code outside of the container yet, I'll have a go.
docker build . to create a reproduction container. To run the container without pex (prints an empty dataframe):
$ docker run -p 8080:8080 -it $IMAGE src.dz.anomaly_flagger.handler.handle
$ curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{"payload":"hello world!"}'
To run with pex, just prepend dunder pex to the module, i.e.
$ docker run -p 8080:8080 -it $IMAGE __pex__.src.dz.anomaly_flagger.handler.handle
$ curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{"payload":"hello world!"}'
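For context on what the handler string in those commands means: the Lambda runtime treats everything before the last dot as a module to import and the last segment as the function to call, so prefixing `__pex__.` makes it import the `__pex__` package first. A simplified sketch of that lookup, not the actual awslambdaric code (`split_handler` and `resolve_handler` are hypothetical names used only for illustration):

```python
import importlib


def split_handler(handler):
    """Split 'pkg.module.function' into ('pkg.module', 'function')."""
    module_name, _, function_name = handler.rpartition(".")
    if not module_name or not function_name:
        raise ValueError(f"handler must look like 'module.function': {handler!r}")
    return module_name, function_name


def resolve_handler(handler):
    """Import the module part and return the named attribute from it."""
    module_name, function_name = split_handler(handler)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# The pex'd handler string from the commands above: the module part starts
# with __pex__, so that package has to be importable for the lookup to work.
print(split_handler("__pex__.src.dz.anomaly_flagger.handler.handle"))

# A stdlib function stands in for a real handler so the sketch runs anywhere.
print(resolve_handler("json.dumps")({"payload": "hello world!"}))
```

Under this model, whether the lookup succeeds depends entirely on what is importable from `sys.path` at the moment the runtime resolves the string.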
broad-processor-92400
04/22/2023, 12:24 AM
"Can you clarify the question?"
Sorry, I meant: can we isolate the behaviour? E.g. ignore the 250MB restriction for local reproduction: maybe the cause is some interaction between /var/task and /opt, and it'd be helpful to know that.
"Okay, I've uploaded a tar.gz here containing the layer.zip, and the pex file."
Unfortunately it looks like that's been deleted.
average-breakfast-91545
04/24/2023, 10:50 AM
enough-analyst-54434
04/24/2023, 6:16 PM
pex: Dropping awslambdaric.lambda_runtime_exception
pex: Dropping awslambdaric.lambda_runtime_marshaller
pex: Dropping awslambdaric.lambda_runtime_client
pex: Dropping awslambdaric.bootstrap
pex: Dropping awslambdaric.__main__
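Those "Dropping" lines show Pex discarding modules that came from outside the PEX. A toy model of the `sys.path` side of that scrubbing and of the `fallback` and `prefer` inherit modes may help when reading the trace. This is a sketch under stated assumptions, not Pex's implementation: `effective_sys_path` is a hypothetical function, the example paths are taken from this thread's logs, and putting the stdlib entries first is an assumption.

```python
def effective_sys_path(stdlib, ambient, pex, inherit_path="false"):
    """Toy model of how a PEX could order sys.path for each inherit mode.

    stdlib  -- standard library entries (always kept)
    ambient -- non-stdlib entries already on sys.path (layers, site-packages)
    pex     -- entries contributed by the PEX itself
    """
    if inherit_path == "false":
        # Hermetic default: ambient entries are scrubbed entirely.
        return stdlib + pex
    if inherit_path == "prefer":
        # Ambient entries are searched before the PEX's own entries.
        return stdlib + ambient + pex
    if inherit_path == "fallback":
        # Ambient entries are searched only after the PEX's own entries.
        return stdlib + pex + ambient
    raise ValueError(f"unknown inherit_path mode: {inherit_path!r}")


stdlib = ["/var/lang/lib/python3.9"]
ambient = ["/opt/python", "/var/runtime"]
pex = ["/var/task"]

for mode in ("false", "prefer", "fallback"):
    print(mode, effective_sys_path(stdlib, ambient, pex, mode))
```

In the hermetic case neither `/opt/python` (the layer) nor `/var/runtime` (where the Lambda runtime's own modules live) survives, which is consistent with the runtime's `awslambdaric` modules being dropped.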
Basically it's almost certainly the case that you don't want the PEX to be its usual hermetic self (scrubbing all deps that are not stdlib and not in the PEX), which is what is happening in those log lines. You want to let the ambient sys.path leak into the PEX via `--inherit-path {fallback|prefer}` (at build time) or `PEX_INHERIT_PATH={fallback,prefer}` at runtime: https://pex.readthedocs.io/en/v2.1.134/api/vars.html#PEX_INHERIT_PATH
average-breakfast-91545
04/24/2023, 8:39 PM
pex: PYTHONPATH contains:
pex: *
pex: /var/task/.deps/pandas_stubs-1.4.3.220710-py3-none-any.whl
pex: /var/task
pex: * /opt/python/lib/python3.9/site-packages
pex: /opt/python
pex: /var/runtime
pex: * /var/lang/lib/python39.zip
pex: /var/lang/lib/python3.9
pex: /var/lang/lib/python3.9/lib-dynload
pex: /var/lang/lib/python3.9/site-packages
pex: /var/task/.bootstrap
pex: * - paths that do not exist or will be imported via zipimport
>>> Python 3.9.16 (main, Dec 24 2022, 07:02:54)
[GCC 7.3.1 20180712 (Red Hat 7.3.1-15)] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
now exiting InteractiveConsole...
24 Apr 2023 19:38:46,629 [WARNING] (rapid) First fatal error stored in appctx: Runtime.ExitError
24 Apr 2023 19:38:46,629 [WARNING] (rapid) Process 14(bootstrap) exited: Runtime exited without providing a reason
END RequestId: fb565395-6742-4c79-acb2-d5a43e648c96
Complete gist for both pex'd and non-pex'd runs here.
What's interesting is that the first print statement, print("yo", df.shape), executes twice: once on line 154 of the output and once on line 313. The pex file is being invoked by a script called bootstrap.py. My guess is that there's some conflict between the way pex and the lambda bootstrap are configuring the environment.
enough-analyst-54434
04/24/2023, 8:46 PM
pex_binary? Namely execution_mode="venv" (https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode) and, if available in your Pants version, venv_site_packages_copies=True (https://www.pantsbuild.org/docs/reference-pex_binary#codevenv_site_packages_copiescode)?
average-breakfast-91545
04/24/2023, 8:46 PM
enough-analyst-54434
04/24/2023, 8:47 PM
average-breakfast-91545
04/24/2023, 8:47 PM
enough-analyst-54434
04/24/2023, 8:47 PM
venv_hermetic_scripts=False if available (https://www.pantsbuild.org/v2.16/docs/reference-pex_binary#codevenv_hermetic_scriptscode).
average-breakfast-91545
04/25/2023, 12:09 PM
enough-analyst-54434
04/25/2023, 2:59 PM
average-breakfast-91545
04/25/2023, 3:03 PM
layer.zip. The pandas version is 2.0, I think, but I can build another with a specific version.
enough-analyst-54434
04/25/2023, 3:06 PM
average-breakfast-91545
04/25/2023, 3:08 PM
enough-analyst-54434
04/25/2023, 3:10 PM
average-breakfast-91545
04/25/2023, 3:11 PM
enough-analyst-54434
04/25/2023, 3:11 PM
$ file shenanigans_tar_gz.exe
shenanigans_tar_gz.exe: PE32 executable (GUI) Intel 80386, for MS Windows
Let me see what I can do to trick the site into understanding not to be "helpful".
average-breakfast-91545
04/25/2023, 3:11 PM
enough-analyst-54434
04/25/2023, 3:11 PM
$ file shenanigans_tar_gz.exe
shenanigans_tar_gz.exe: PE32 executable (GUI) Intel 80386, for MS Windows
average-breakfast-91545
04/25/2023, 3:15 PM
docker build . then you can run the image:
docker run -p8080:8080 -e PEX_INHERIT_PATH=fallback -e PEX_VERBOSE=10 -it ba38059ed45c963a16434227329a24374d161e9be38132a8a75f2a1946be9 __pex__.src.dz.anomaly_flagger.handler.handle
The lambda will wait for invocation; you can test it by POSTing to that port, e.g.
curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{"payload":"hello world!"}'
Change the handler from __pex__. to src. and the function runs; otherwise, it tries to run a couple of times (which is weird in its own right) then dumps to an interactive console.
enough-analyst-54434
04/25/2023, 3:36 PM
$ cat Dockerfile
FROM public.ecr.aws/lambda/python:3.9
RUN yum install unzip -y
ADD layer.zip /opt/
ADD flagger.pex /var/task/
RUN cd /opt && unzip layer.zip
# Let the lambda runtime see the PEX file as a `sys.path` entry.
ENV PYTHONPATH=/var/task/flagger.pex
# Let the PEX runtime see the /opt/python `sys.path` entry.
ENV PEX_EXTRA_SYS_PATH=/opt/python
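The PYTHONPATH line in that Dockerfile relies on a standard Python feature: a zip archive placed on `sys.path` is importable via zipimport, so putting the PEX file there lets the runtime import packages stored inside it. A minimal sketch with a stand-in archive (`demo.zip`, `demo_pkg` and `build_demo_zip` are made-up names for illustration and are not part of Pex):

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_zip(directory):
    """Write a zip holding a single package, standing in for a PEX file."""
    zip_path = os.path.join(directory, "demo.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("demo_pkg/__init__.py", "GREETING = 'imported from zip'\n")
    return zip_path


workdir = tempfile.mkdtemp()
zip_path = build_demo_zip(workdir)

# Equivalent in spirit to ENV PYTHONPATH=/var/task/flagger.pex: the archive
# itself, not an unpacked directory, becomes a sys.path entry.
sys.path.insert(0, zip_path)
importlib.invalidate_caches()

demo_pkg = importlib.import_module("demo_pkg")
print(demo_pkg.GREETING)
print(demo_pkg.__file__)
```

The module's `__file__` points inside the archive, which shows the import was served from the zip rather than from a directory on disk.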
That Dockerfile works.
average-breakfast-91545
04/25/2023, 3:36 PM
enough-analyst-54434
04/25/2023, 3:36 PM
average-breakfast-91545
04/25/2023, 3:36 PM
enough-analyst-54434
04/25/2023, 3:36 PM
average-breakfast-91545
04/25/2023, 3:38 PM
enough-analyst-54434
04/25/2023, 3:38 PM
-e PEX_INHERIT_PATH=fallback - un-needed with that Dockerfile setup.
average-breakfast-91545
04/25/2023, 5:53 PM
/var/task with an unzipped pex, the code runs.
I don't have a mental model for why the PYTHONPATH var is necessary here. Any clues for a bewildered and grateful user?
enough-analyst-54434
04/25/2023, 5:57 PM
import __pex__.* which is handled by an import hook installed by __pex__/__init__.py.
sys.path and so no need to repeat yourself.
average-breakfast-91545
04/25/2023, 5:59 PM
enough-analyst-54434
04/25/2023, 5:59 PM
average-breakfast-91545
04/25/2023, 6:00 PM
enough-analyst-54434
04/25/2023, 6:01 PM
average-breakfast-91545
04/27/2023, 4:35 PM
enough-analyst-54434
04/27/2023, 4:43 PM
lambdex uses and python_lambda_function is powered by lambdex. It may be the case that Pants ports python_lambda_function from lambdex to straight PEX though; so the output may change but the target stays the same? I have no clue. I can just say Lambdex is a dead man walking.
average-breakfast-91545
04/27/2023, 4:43 PM
enough-analyst-54434
04/27/2023, 4:43 PM