# general
d
Having a bit of a tricky issue with pants and Google Cloud Functions. It appears that Google Cloud Functions come with some packages pre-installed: https://cloud.google.com/functions/docs/writing/specifying-dependencies-python#pre-installed_packages Previously, when manually deploying cloud functions (see comparison below), I’ve been able to override the version of Flask from
Flask==2.0.2
to something newer like
Flask==2.3.2
. When deploying via pants I’m unable to alter this dependency.

Why this matters: the rest of my monorepo shared libs have some dependency issues with older versions of flask and it would be a pain to downgrade.

Project setup

Inside
project/main.py
import flask
import scipy

def example_handler(request):
    print('flask version', flask.__version__)
    print('scipy version', scipy.__version__)
    return f'scipy: {scipy.__version__}, flask {flask.__version__}'
inside
project/BUILD
python_sources()

python_google_cloud_function(
    name="cloud_function",
    runtime="python311",
    handler="main.py:example_handler",
    type="event",
)

python_requirements(
    name="reqs",
)
project/requirements.txt
Flask==2.3.2
scipy==1.10.1
Deploying with pants

If I deploy this project using pants:
pants package project/main.py
This gives me the debug output (confirming it picked up the right versions):
6.99s Building project/cloud_function.zip with 2 requirements: Flask==2.3.2, scipy==1.10.1
gcloud storage cp dist/project/cloud_function.zip gs://my-bucket
gcloud functions deploy example --region=europe-west2 --source=gs://my-bucket/cloud_function.zip --entry-point=handler --trigger-http --runtime python311
Calling this function gives me:
scipy: 1.10.1, flask 2.0.2
So pants/pex has ignored my
requirements.txt
pinned Flask version and preferred to use the pre-installed Google Cloud one.

Deploying with regular gcloud

If I delete the previous deployment, then cd into
project
and run the following:
gcloud functions deploy example --source=. --entry-point=example_handler --trigger-http --runtime=python311 --region=europe-west2
Calling this function now gives:
scipy: 1.10.1, flask 2.3.2
Respecting the pinned packages in my
requirements.txt
which confirms it IS possible to override the pre-installed version of Flask.

Any idea here how I can get gcloud to respect my explicit Flask version?
r
Not that I have a solution for this problem, but I think you can use
pex_binary
directly since that's what is being recommended for
lambdas
on AWS. I would imagine the cloud function works the same way, since pants itself uses lambdex to ship both. Please read this message and the rest of the thread: https://pantsbuild.slack.com/archives/C046T6T9U/p1670350622365489?thread_ts=1669819299.689539&cid=C046T6T9U
d
I’m a tad lost. I was under the impression that under the hood the
python_google_cloud_function
target was first creating a
pex_binary
and then converting that into a zip file using
lambdex
? You’re suggesting I just make a
pex_binary
and then make a zip file manually with
lambdex
instead?
r
No, you just directly use
pex_binary
which is a pex file.
d
Understood, what’s the process of then deploying that binary as a google cloud function? Previously I’ve only ever used zip files. If I do something like
gcloud functions deploy example --source=gs://my-bucket/pex_binary.pex
I get
ERROR: (gcloud.functions.deploy) OperationError: code=3, message=Build failed: missing main.py and GOOGLE_FUNCTION_SOURCE not specified. Either create the function in main.py or specify GOOGLE_FUNCTION_SOURCE to point to the file that contains the function; Error ID: 5c04ec9c
And I’m not sure how I’d explicitly pass the correct entry point here. If I add
pex_binary(name="pex_binary", layout="loose", entry_point="main.py")
to my
BUILD
it gives me a directory structure like
dist
└── project
    └── pex_binary.pex
        ├── PEX-INFO
        ├── __main__.py
        ├── __pex__
        │   └── __init__.py
        └── project
            └── main.py
Which means I can pass the
--source
dir properly (without it complaining it’s not a directory)
gcloud functions deploy example --source=dist/project/pex_binary.pex --trigger-http --runtime python311 --region=europe-west2  --entry-point=project.main:handler
But it still gives me the same error.
b
Switching to pex_binary is probably a distraction from the core problem, and I don’t think it is necessary here. I think the only way to use a different version of a dependency for a specific target is to put it in a separate “resolve”. You may need to
parametrize
other targets (like any shared
python_sources
or
python_requirements
(other than flask)) by the two resolves, or maybe not, if the GCF code is fully isolated. I’m on my phone so I can’t dig out doc links, but maybe those keywords are enough to get started. Feel free to ask more!
This is all using pants native GCF support, which packages dependencies into the GCF zip, rather than letting GCF install them from a requirements.txt in the zip. An alternative would be to disable pants behaviour here (
include_requirements=False
on the GCF target), and convince it to include a requirements.txt in the zip somehow, as a
resource
. I’m not sure how the latter might be possible. It also means losing some of pants’ powers around dep management (like lockfiles and inference), but that might be okay.
d
I think the only way to use a different version of a dependency for a
specific target is to put it in a separate “resolve”. You may need to
parametrize
other targets (like any shared
python_sources
or
python_requirements
(other than flask)) by the two resolves, or maybe not; if the GCF code is fully isolated.
Some extra info here would be appreciated. Not too sure what this would involve. I have an example repo here with all the code: https://github.com/Jackevansevo/pants-google-cloud-function-sandbox

I think what might work here is a
requirements.txt
file with just the Flask entry inside it, because the rest of the deps are vendored into the pex file, right? So I might explore this option.
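A minimal sketch of that idea, assuming the goal is a second requirements file that contains only the pins for pre-installed packages (here just Flask). The helper name `filter_requirements` is hypothetical and not part of pants or gcloud; it also does a plain case-insensitive name match and does not normalise `-`, `_` and `.` in project names.

```python
import re


def filter_requirements(requirements_text, keep):
    """Return only the requirement lines whose project name is in `keep`."""
    wanted = {name.lower() for name in keep}
    kept = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        # The project name is everything before the first version specifier,
        # extras bracket, environment marker, or whitespace.
        name = re.split(r"[=<>!~\[;\s]", stripped, maxsplit=1)[0]
        if name.lower() in wanted:
            kept.append(stripped)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # The two pins from project/requirements.txt in this thread.
    full = "Flask==2.3.2\nscipy==1.10.1\n"
    print(filter_requirements(full, ["flask"]), end="")
```

The output could then be written to a separate file for GCF to install, while pants keeps vendoring everything from the full requirements.txt.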
I just tried manually including a
requirements.txt
file into the base of the zip file and this appears to work (the Flask version gets bumped to
2.3.0
). If I include:
python_sources(dependencies=[":req_file"])
resource(source="requirements.txt", name="req_file")
This adds the requirements.txt file to the zip at
project/requirements.txt
rather than at the root. I wonder if there's a mechanism to get pants to include this resource in the parent folder instead.
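Until there is a pants-native way to do this, one possible workaround is a small post-processing step that appends the file to the root of the zip that pants already built. This is only a sketch using Python's standard `zipfile` module; the function name `add_file_at_zip_root` is hypothetical, and the paths in the usage comment are assumed from the commands earlier in the thread.

```python
import zipfile
from pathlib import Path


def add_file_at_zip_root(zip_path, file_path, arcname=None):
    """Append `file_path` to the archive at `zip_path`, stored at the archive root."""
    file_path = Path(file_path)
    # Default to the bare file name so the entry has no directory prefix.
    arcname = arcname or file_path.name
    with zipfile.ZipFile(zip_path, "a", compression=zipfile.ZIP_DEFLATED) as zf:
        if arcname in zf.namelist():
            raise ValueError(f"{arcname} already exists in {zip_path}")
        zf.write(file_path, arcname=arcname)


# Assumed usage, run after `pants package` and before uploading:
# add_file_at_zip_root("dist/project/cloud_function.zip", "project/requirements.txt")
```

This mirrors the manual step described above (dropping requirements.txt into the base of the zip) and leaves the rest of the archive untouched.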
b
Ah, hmmm, I might've misread your original comment. I thought you had most of your code trying to use Flask==2.0.2, with one GCF package overriding to 2.3... but it seems like you're happy with 2.3 throughout the whole repo! Sorry about that: it means my comment about resolves and parametrization is irrelevant.
You're right about vendoring dependencies by default... it's weird to me that GCF is ignoring the explicitly vendored dependency in favour of the globally installed one, and it takes a
requirements.txt
to actually get it to override it. There's a chance this may be a result of the Lambdex/PEX packaging: as you may've noticed from exploring the zip file, the layout isn't the simple "main.py" with adjacent dependencies. On start-up, the dependencies are first unzipped from
.deps
into a temporary folder, and then your actual handler code is executed with the sys.path modified to point to them. And, the PEX project does a lot of work to try to make this execution hermetic (i.e. independent of any existing packages etc.), but maybe that's not working here?

So... some options here:
1. Try 2.17.0rc1, which switches away from Lambdex to the simpler layout, and no dynamic start-up (this is getting into experimental territory: I think my testing on AWS Lambda has been the only testing; I haven't tried it on GCF).
2. Debug this interaction between Lambdex/PEX and GCF, e.g. set the
PEX_VERBOSE=3
environment variable and look at the start-up logging...
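Alongside the start-up logging, it may help to have the function itself report where Flask was imported from and what sys.path looks like at call time. The sketch below is a hypothetical diagnostic variant of the handler from earlier in the thread; `describe_package` is an illustrative helper, not part of pants, PEX, or GCF. Note that `importlib.metadata.version` reports the first matching distribution found on sys.path, so the imported file location is included as well.

```python
import importlib
import importlib.metadata
import sys


def describe_package(module_name, dist_name=None):
    """Report where a module was imported from and which version is visible."""
    module = importlib.import_module(module_name)
    try:
        version = importlib.metadata.version(dist_name or module_name)
    except importlib.metadata.PackageNotFoundError:
        # Fall back to the module attribute if no distribution metadata exists.
        version = getattr(module, "__version__", "unknown")
    return {
        "module": module_name,
        "version": version,
        "file": getattr(module, "__file__", None),
    }


def example_handler(request):
    # Hypothetical diagnostic handler: shows which Flask won and the search order.
    info = describe_package("flask", "Flask")
    lines = [f"{info['module']} {info['version']} from {info['file']}"]
    lines += [f"sys.path[{i}] = {p}" for i, p in enumerate(sys.path)]
    return "\n".join(lines)
```

If the reported file lives in the runtime's global site-packages rather than the unpacked dependency folder, that would suggest the pre-installed Flask is being found before the vendored one.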