# general
l
Has anyone run into issues with a DistributionNotFound error with the Google Cloud Functions integration? google-cloud-bigquery-storage uses `pkg_resources.get_distribution` to retrieve the version.
e
@little-train-28371 you'll probably get a better response with more info. Tool versions, command lines and full backtraces (redacted as required) are always a great start.
l
I’m using Pants 2.12.0 with Python 3.9. I have no issues packaging my application using `./pants package ::`; the issue comes when I try to deploy the .zip package to Cloud Functions.
This is the traceback in the Google Cloud console:
File "/layers/google.python.pip/pip/lib/python3.9/site-packages/functions_framework/__init__.py", line 288, in create_app
    spec.loader.exec_module(source_module)
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/workspace/main.py", line 54, in <module>
    __RUNNER = __EntryPoint.parse("run = %s" % __lambdex_entry_point).resolve()
  File "/workspace/.bootstrap/pex/vendor/_vendored/setuptools/pkg_resources/__init__.py", line 2481, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/workspace/generic_file_processing_function/main.py", line 9, in <module>
    from common.audit_service_client import BigqueryAuditServiceClient
  File "/workspace/common/audit_service_client.py", line 8, in <module>
    from google.cloud import bigquery
  File "/workspace/.deps/google_cloud_bigquery-3.3.3-py2.py3-none-any.whl/google/cloud/bigquery/__init__.py", line 35, in <module>
    from google.cloud.bigquery.client import Client
  File "/workspace/.deps/google_cloud_bigquery-3.3.3-py2.py3-none-any.whl/google/cloud/bigquery/client.py", line 59, in <module>
    from google.cloud.bigquery_storage_v1.services.big_query_read.client import (
  File "/workspace/.deps/google_cloud_bigquery_storage-2.16.0-py2.py3-none-any.whl/google/cloud/bigquery_storage_v1/__init__.py", line 21, in <module>
    __version__ = pkg_resources.get_distribution(
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pkg_resources/__init__.py", line 471, in get_distribution
    dist = get_provider(dist)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pkg_resources/__init__.py", line 347, in get_provider
    return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0]
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pkg_resources/__init__.py", line 891, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/pkg_resources/__init__.py", line 777, in resolve
    raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'google-cloud-bigquery-storage' distribution was not found and is required by the application
e
Thanks @little-train-28371. My guess is this is a namespace package issue. You might try using a `pex_binary` target instead of your current `python_google_cloud_function` target. For example:
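pex_binary(
    name=...same...
    # pex_binary takes a list of platform strings in its `platforms` field.
    platforms=["linux_x86_64-cp-39-cp39"],
    resolve_local_platforms=True,
    # Run from a venv instead of directly from the zip.
    execution_mode="venv",
    dependencies=[...same...],
)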
The key difference from your current `python_google_cloud_function` target is the `execution_mode="venv"` - that should get rid of namespace package issues, maybe. The only other difference is that instead of specifying your handler in Google Cloud as `handler` (as noted here: https://www.pantsbuild.org/docs/google-cloud-function-python#step-4-upload-to-google-cloud) you just specify the actual entrypoint in your code, but prefixed with `__pex__.`. That allows the PEX to install itself in a venv out on Google Cloud when the GCF infrastructure tries to import your handler.
Oh, it's right above: you're using Python 3.9. I'll edit the example.
l
Thanks, I’ll give this a try later. In the meantime, I’ve also raised a PR in googleapis to stop using pkg_resources to extract the version. Other packages have done a similar thing.
@enough-analyst-54434 only now resuming work on this. When you say to specify the entry point with `__pex__.`, is that in `pex_binary(entry_point="__pex__.myfunction.main:main")`? Or is that inside my `myfunction/main.py`, where I would do something like `__entry__.main = main`?
e
You actually don't need to specify any `pex_binary` entry point. I'm just referring to the AWS lambda configuration out on AWS. It's there you need to tell AWS to load `__pex__.yourpackage.yourmodule.yourfunction`.
l
Oh, got it. In my case it's GCF (gen2), and it doesn’t seem to like it:
ImportError: Error while finding loader for '__pex__.generic_file_processing_function.main.generic_file_processing' (<class 'ModuleNotFoundError'>: No module named '__pex__')
e
Ah, sorry, I missed the GCF part. Yeah, since GCF unconditionally looks for main.py, you must include a main.py in your pex. In that case you use the `__pex__` magic as a bare import before any other 3rdparty import, like so:
$ cat src/main.py
import __pex__
import functions_framework
import cowsay


@functions_framework.http
def my_http_function(request):
    return cowsay.main.get_output_string("tux", "Hello Pex!")

$ pex --python python3.10 "functions-framework==3.*" cowsay==5.0 -D src -o gcf-example.zip

$ curl -D- https://gcf-example-yktwqotlra-wl.a.run.app
HTTP/2 200
content-type: text/html; charset=utf-8
x-cloud-trace-context: 949a3c86088d4c760779eb166538b455;o=1
date: Fri, 25 Nov 2022 15:33:12 GMT
server: Google Frontend
content-length: 268
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

  __________
| Hello Pex! |
  ==========
               \
                \
                 \
                  .--.
                 |o_o |
                 |:_/ |
                //   \ \
               (|     | )
              /'\_   _/`\
              \___)=(___/
l
haven’t tried this, but since .pex is a zip, would it work if I inject the main.py in the .pex?
e
Yes.
It doesn't matter how you get main.py in there; it just must be there, due to GCF rules being more stringent than AWS Lambda rules.
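One way to do that injection, sketched with the standard-library `zipfile` module. It assumes the archive has no top-level `main.py` yet; the function name `inject_main`, the archive name, and the handler module in the usage comment are hypothetical.

```python
import zipfile


def inject_main(archive_path: str, main_source: str) -> None:
    """Append a top-level main.py to an existing zip archive (for example a .pex)."""
    # Mode "a" keeps the existing entries and adds new ones.
    with zipfile.ZipFile(archive_path, mode="a") as zf:
        if "main.py" in zf.namelist():
            raise ValueError(f"{archive_path} already contains a main.py")
        zf.writestr("main.py", main_source)


# Hypothetical usage: the bare __pex__ import comes first, then the handler.
# inject_main(
#     "dist/my_function.pex",
#     "import __pex__\nfrom mypackage.handlers import my_http_function\n",
# )
```

Refusing to overwrite an existing `main.py` avoids ending up with two entries of the same name in the archive.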
l
gotcha, thanks!
Was not sure where that `__pex__` package came from, and then I saw in another thread that this is a recent feature, so I upgraded to 2.14.0 (from 2.12.0) and now I see it in the .pex 🙂
Running into a different issue when the pex is deployed. Would you know what this may be due to?
File "/layers/google.python.pip/pip/bin/functions-framework", line 8, in <module>
    sys.exit(_cli())
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/functions_framework/_cli.py", line 43, in _cli
    create_server(app, debug).run(host, port)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/functions_framework/_http/__init__.py", line 38, in run
    http_server.run()
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/functions_framework/_http/flask.py", line 25, in run
    self.app.run(self.host, self.port, debug=self.debug, **self.options)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/flask/app.py", line 920, in run
    run_simple(t.cast(str, host), port, self, **options)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/werkzeug/serving.py", line 1010, in run_simple
    inner()
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/werkzeug/serving.py", line 950, in inner
    srv = make_server(
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/werkzeug/serving.py", line 782, in make_server
    return ThreadedWSGIServer(
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/werkzeug/serving.py", line 679, in __init__
    server_address = get_sockaddr(host, int(port), self.address_family)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/werkzeug/serving.py", line 626, in get_sockaddr
    res = socket.getaddrinfo(
  File "/layers/google.python.runtime/python/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
LookupError: unknown encoding: idna
e
Not really, but have you tried adding idna as a manual dependency? https://pypi.org/project/idna/ I'm not sure if something in your stack has it as an extra or what, really, but I'd start there.
l
I do see idna inside .deps. I'm seeing some references to doing a no-op import.
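The no-op import in such references is typically an eager import of the standard-library codec module `encodings.idna` (separate from the PyPI `idna` package), so that the module is already loaded when a socket call looks the codec up. A minimal sketch; whether it resolves the error on GCF was not confirmed in this thread.

```python
import codecs
import encodings.idna  # noqa: F401  (imported only for its side effect)

# Looking the codec up once caches it in the codec registry.
idna_codec = codecs.lookup("idna")

# The traceback fails inside socket.getaddrinfo when it needs this codec;
# encoding a host name here exercises the same lookup.
encoded = "example.com".encode("idna")
```

If tried, these lines would go near the top of main.py so they run before the server starts.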
e
Looking back up at your original post, this was all precipitated by a likely namespace package issue, and the difference using a raw PEX was going to be making it a `pex_binary(..., execution_mode="venv")`. I just want to make sure you've done that or will be doing that. If you do end up using that but run into issues, you may want to upgrade to Pants 2.15.x and also enable `venv_site_packages_copies=True`: https://www.pantsbuild.org/v2.15/docs/reference-pex_binary#codevenv_site_packages_copiescode
l
yeah, using `venv` mode
will try 2.15 now
e
To be clear, I don't think the idna issue will be helped by that, but who knows. Worth a shot.
l
Unfortunately no.