Hi. I've got a question regarding accessing binarie...
# general
a
Hi. I've got a question regarding accessing binaries installed in a venv from outside the PEX. We're distributing Docker images, built using Pants, with .pex files inside, and deploying them on a k8s cluster. Recently we wanted to start using Celery, but we have to figure out a way to do liveness checks in Kubernetes. In order to do that, we have to invoke a celery command, namely `celery -A path.to.module inspect ping`. The problem here is that `celery` is part of the .pex file, and it is unclear how to find it and execute it. One way to just find the binary would be to package the pexes with `execution_mode="venv"`, set `PEX_ROOT` to a known location and then use `source $PEX_ROOT/venvs/s/*/venv/bin/activate`. This way I can reach the binary. But the problem remains, because I can't reach the module I need (it is in user code). I suppose I'd also need to set up `PYTHONPATH`. Another way to reach the binary would be to just install `celery` itself separately in the system Python distribution, but I would still need to reach the user code. Both solutions seem like a lot of hackery and I'm not even sure it's possible to pull either of them off. Is there an easier way to achieve this, i.e. accessing venv binaries without running the pex file itself? Or maybe you can override the entry point when running the venv? Or maybe there is a better way to package Docker images? Any help would be much appreciated.
Okay, I think there is actually a very simple solution to this: since the `script` in `pex_binary` is already set to `celery`, I can just run the pex itself with two different sets of arguments. However, the problem would remain if I needed to run two different binaries, e.g. setting `script` to `celery` and needing to run anything else (I can't think of a use case right now, but this could be anything that executes as a command, like `uwsgi` or `gunicorn`).
e
If you follow this example: https://pex.readthedocs.io/en/v2.1.107/recipes.html#pex-app-in-a-container you get a venv at `/myapp`. Running `/myapp/pex` runs the PEX entry point, but you can also run any of the `/myapp/bin/<console script>` scripts the distributions in your PEX provide.
If you don't want to type the full paths, just add one more line to the Dockerfile that adds `/myapp/bin` to `PATH`.
You can further tweak the generic Pex recipe above in the case of Pants as outlined here to get your container builds happening faster: https://blog.pantsbuild.org/optimizing-python-docker-deploys-using-pants/
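With the venv in place, a Kubernetes exec liveness probe could call the console script directly. Here is a minimal Python sketch of such a probe, assuming the venv lives at `/myapp` as in the recipe, the Celery app is at `path.to.module`, and `celery inspect ping` exits non-zero when no worker replies. The helper names `build_ping_command` and `worker_is_alive` and the 10-second timeout are illustrative, not part of Pex or Celery.

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import PurePosixPath


def build_ping_command(venv_dir: str, app_module: str) -> list[str]:
    """Build the argv for `celery -A <app> inspect ping` using the venv's console script."""
    celery = PurePosixPath(venv_dir) / "bin" / "celery"
    return [str(celery), "-A", app_module, "inspect", "ping"]


def worker_is_alive(command: list[str], timeout: float = 10.0) -> bool:
    """Run the command; exit code 0 counts as alive, anything else as dead."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary or a hung command both count as "not alive".
        return False
    return result.returncode == 0


def main() -> int:
    # Assumed locations; adjust to your image layout.
    command = build_ping_command("/myapp", "path.to.module")
    return 0 if worker_is_alive(command) else 1
```

A probe script would finish with `sys.exit(main())`, so the container exits 0 when the worker answers and 1 otherwise.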
a
That's fantastic, looks exactly like what I was looking for. Thank you so much 🙂
e
For those times when you can't pre-install a venv and you're stuck with a PEX file, you might be able to use conscript instead as your entry point to get a `busybox`-alike: https://pypi.org/project/conscript/
a
This works well, thank you. I have one more question regarding the blog post. I tried to follow it to the very last configuration (deps and srcs split into 2 pexes). One minor difference in my setup is that I'd like to have the `script` field set in the `srcs` pex. I end up with this error though:
pex.pex_builder.PEXBuilder.InvalidExecutableSpecification: Could not find script 'celery' in any distribution  within PEX!
That is reasonable: I excluded the requirements, but I know celery will be there in the final Docker image. If I leave the script field only in the `deps` pex, it gets overwritten and the entry point ends up being python. Is there any way around it, or do I have to resort to the first configuration, with a multi-stage build but a single pex?
e
Hrm, yeah, the split PEX optimization is a Pants-side hack and it doesn't cover all cases. I think you'd need to leave off the entry point & script for both `pex_binary` targets forming the split; then in the Docker image set `PEX_SCRIPT` (which corresponds to `pex_binary.script`) or `PEX_MODULE` (which corresponds to `pex_binary.entry_point`). I personally don't like this at all; I'd rather have a pex_binary target in my BUILD that I can test, knowing the final PEX in the Docker image will work the same way. A bit too many moving parts for my taste, but it should work.
For more on PEX runtime environment variable control just run `pex --help-variables` or see here: https://pex.readthedocs.io/en/v2.1.107/api/vars.html
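To illustrate the `PEX_SCRIPT` / `PEX_MODULE` approach, here is a minimal Python sketch that builds the environment for launching a PEX that has no entry point baked in. `PEX_SCRIPT` and `PEX_MODULE` are the variables named above; the helper `pex_runtime_env` is hypothetical, and rejecting the case where both are set is a choice made for this sketch, not something the thread states about Pex.

```python
from __future__ import annotations

from typing import Mapping


def pex_runtime_env(
    base: Mapping[str, str],
    script: str | None = None,
    module: str | None = None,
) -> dict[str, str]:
    """Return a copy of `base` with exactly one of PEX_SCRIPT / PEX_MODULE set."""
    if (script is None) == (module is None):
        raise ValueError("set exactly one of script or module")
    env = dict(base)  # copy so the caller's mapping is left untouched
    if script is not None:
        env["PEX_SCRIPT"] = script  # counterpart of pex_binary.script
    else:
        env["PEX_MODULE"] = module  # counterpart of pex_binary.entry_point
    return env
```

It could be used as `subprocess.run([pex_path, "-A", "path.to.module", "worker"], env=pex_runtime_env(os.environ, script="celery"))`, where `pex_path` is wherever your image keeps the PEX. In a Docker image the same effect would more likely come from setting the variable once in the Dockerfile.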
a
It sure does feel a bit hacky. Anyway, I think the standard solution will be good enough for us. Thank you very much for all your help
b
Hey! Sorry to resurrect such an old thread, but can I get some help on how to run celery workers? I'm trying to run the most simple celery tutorial: https://docs.celeryq.dev/en/stable/getting-started/first-steps-with-celery.html
# tasks.py
from celery import Celery

app = Celery('tasks', broker='pyamqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y
# BUILD
python_source(
    name="tasks",
    source="tasks.py",
    dependencies=[
        "third_party:requirements#celery",
    ],
)

pex_binary(
    name="celery_app",
    script="celery",
    dependencies=[":tasks"],
    layout="packed",
    execution_mode="venv",
)
I package with `pants package`, but when I call:
celery_app.pex/__main__.py -A tasks worker
I get:
Unable to load celery application.
The module tasks was not found.
What am I missing?
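Celery's `-A` option expects an importable module path, so one way to narrow this down is to check which dotted name `tasks.py` is actually importable under inside the PEX. The following is a small diagnostic sketch, not a confirmed fix. It assumes it is run with the PEX's own interpreter (for example with `PEX_INTERPRETER=1` set), and both `find_importable` and the candidate names you pass in are hypothetical.

```python
from __future__ import annotations

import importlib.util


def find_importable(candidates: list[str]) -> dict[str, str | None]:
    """Map each dotted module name to the file it resolves to, or None if not found."""
    found: dict[str, str | None] = {}
    for name in candidates:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # A missing parent package raises instead of returning None.
            spec = None
        found[name] = spec.origin if spec is not None else None
    return found
```

For example, `find_importable(["tasks", "myproject.tasks"])` would show which of those names, if any, resolves. Here `myproject.tasks` is only a placeholder for wherever `tasks.py` sits relative to its source root.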