Hi there, what is the option `execution_mode = "ve...
# general
p
Hi there, what is the option `execution_mode = "venv"` used for when building a PEX file? Is it used to unzip the PEX file when it is executed? Thanks
h
Hi! It impacts how the PEX is executed, with different tradeoffs for performance. See https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode
👍 1
p
Hi Eric, I tried to use "execution_mode=unzip" to unzip the PEX when running it in docker. But the PEX file is unzipped to a random folder under "/tmp" which is not known beforehand.
The context is that I packed Airflow into a PEX file, and Airflow depends on "gunicorn". I have verified that the Airflow PEX file contains "gunicorn", which is perfect. However, the path of "gunicorn" inside the unzipped PEX file (a folder under /tmp) is not known beforehand, so when Airflow launches gunicorn as a subprocess, it errors out immediately. Is there a way to specify the target dir that the PEX file is unzipped into when it is executed? If so, the path to "gunicorn" would be known beforehand and could be added to the $PATH env var, so that Airflow can launch it as a subprocess.
h
How does airflow reference gunicorn? Does it assume it's installed as a script in the interpreter/virtualenv?
Using venv pex should work if so
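For reference, here is a minimal sketch of how `execution_mode` might be set on a Pants `pex_binary` target in a BUILD file. The target name, the entry point, and the `:airflow_reqs` dependency address are hypothetical placeholders and are not taken from the repo being discussed.

```python
# BUILD (hypothetical sketch: the target name, entry point and
# dependency address below are assumptions made for illustration)
pex_binary(
    name="airflow",
    entry_point="airflow.__main__:main",
    dependencies=[":airflow_reqs"],
    # "venv" asks the PEX to set itself up as a virtualenv on first run,
    # instead of running straight from the zip file.
    execution_mode="venv",
)
```

In venv mode, dependencies that ship console scripts (as gunicorn does) should end up as executables in that virtualenv's `bin/` directory, which is why it is the mode suggested here.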
p
Hey Ben, thanks for your reply! I have tried setting execution_mode to "venv" and "unzip", but neither of these solves my problem yet. I made a doc to explain in more detail: https://docs.google.com/document/d/1bKjy5j88NIWHC8xTOHL6jYD0pmU4lKEmsdyph9xVqxk/edit#
If you are OK, we can discuss in person via zoom so that I can show you the code. Really appreciate your help!
h
I'll take a look at the doc, and am happy to take a look at the code once I do 🙂
👍 1
h
Hey I think you need to update permissions on gdoc @powerful-florist-1807
p
Hey Eric, I just shared the doc again. Please check if you can open it.
h
Haven't forgotten about this, looking now
p
Hi Ben, really appreciate your help!
h
Following up:
If I build the pex outside of Pants, like this:
pex apache-airflow[async,crypto,celery,kubernetes,jdbc,password,postgres,s3,slack,amazon] -o airflow.pex

airflow db init

airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email spiderman@superhero.org

airflow webserver --port 8080
Then gunicorn starts up fine
So I'm wondering what's different
p
Thanks a lot Ben! I will give it a try for sure.
h
Presumably it works because gunicorn exports itself as a script, and pex handles those correctly
So I'm curious how you're building the pex in Pants, and what might be different here
I think this should "just work" since gunicorn is in the virtualenv's `bin/` directory (since it exports itself as a [script](https://python-packaging.readthedocs.io/en/latest/command-line-scripts.html) in its distribution)
So if it doesn't in some circumstances, that may be a bug
Let me know if this helps or you need me to dig deeper. To do so it would be great to know how you're creating the pex (what the target looks like and what package command you're running), and how you're executing the pex when the error happens.
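To illustrate the point about scripts living in the virtualenv's `bin/` directory, here is a small sketch of how a process could locate a console script next to its own interpreter before spawning it. The helper names `path_with_interpreter_bin` and `find_script` are hypothetical, and the sketch assumes the running interpreter is the virtualenv's own Python, so that the scripts sit in the same directory as `sys.executable`.

```python
import os
import shutil
import sys


def path_with_interpreter_bin(environ=None):
    """Return a PATH value with the running interpreter's directory first.

    Assumption: inside a virtualenv, console scripts are installed in the
    same directory as the interpreter (bin/ on POSIX). sys.executable is
    deliberately not resolved, since following a symlink could point
    outside the virtualenv.
    """
    environ = os.environ if environ is None else environ
    bin_dir = os.path.dirname(sys.executable)
    existing = environ.get("PATH", "")
    return bin_dir + os.pathsep + existing if existing else bin_dir


def find_script(name):
    """Look for a console script, searching next to the interpreter first."""
    return shutil.which(name, path=path_with_interpreter_bin())


if __name__ == "__main__":
    # Prints a path if a gunicorn script is visible, otherwise None.
    print(find_script("gunicorn"))
```

A parent process could pass `path_with_interpreter_bin()` as the `PATH` entry of the `env` argument to `subprocess.Popen`, so the child finds the same scripts without the unpack directory being hard-coded.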
p
I can try to create a public repo to duplicate the code. Will first try out your solution.
h
Hey @powerful-florist-1807 just checking in to see if this airflow-gunicorn thing is still an issue?
p
Hi Ben - thanks for the follow-up. I need more time to try your approach. Hopefully I can let you know early next week.
b
Hey @happy-kitchen-89482, I'm working with @powerful-florist-1807 and was looking at this issue yesterday. When the airflow pex is running, it launches a gunicorn process via `subprocess.Popen`, but the child process doesn't inherit the parent's sys.path, so gunicorn fails to look up the module and bind to the service. I guess if the pex's unzip directory were known in advance, we could set the PYTHONPATH env var.
Effectively there are 2 things I'm thinking of: 1. How does pex look up a script shipped inside its deps, in this case gunicorn? This option looks promising. 2. Once it can find the gunicorn binary, when the pex spawns a gunicorn subprocess, that process doesn't go through the pex sys.path massaging, so it won't have all the deps available.
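The second point can be reproduced without PEX at all. The sketch below is only an illustration under stated assumptions: the module `fake_dep_for_demo` and the temporary directory stand in for a dependency inside an unpacked PEX, and the helper names are hypothetical. It shows that a `sys.path` change made in the parent is not visible to a child interpreter, whereas `PYTHONPATH` passed through the environment is.

```python
import importlib
import os
import subprocess
import sys
import tempfile


def child_can_import(module_name, extra_env=None):
    """Return True if a fresh interpreter subprocess can import module_name."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start the child without any PYTHONPATH
    if extra_env:
        env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        env=env,
        capture_output=True,
    )
    return result.returncode == 0


def demo():
    with tempfile.TemporaryDirectory() as deps_dir:
        # Stand-in for a dependency that only exists in the unpack directory.
        with open(os.path.join(deps_dir, "fake_dep_for_demo.py"), "w") as f:
            f.write("VALUE = 1\n")

        # The parent edits sys.path at runtime, in memory only.
        sys.path.insert(0, deps_dir)
        try:
            importlib.invalidate_caches()
            importlib.import_module("fake_dep_for_demo")
            parent_ok = True
        except ImportError:
            parent_ok = False
        finally:
            sys.path.remove(deps_dir)

        # The child starts with a fresh sys.path and cannot see the module...
        inherited = child_can_import("fake_dep_for_demo")
        # ...unless the directory is handed over via the environment.
        with_pythonpath = child_can_import(
            "fake_dep_for_demo", {"PYTHONPATH": deps_dir}
        )
    return parent_ok, inherited, with_pythonpath


if __name__ == "__main__":
    print(demo())
```

Under these assumptions `demo()` is expected to return `(True, False, True)`: the parent can import the module, a plain child cannot, and a child given `PYTHONPATH` can.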
h
I'll take a look. Do you mind if I open a ticket for this on GitHub, so we can track it properly there? Slack is pretty ephemeral.
b
yeah that would be 👍
h
If you could comment there that would be great
p
Great
h
and if you can post some steps to reproduce the problem that would be awesome
Like how you're building the airflow pex
p
I think it will be a lot easier for you to access our code and see a repro.
I can follow up with Leo on this.
b
@powerful-florist-1807 we could put together a quick PoC repo for now
p
Do you mean we can reproduce via docker-compose run locally? That will be great!
b
yeah I think it doesn’t need to have any company code, it’s just running airflow + pex + docker, so we should be able to make some simple repo with the repro
I will get it after lunch
p
That will be nice! I wonder if you could put your code in a public repo?
b
yeah a public one
p
Cool
b
`docker compose up` should expose the error