thousands-plumber-33255
02/09/2023, 8:30 AM
import os
import sys

def run_manage():
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "django_core.settings")
    args = sys.argv
    from django.core.management import execute_from_command_line
    execute_from_command_line(args)


if __name__ == "__main__":
    run_manage()
The PEX target looks like this:
pex_binary(
    name="manage",
    entry_point="manage.py",
    restartable=True,
)
There is no issue running ./pants run django/manage.py -- runserver.
But when running the PEX file in this docker container:
ARG PYTHON_VERSION
ARG VARIANT
FROM python:${PYTHON_VERSION}-${VARIANT}
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
COPY django/manage.pex /bin
EXPOSE 8000
CMD [ "/bin/manage.pex", "--", "runserver"]
I get this error: Unknown command: '--' Type 'manage.py help' for usage. It should be noted that the app does seem to get initialized, since it throws an error when an env var that is used in settings.py is missing.
When I remove "--" it just throws:
Traceback (most recent call last):
  File "/root/.pex/unzipped_pexes/4385bc9a1b5d88169d1c8cca8cfb039ab5e16e21/manage.py", line 12, in <module>
    run_manage()
  File "/root/.pex/unzipped_pexes/4385bc9a1b5d88169d1c8cca8cfb039ab5e16e21/manage.py", line 7, in run_manage
    from django.core.management import execute_from_command_line
ModuleNotFoundError: No module named 'django'
When I check the pex I can see django in .deps:
root@b11c2d409264:/bin/manage/.deps# ls -la | grep Django
drwxr-xr-x 5 root root 4096 Jan 1 1980 Django-3.2-py3-none-any.whl
What am I doing wrong?
melodic-carpenter-39613
02/09/2023, 9:23 AM
./pants package :: my pex binaries end up under dist/, however are you then using pants to also build the Dockerfile?
I took a slightly different approach as I wanted to also do migrations, etc. when starting the docker image, so I ended up with a wrapper script start-server.sh pointing to the PEX, which sets the PEX_MODULE environment variable to manage (for the python import path to manage.py within the pex itself) and uses that to then pass arguments such as runserver in without the --.
melodic-carpenter-39613
02/09/2023, 9:24 AM
$PEX_FILE is the path to the pex within the docker image, so /bin/manage.pex in your case 😉
thousands-plumber-33255
02/09/2023, 9:26 AM
melodic-carpenter-39613
02/09/2023, 9:28 AM
ENV PEX_MODULE=manage before removing the -- in your Dockerfile CMD on the next line?
thousands-plumber-33255
02/09/2023, 9:28 AM
thousands-plumber-33255
02/09/2023, 9:28 AM
thousands-plumber-33255
02/09/2023, 9:29 AM
melodic-carpenter-39613
02/09/2023, 9:31 AM
makemigrations, but I think it's probably to ensure that any external databases for the given environment are up to date.
melodic-carpenter-39613
02/09/2023, 9:33 AM
melodic-carpenter-39613
02/09/2023, 9:33 AM
melodic-carpenter-39613
02/09/2023, 9:34 AM
src.python.apps.django_backend/gunicorn_run.pex dot path syntax - this was what worked as it is relative to the dist/ directory of pants
melodic-carpenter-39613
02/09/2023, 9:35 AM
thousands-plumber-33255
02/09/2023, 1:22 PM
ENV PEX_MODULE=manage
CMD [ "/bin/manage.pex", "runserver"]
Did not help, still ModuleNotFoundError: No module named 'django'
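A ModuleNotFoundError like the one above can be narrowed down by asking the import system where, if anywhere, it resolves `django`. The following is a minimal sketch and not something from the thread: `describe_module` is a hypothetical helper name, and it relies only on the standard library's `importlib.util.find_spec`. It is only meaningful when run inside the same environment as the failing entry point.

```python
import importlib.util
import sys


def describe_module(name):
    """Return where the import system would load `name` from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    # Namespace packages have no single origin file; report their search locations instead.
    if spec.origin in (None, "namespace"):
        return list(spec.submodule_search_locations or [])
    return spec.origin


if __name__ == "__main__":
    # Print the search path first, since the order of entries decides which "django" wins.
    for entry in sys.path:
        print("sys.path entry:", entry)
    print("django resolves to:", describe_module("django"))
```

Assuming the PEX honors the `PEX_INTERPRETER` runtime variable (see `pex --help-variables`), something like `PEX_INTERPRETER=1 /bin/manage.pex check_django.py` should run this against the PEX's own sys.path; `check_django.py` is a placeholder file name.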
thousands-plumber-33255
02/09/2023, 1:58 PM
[ "/bin/manage.pex", "help"] and [ "/bin/manage.pex", "migrate"] actually return something. Could this be related to the fact that my django directory is called 'django'?
enough-analyst-54434
02/09/2023, 2:25 PM
Traceback (most recent call last):
  File "/root/.pex/unzipped_pexes/4385bc9a1b5d88169d1c8cca8cfb039ab5e16e21/manage.py", line 12, in <module>
    run_manage()
That indicates the django/ directory housing django/manage.py is stripped, since manage.py is at the root of the unzipped PEX (they are namespaced by their hash).
@thousands-plumber-33255 can you provide the output of this?:
unzip -qc dist/...your.pex PEX-INFO | jq .
If you don't have jq you can leave the trailing pipe clause off.
thousands-plumber-33255
02/09/2023, 2:50 PM
enough-analyst-54434
02/09/2023, 3:03 PM
execution_mode="venv" to your pex_binary target. I'm trying to confirm on the side, but what is most likely happening is that one of the `django*` distributions in the PEX does not use namespace packages properly. Having PEX install itself in a venv fixes these sorts of issues.
As an aside it's also better for a host of reasons. Have you seen any of these?:
+ https://pex.readthedocs.io/en/v2.1.121/recipes.html#pex-app-in-a-container
+ https://blog.pantsbuild.org/optimizing-python-docker-deploys-using-pants/
thousands-plumber-33255
02/09/2023, 3:27 PM
enough-analyst-54434
02/09/2023, 3:27 PM
jsirois@Gill-Windows:~/support/pex/ChrisStetter $ zipinfo -1 test.pex | grep /django/ | head
.deps/Django-3.2-py3-none-any.whl/django/
.deps/Django-3.2-py3-none-any.whl/django/__init__.py
.deps/Django-3.2-py3-none-any.whl/django/__main__.py
.deps/Django-3.2-py3-none-any.whl/django/shortcuts.py
.deps/Django-3.2-py3-none-any.whl/django/urls/
.deps/Django-3.2-py3-none-any.whl/django/urls/__init__.py
.deps/Django-3.2-py3-none-any.whl/django/urls/base.py
.deps/Django-3.2-py3-none-any.whl/django/urls/conf.py
.deps/Django-3.2-py3-none-any.whl/django/urls/converters.py
.deps/Django-3.2-py3-none-any.whl/django/urls/exceptions.py
jsirois@Gill-Windows:~/support/pex/ChrisStetter $ zipinfo -1 test.pex | grep /django/ | tail
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/__init__.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/base.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/console.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/dummy.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/filebased.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/locmem.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/smtp.py
.deps/Django-3.2-py3-none-any.whl/django/bin/
.deps/Django-3.2-py3-none-any.whl/django/bin/django-admin.py
Your hunch may have been right. Can you do a similar grep on your PEX and confirm or deny that `Django-3.2-py3-none-any.whl` is the only provider of the django directory?
thousands-plumber-33255
02/09/2023, 3:27 PM
enough-analyst-54434
02/09/2023, 3:28 PM
django in your PEX.
thousands-plumber-33255
02/09/2023, 3:28 PM
thousands-plumber-33255
02/09/2023, 3:29 PM
vscode ➜ /repo $ zipinfo -1 dist/django/manage.pex | grep /django/ | head
.deps/Django-3.2-py3-none-any.whl/django/
.deps/Django-3.2-py3-none-any.whl/django/__init__.py
.deps/Django-3.2-py3-none-any.whl/django/__main__.py
.deps/Django-3.2-py3-none-any.whl/django/shortcuts.py
.deps/Django-3.2-py3-none-any.whl/django/urls/
.deps/Django-3.2-py3-none-any.whl/django/urls/__init__.py
.deps/Django-3.2-py3-none-any.whl/django/urls/base.py
.deps/Django-3.2-py3-none-any.whl/django/urls/conf.py
.deps/Django-3.2-py3-none-any.whl/django/urls/converters.py
.deps/Django-3.2-py3-none-any.whl/django/urls/exceptions.py
enough-analyst-54434
02/09/2023, 3:29 PM
thousands-plumber-33255
02/09/2023, 3:30 PM
enough-analyst-54434
02/09/2023, 3:30 PM
tail instead of head?
thousands-plumber-33255
02/09/2023, 3:30 PM
vscode ➜ /repo $ zipinfo -1 dist/django/manage.pex | grep /django/ | tail
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/__init__.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/base.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/console.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/dummy.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/filebased.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/locmem.py
.deps/Django-3.2-py3-none-any.whl/django/core/mail/backends/smtp.py
.deps/Django-3.2-py3-none-any.whl/django/bin/
.deps/Django-3.2-py3-none-any.whl/django/bin/django-admin.py
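The head and tail listings above only sample the archive. As a rough alternative, the sketch below walks every entry and groups the ones that sit under a `django/` directory by whatever precedes it, so a second provider would show up as a second key. It assumes the PEX is a single zip file (not a packed or loose layout), and `package_providers` is a hypothetical helper name, not part of Pex or Pants.

```python
import zipfile
from collections import Counter


def package_providers(pex_path, package):
    """Map each directory prefix that contains `package/` to its number of zip entries."""
    providers = Counter()
    with zipfile.ZipFile(pex_path) as zf:
        for name in zf.namelist():
            parts = name.split("/")
            # Only directory components count; the last part is a file name (or empty for dirs).
            if package in parts[:-1]:
                idx = parts.index(package)
                prefix = "/".join(parts[:idx]) or "<pex root>"
                providers[prefix] += 1
    return dict(providers)


if __name__ == "__main__":
    import sys

    # Usage: python providers.py dist/django/manage.pex django
    for prefix, count in sorted(package_providers(sys.argv[1], sys.argv[2]).items()):
        print(f"{count:6d}  {prefix}")
```

Unlike `grep /django/`, this also reports a `django/` directory sitting at the root of the archive, which has no leading slash in the zip listing.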
enough-analyst-54434
02/09/2023, 3:30 PM
enough-analyst-54434
02/09/2023, 3:31 PM
thousands-plumber-33255
02/09/2023, 3:31 PM
enough-analyst-54434
02/09/2023, 3:31 PM
thousands-plumber-33255
02/09/2023, 3:36 PM
enough-analyst-54434
02/09/2023, 3:43 PM
airflow: https://github.com/apache/airflow/blob/main/setup.cfg#L166-L168
+ Using pex that's pex -c airflow ... where `-c` sets a console script entrypoint.
+ Using Pants that's: https://www.pantsbuild.org/docs/reference-pex_binary#codescriptcode
+ If you already have the built PEX file, you can also ad-hoc use PEX_SCRIPT=airflow my.pex ... (see pex --help-variables or https://pex.readthedocs.io/en/v2.1.121/api/vars.html for more runtime env var control knobs)
thousands-plumber-33255
02/09/2023, 3:47 PM
pex_binary(
    name="airflow",
    script="airflow",
    restartable=True,
)
results in
15:45:41.84 [ERROR] 1 Exception encountered:
Engine traceback:
in `run` goal - environment:linux_devcontainer
ProcessExecutionFailure: Process 'Building airflow/airflow.pex' failed with exit code 1.
stdout:
stderr:
Could not find script 'airflow' in any distribution within PEX!
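The error above says that no distribution inside the PEX declares a console script named `airflow`. One way to see which installed distribution, if any, declares a given script is to read the entry point metadata directly. This is a minimal sketch using only the standard library's `importlib.metadata` (Python 3.8+); `find_console_script` is a hypothetical helper name, and it inspects whatever environment it runs in, not the PEX build itself.

```python
from importlib import metadata


def find_console_script(script_name):
    """Return (distribution name, target) pairs for every console script called `script_name`."""
    matches = []
    for dist in metadata.distributions():
        for ep in dist.entry_points:
            # Console scripts live in the "console_scripts" entry point group.
            if ep.group == "console_scripts" and ep.name == script_name:
                matches.append((dist.metadata["Name"], ep.value))
    return matches


if __name__ == "__main__":
    # An empty list here would mirror the "Could not find script" error.
    print(find_console_script("airflow"))
```

An empty result in an environment that is supposed to mirror the PEX suggests the distribution providing the script is simply not among the dependencies.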
thousands-plumber-33255
02/09/2023, 3:47 PM
thousands-plumber-33255
02/09/2023, 3:47 PM
enough-analyst-54434
02/09/2023, 3:48 PM
enough-analyst-54434
02/09/2023, 3:49 PM
thousands-plumber-33255
02/09/2023, 3:52 PM
enough-analyst-54434
02/09/2023, 3:58 PM
dependencies (list of strings) field to list all dependencies that are not inferrable - that's for both 1st and 3rdparty dependencies. If the PEX is just 3rdparty - say airflow - something like:
pex_binary(
    ...
    script="airflow",
    dependencies=["3rdparty/python:reqs#airflow"]
)
If 1st party, something like:
python_sources()  # This will "own" my_entry_point.py
pex_binary(
    ...
    entry_point="./my_entry_point.py"
)
Either way - you should just need to include a single explicit dependency in the pex_binary target, and that dependency should include the entry_point / console script. Pants will infer the transitive closure of dependencies from that single entry point / console script dependency.
thousands-plumber-33255
02/09/2023, 4:07 PM
dependencies=["airflow/dags"]. So now those dependencies are included in the pex. But airflow is expecting the dags dir to be in the AIRFLOW_HOME dir. By default this is ~/airflow, and I can see that files are being created there when I run ./pants run airflow:airflow. But of course the dags are not present in that home folder but in the pex. Later on I will deploy this in a Dockerfile, but I think the issue is the same: How can I handle this case? From the django example I have seen that the pex is executed in some unstable dir like /root/.pex/unzipped_pexes/hash, so adding this is not stable.
enough-analyst-54434
02/09/2023, 4:10 PM
enough-analyst-54434
02/09/2023, 4:12 PM
For the run case ... that's trickier. I think you'd need a small bit of wrapper code in your main that set AIRFLOW_HOME before calling into airflow and it would do so by using __file__ and calculating relative to that. Using __file__ in this way will require venv mode so that all code - 1st and 3rdparty - shares the same (site-packages) root.
thousands-plumber-33255
02/09/2023, 4:15 PM
thousands-plumber-33255
02/10/2023, 3:33 PM
dags dependencies (as specified in the BUILD file) are now in /bin/app/lib/python3.8/site-packages/dags. Is there any way to tell PEX that this dependency should be placed somewhere else? I could not find anything in the PEX or pants docu
enough-analyst-54434
02/10/2023, 3:36 PM
AIRFLOW_X though couldn't you?
thousands-plumber-33255
02/10/2023, 3:36 PM
thousands-plumber-33255
02/10/2023, 4:36 PM
pex_binary(
    name="airflow",
    # https://github.com/apache/airflow/blob/main/setup.cfg#L166-L168
    script="airflow",
    restartable=True,
    layout="packed",
    execution_mode="venv",
    include_tools=True,
    dependencies=[":dags"],
)
docker_image(name="docker")
resources(
    name="dags",
    sources=[
        "./dags/**/*.py",
    ],
)
thousands-plumber-33255
02/10/2023, 4:41 PM
enough-analyst-54434
02/10/2023, 4:44 PM
resources and not python_sources?
enough-analyst-54434
02/10/2023, 4:44 PM
thousands-plumber-33255
02/11/2023, 6:48 AM
07:00:23.77 [WARN] The target airflow/dags/spatial/planninRegionDataInitial/src/planning_regions_initializing.py:../../../../dags imports `dags.base.dagOperatorSessionContext.session_scope`, but Pants cannot safely infer a dependency because more than one target owns this module, so it is ambiguous which to use: ['airflow/dags/base/dagOperatorSessionContext.py', 'airflow/dags/base/dagOperatorSessionContext.py:../../dags'].
Of course I could add all python target addresses manually but that would be quite error prone when adding new files.
thousands-plumber-33255
02/11/2023, 7:37 AM
./pants dependencies --transitive airflow:airflow
I can see django/spatial/** files being included and I suspect that it comes from the fact that I also have an airflow/dags/spatial directory. Because nothing in the dags directory is importing from the django code. How can this be resolved?
enough-analyst-54434
02/11/2023, 3:20 PM
> my explicit one and the auto-generated ones
Can you turn off the auto-generated ones? I assume you mean pants tailor here? It has knobs to ignore targets and subtrees: See the 2 ignore options starting here: https://www.pantsbuild.org/docs/reference-tailor#ignore_paths
enough-analyst-54434
02/11/2023, 3:24 PM
> Because nothing in the dags directory is importing from the django code. How can this be resolved?
I think this work https://github.com/pantsbuild/pants/pull/17931 has a good chance of solving that. It's available in 2.16.0.dev5+: https://pypi.org/project/pantsbuild.pants/2.16.0.dev6/
thousands-plumber-33255
02/11/2023, 8:43 PM
pants tailor. Ignoring the python_sources targets here would result in a long list that would be hard to maintain. The pants option seems to scale better, but that in turn would never generate other targets like python_tests right?
thousands-plumber-33255
02/11/2023, 8:47 PM
enough-analyst-54434
02/12/2023, 12:41 AM
> Ignoring the python_sources targets here would result in a long list that would be hard to maintain. The pants option seems to scale better, but that in turn would never generate other targets like python_tests right?
I'm not following you.
[tailor]
ignore_paths = ["airflow/dags/**"]
In combination with a single python_sources target in airflow/dags/BUILD that globs all python source files under airflow/dags, this should allow you to:
1. Have a single 0-maintenance airflow/dags target for other things to depend on.
2. Let dependency inference (1st and 3rdparty) just work.
3. Use ./pants tailor on all other portions of the repo as per normal.
thousands-plumber-33255
02/12/2023, 6:22 AM
airflow/dags/dir1/example_test.py. With this structure I cannot utilize pants tailor for generating python_tests targets in the future since I cannot ignore generating only specific targets right?
thousands-plumber-33255
02/12/2023, 6:37 AM
2.16.0.dev6, but with ambiguity_resolution = "by_source_root" I am still seeing my django files being included 👀 Any idea? Does that really apply here as it is django/spatial/** with airflow/dags/spatial and not django/dags/spatial/**. @happy-kitchen-89482 Any thoughts here?
enough-analyst-54434
02/12/2023, 6:48 AM
> for generating python_tests targets in the future since I cannot ignore generating only specific targets right?
You can do the same for python_tests - use 1 python_tests target that recursively globs in the airflow/dags tree. So, again, tailor still works for all other parts of your repo. It does not work in airflow/dags, and you make that maintainable by using the strategy of declaring solitary targets in `airflow/dags/BUILD` that use recursive globs. Does that make sense or am I missing why that does not work?
thousands-plumber-33255
02/13/2023, 10:48 AM
thousands-plumber-33255
02/13/2023, 10:57 AM
python_sources(
    name="dags",
    sources=[
        "dags/**/*.py",
        "dags/**/*.pyi",
        "!dags/**/test_*.py",
        "!dags/**/*_test.py",
        "!dags/**/tests.py",
        "!dags/**/conftest.py",
        "!dags/**/test_*.pyi",
        "!dags/**/*_test.pyi",
        "!dags/**/tests.pyi",
    ],
)
It warns as follows:
./pants --no-pantsd dependencies --transitive airflow:airflow
105625.16 [WARN] Unmatched glob from airflow:dags's sources field: "airflow/dags/**/*.pyi", excludes: ["airflow/dags/**/*_test.py", "airflow/dags/**/*_test.pyi", "airflow/dags/**/conftest.py", "airflow/dags/**/test_*.py", "airflow/dags/**/test_*.pyi", "airflow/dags/**/tests.py", "airflow/dags/**/tests.pyi"]
enough-analyst-54434
02/13/2023, 1:44 PM
enough-analyst-54434
02/13/2023, 1:47 PM
thousands-plumber-33255
02/14/2023, 3:05 PM
> For the run case ... that's trickier. I think you'd need a small bit of wrapper code in your main that set AIRFLOW_HOME before calling into airflow and it would do so by using __file__ and calculating relative to that. Using __file__ in this way will require venv mode so that all code - 1st and 3rdparty, shares the same (site-packages) root.
What are the advantages/disadvantages for this case between the three options I see here?
1. Run the Pex file
2. Run the main file directly
3. Run the pex file in the same docker container I use to deploy Airflow using docker_environment
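The wrapper idea quoted above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used in the thread: `set_airflow_home` is a hypothetical helper name, and the sketch assumes the dags directory sits next to the file passed in and that the variable has to be set before anything imports airflow. Where `__file__` actually points depends on how the code is run (venv-mode PEX, sandboxed run, or plain source).

```python
import os
from pathlib import Path


def set_airflow_home(anchor_file, environ=os.environ):
    """Point AIRFLOW_HOME at the directory containing `anchor_file`, unless it is already set."""
    home = Path(anchor_file).resolve().parent
    # setdefault keeps any AIRFLOW_HOME supplied from outside (e.g. a Dockerfile ENV line).
    return environ.setdefault("AIRFLOW_HOME", str(home))


# Hypothetical use at the very top of a wrapper main, before any airflow import:
#   set_airflow_home(__file__)
```

Using `setdefault` rather than plain assignment means a deployment can still override the location explicitly, while local runs fall back to the computed path.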
thousands-plumber-33255
02/14/2023, 3:49 PM
subprocess_environment_env_vars. Is this not intended?
2. I do that already with the Django app and it uses the defined envs from the environment. The main file calls https://github.com/apache/airflow/blob/main/airflow/cli/commands/standalone_command.py#L287 and that cannot be found. I can resolve this by adding this to main. Can this be set by pants?
AIRFLOW_BIN_PATH = f"{os.environ['VIRTUAL_ENV']}/bin"
os.environ['PATH'] += ':' + AIRFLOW_BIN_PATH
3. Seems cumbersome, for example due to debug-adapter issues.
thousands-plumber-33255
02/14/2023, 4:06 PM
__file__? Because something like this does not load my DAGS:
AIRFLOW_HOME = str(pathlib.Path(__file__).parent.resolve())
os.environ.setdefault("AIRFLOW_HOME", AIRFLOW_HOME)
That results in something like /tmp/pants-sandbox-nJNNNS/airflow in which the dag folder is present.
enough-analyst-54434
02/14/2023, 4:08 PM
enough-analyst-54434
02/14/2023, 4:09 PM
enough-analyst-54434
02/14/2023, 4:10 PM
thousands-plumber-33255
02/14/2023, 4:11 PM
thousands-plumber-33255
02/14/2023, 4:11 PM
enough-analyst-54434
02/14/2023, 4:12 PM
enough-analyst-54434
02/14/2023, 4:45 PM
execution_mode="venv" on a pex_binary enables Pex's --venv prepend runtime execution mode, which says:
1. create a venv 1st if not already done and re-exec into that
2. Prepend the venv bin dir to the PATH
enough-analyst-54434
02/14/2023, 4:45 PM
./pants run - you may need 2.15.x though, run has morphed its impl a bit in the past several months.
enough-analyst-54434
02/14/2023, 4:48 PM
> I am just really curious on why one would run the python source directly vs PEX?
This is all a bit eye of the beholder convenience if I understand your question. It may be considered convenient to be able to ./pants run tab/complete/to/file.py vs remembering a target name of a pex_binary. The down side is you're not obviously testing production, which presumably is built from the PEX. Depending on how you feel about testing proxies for production vs ~actual production, you may have different feelings.
thousands-plumber-33255
02/14/2023, 7:16 PM
enough-analyst-54434
02/14/2023, 7:25 PM
enough-analyst-54434
02/14/2023, 7:26 PM
thousands-plumber-33255
02/14/2023, 7:31 PM