# general
a
Perhaps I'm approaching the problem from the wrong direction here, but according to the docs:
> A loose layout PEX is similar to a packed PEX, except that neither the Pex bootstrap code nor the dependency code are packed into zip files, but are instead present as collections of loose files in the directory tree providing different caching and syncing tradeoffs.
I'm trying to mimic hot-reloading for local development by volume-mounting parts of my source code into the PEX directory on the container and restarting it, instead of rebuilding the entire PEX and then restarting the container. I'm able to volume in the source code and have it present after a restart, but Gunicorn doesn't seem to pick up on the changes. Is this a Gunicorn-specific limitation or a PEX limitation? Note that I understand that achieving this wouldn't give me the fine-grained caching and dependency slicing that Pants normally offers, but it's a tradeoff I'm willing to make for near-instant reloads when debugging a stubborn issue locally.
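For context, the target being packaged looks roughly like this (a sketch with illustrative names; `layout="loose"` is the relevant bit):
```python
pex_binary(
    name="gunicorn",
    script="gunicorn",
    dependencies=["projects/my-service/my_service"],  # illustrative path
    layout="loose",  # emit loose files in a directory instead of a zipapp
)
```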
Update: Turns out it was Gunicorn-specific. I was able to get it working with volumes by using the `--reload` option. Now all that's missing is the ability to run `pants package` in multiple terminals concurrently without taking a massive performance hit. Hopefully that will be supported in the future. Perhaps someone who understands the underlying Pants engine can tell me whether this issue would solve that problem: https://github.com/pantsbuild/pants/issues/7654
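For reference, the sort of invocation that now picks up edits to the mounted sources (module path and port are illustrative):
```sh
/bin/gunicorn my_service.asgi:app --bind 0.0.0.0:8000 --reload
```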
r
I think you can run `pants package ::` and it should build all packageable items concurrently while ensuring any dependency on common targets is built first.
a
Yes, that was the conclusion when I last asked as well. The performance when doing that is great, but it doesn't pair well with additional tooling such as https://tilt.dev, which uses directives like:
```python
custom_build(
    ref="my-service",
    command="pants package projects/my-service/Dockerfile",
    deps=["projects/my-service"],
    live_update=[
        sync(
            "projects/my-service/my_service/",
            "/bin/gunicorn/my_service/"
        ),
        restart_container()
    ]
)
```
to build and deploy your service in a swarm / k8s environment. Multiple `custom_build`s would trigger multiple calls to Pants.
e
You shouldn't need a loose PEX for this. Just https://docs.gunicorn.org/en/stable/settings.html#reload-extra-files + `PEX_INHERIT_PATH=prefer` + `PYTHONPATH=/volume/mount/of/src`: https://pex.readthedocs.io/en/v2.1.145/api/vars.html#PEX_INHERIT_PATH
Or a variant `pex_binary` target with `include_sources=False` (https://www.pantsbuild.org/docs/reference-pex_binary#codeinclude_sourcescode) + `PEX_EXTRA_SYS_PATH=/volume/mount/of/src`.
Either should pick up and use your volume-mounted source with the PEX.
PEXes are pretty flexible. Your use of a loose layout is creative but definitely not the intended purpose. So if you run into issues with that, you have these other options at any rate. And there are others too, including installing the PEX as a venv in the image.
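A minimal sketch of what the first option could look like in a compose file (service name and paths are illustrative):
```yaml
services:
  my-service:
    environment:
      # Let modules already on PYTHONPATH take precedence over the PEX's own contents
      PEX_INHERIT_PATH: prefer
      PYTHONPATH: /volume/mount/of/src
      # Option 2 instead, paired with include_sources=False on the pex_binary:
      # PEX_EXTRA_SYS_PATH: /volume/mount/of/src
    volumes:
      - ./src:/volume/mount/of/src
```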
a
I'm not sure I understand how you'd set `PEX_EXTRA_SYS_PATH`, @enough-analyst-54434. Just directly on the command line when you build the PEX? It doesn't seem to be a recognized option in Pants. Nor does building with it on the command line allow gunicorn to see the module in question, from what I can tell.
```
File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'controller_items'
```
As a dummy example, we have a simple FastAPI app named `controller_items`:
```python
pex_binary(
    name="gunicorn",
    script="gunicorn",
    dependencies=[
        "//:root#gunicorn",
        "//:root#uvicorn",
        "//:root#psycopg2-binary",
        "projects/controller-items/controller_items",
    ],
    output_path="controller-items",
    include_sources=False,
)

docker_image(
    name="controller-items",
)
```
The `pex_binary` has an entry point of `gunicorn` in order to be able to point to the module in question. The relevant compose section looks like this:
```yaml
controller_items:
    image: library/controller-items
    command: |
      /bin/gunicorn controller_items.asgi:app
      --bind 0.0.0.0:${CONTROLLER_ITEMS_PORT}
      --reload
      --worker-class uvicorn.workers.UvicornWorker
    environment:
      DB_NAME: ${DB_NAME}
      DB_PORT: ${DB_PORT}
      DB_USERNAME: ${DB_USERNAME}
      DB_PASSWORD: ${DB_PASSWORD}
      DB_HOST: ${DB_HOST}
    ports:
      - 8000:${CONTROLLER_ITEMS_PORT}
    networks:
      - internal
    volumes:
      - ./projects/controller-items/controller_items:/srv/controller_items
```
And the Dockerfile for completeness:
```dockerfile
FROM python:3.10.12

ENV PYTHONPATH=/srv

COPY controller-items /bin/gunicorn
```
e
You set it exactly like you set PYTHONPATH. In the image, pointing at the source code mount point.
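Concretely, in the compose file above that would be something like this (a sketch; `/srv` matches the volume mount point and the `PYTHONPATH` already set in the Dockerfile):
```yaml
    environment:
      # Point the PEX at the volume-mounted sources
      PEX_EXTRA_SYS_PATH: /srv
```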
a
Oh.
h
FWIW, if you set `restartable=True` on a file then `pants run path/to/file.py` will auto-reload when you change files (see https://github.com/pantsbuild/example-django/blob/main/helloworld/service/frontend/BUILD for a gunicorn example). To run this in a container, though, you'd have to volume the repo itself into the container and run Pants in it. But possibly worth it while debugging?
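The shape of that target is roughly as follows (a sketch modeled on the linked example, with illustrative names):
```python
pex_binary(
    name="gunicorn",
    script="gunicorn",
    dependencies=[":lib"],  # ":lib" is an illustrative sources target
    restartable=True,  # let `pants run` restart the process when files change
)
```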
a
These are all fairly interesting approaches. The one thing I liked about the `loose` layout was that it required no changes to the underlying Dockerfile. I have a single template.yml compose file that is populated with environment variables for dev / staging / prod, to ensure the setups are as close to identical as possible. Before migrating to Pants I had a small script that parsed the template, converted the Docker command to run `uvicorn` instead of `gunicorn` inside the template, and set up a bind volume to the location on the host. This allowed me to switch back and forth seamlessly without editing the Dockerfile for debugging. I'd love to achieve something similar here, as packaging and restarting the container often takes ~15+ seconds.
l
@acoustic-library-86413 how did you end up configuring your repo? Were you able to achieve automatic repackaging and container restarts in a short amount of time? I'm looking to do something similar: https://pantsbuild.slack.com/archives/C046T6T9U/p1697747765394509
a
I'm still using the loose layout, but mostly to ensure I have the source code plainly available when inspecting my services while they are running. However, I have a working prototype with the `loose` hack that does rebuild on code changes. The main problem is that `pants` is not great at running multiple invocations concurrently; however, it is excellent at multiprocessing within a single invocation. Therefore, with my `tilt` use case I need something like the following script:
```python
import os
import subprocess
import time

# Simple lock file: the first invocation performs the build, while any
# concurrent invocations skip it and wait below for the lock to clear.
if not os.path.isfile("build.tmp"):
    open("build.tmp", "w").close()
    print("Building images...")

    try:
        # Perform multi-project build here
        subprocess.call(["pants", "package", "::"])
    finally:
        # Always release the lock, even if the build fails
        os.remove("build.tmp")

while True:
    if os.path.isfile("build.tmp"):
        print("Build in progress, waiting...")
        time.sleep(1)
    else:
        break
```
This builds all services on startup without taking a major performance hit. I can then use the `live_update` instructions for hot reloading / restarting the server where necessary. It's worth noting that since my original attempt at this, https://docs.docker.com/compose/file-watch/ has been introduced, and it would be really interesting to try that out as a Tilt alternative for live reloading. I also feel like I got off easy by being able to attach a debugger to the process running in Docker using `debugpy`, as I was mostly using this functionality for debugging purposes anyway.
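A sketch of how that script can be wired into the Tiltfile alongside the earlier `custom_build` (the script name `build_once.py` is illustrative):
```python
# Run the lock-file build script once, at Tiltfile load time
local("python build_once.py")

custom_build(
    ref="my-service",
    command="pants package projects/my-service/Dockerfile",
    deps=["projects/my-service"],
    live_update=[
        sync("projects/my-service/my_service/", "/bin/gunicorn/my_service/"),
        restart_container(),
    ],
)
```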
l
Aha! Thanks! So just to confirm: whenever there's a code change you're running `pants package ::` to rebuild the PEX binary in loose format, with the PEX binary volume-mounted to the container?
a
You could, yes. I've opted to create something in Tilt to rebuild and restart a specific service on demand instead. I have an example repository at home, which I can make public, that showcases this. What types of services are you running? Is this something that has built-in hot-reloading support, or would you also need to restart your container on code changes?
l
I'm building a Python Django server, so yeah there's file-watching built-in with the Django dev server. Any examples you have would be most helpful!
Looking at the compose watch feature further, I'm not sure how it would support this workflow. When a change is detected, watch will either copy files or rebuild the image. I don't see a way to run a custom script on a change. Perhaps I'm missing something, though.
a
My understanding was that you would run your application with a format that allows for partial / full sync, and with hot-reloading on changes enabled. The compose watch feature allows rebuild (+ restart), sync, and sync+restart as "actions". A simple "sync" would make Django trigger the hot-reload on changes?
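For example, something along these lines (a sketch; the paths assume the Django project layout shown further down):
```yaml
services:
  web:
    build: .
    develop:
      watch:
        # Copy changed source files into the container; the Django dev
        # server's autoreloader then picks them up.
        - action: sync
          path: ./my_django_project
          target: /code/app/my_django_project
```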
h
FWIW this repo has Django examples: https://github.com/pantsbuild/example-django
Including info about how to use Pants's own file watching instead of the Django dev server's autoreloader.
a
Yeah, that works if you have the option of running your project outside of Docker!
l
I was expecting to have to rerun `pants package ::` on each code change in order to build the PEX binary that is then synced to the Docker container. Is this a correct assumption? If so, I'm not sure how to trigger the package goal on code change.
Aha! I've got something working: 1. Build the PEX binary with a loose layout and without sources:
```python
pex_binary(
    name="manage",
    entry_point="manage.py:main",
    dependencies=[
        ":src",
    ],
    include_sources=False,
    layout='loose',
)
```
2. Run the Docker container with two volumes: one mounting the packaged PEX folder, and one mounting the host source code to the source directory of the mounted PEX folder:
```yaml
---
version: '3'
services:
  web:
    image: myapp:latest
    ports:
      - 8000:8000
    volumes:
      - ../../dist/myapp/manage.pex:/code/app
      - ./my_django_project:/code/app/my_django_project
    env_file: .env
    command: "python /code/app runserver 0:8000"
```
I'll put together an example repo sometime soon. Thanks for your help! Also if you see any problems with this setup I'm all ears.
a
Yes, this is effectively the same thing as I'm doing. My idea about using Compose Watch was to help you avoid the volumes, since it allows you to run the same compose file in production without creating volumes on your swarm host or runner.
a
@lemon-eye-70471 do you have an example somewhere? Can you share the Dockerfile for myapp? I assume it contains all dependencies?
l
Oops, here is the example project I had as of a year ago: https://github.com/ezbc/monorepo-pants-docker-django/tree/main. I can't vouch at the moment for whether or not it functions, but it looks like it follows the setup we discussed in this thread. I'll try to find the time to clean the example repo up soon.