# general
b
How do you guys manage `.venv` and shared dependencies between projects (multiple venvs per project), and also building images per project (multiple `Dockerfile`s per project)? Does anyone have a complete example they could share?
Using `pants package prj/projectA/Dockerfile` seems to not copy the entire context directory.
s
The Dockerfile build context will only contain the dependencies of the `docker_image` target.
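A minimal sketch of what that implies (paths and target names here are assumptions, not from this thread): a loose file like `requirements.txt` only reaches the build context if a target owns it and the `docker_image` depends on that target:

```
# prj/projectA/BUILD -- hypothetical layout
file(name="reqs", source="requirements.txt")

docker_image(
    name="docker",
    source="Dockerfile",
    # Only these dependencies (and their transitive deps)
    # end up in the Docker build context.
    dependencies=[":reqs"],
)
```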
b
I have a project inside a monorepo with the following file structure
```
hunch/projects/followed_users_post
├── BUILD
├── Dockerfile
├── __init__.py
├── requirements.txt
└── src
    ├── BUILD
    ├── __init__.py
    ├── forecast.py
    └── preprocessing.py
```
and my `BUILD` file is like this
```
python_sources(name="src", dependencies=["hunch/libs:libs"])

python_requirements(
    name="reqs",
)

docker_image(
    name="docker",
    context_root="./",
    source="Dockerfile",
    dependencies=["hunch/projects/followed_users_post:src", "hunch/libs:libs"]
)
```
When I do `pants package hunch/projects/followed_users_post`, it complains that `requirements.txt` doesn't exist, with the error
```
The following files were not found in the Docker build context:

  * requirements.txt
```
My Dockerfile is simply
```
FROM python:3.11.9-slim as base

ENV PYTHONDONTWRITEBYTECODE 1
ENV PYTHONUNBUFFERED 1
ENV PIP_NO_CACHE_DIR=off
ENV PIP_DISABLE_PIP_VERSION_CHECK=on
ENV PIP_DEFAULT_TIMEOUT=100

RUN pip install --no-cache-dir --upgrade pip

COPY requirements.txt /app/requirements.txt
RUN pip install --no-cache-dir -r /app/requirements.txt
```
I am not sure what I am doing wrong. I have listed the dependencies in the `docker_image` target, but it still complains that the file doesn't exist.
If I use `files(name='libs', sources=['requirements.txt'])` and mention it as a dependency of the docker image along with the others, it only copies `Dockerfile` and `requirements.txt` to the directory. (I also add the line `COPY . /app` to the Dockerfile.)
If I change the `files` target to `files(name='libs', sources=['requirements.txt', '**/*.py'])`, it adds the Python files to the image too. Not sure if this is the right approach?
s
By design, you are supposed to use something like `pex_binary` and then add it to your Dockerfile.
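A minimal sketch of that pattern (target and file names here are illustrative assumptions, not from this thread): package the code as a pex, and have the image depend on it:

```
# projects/app/BUILD -- illustrative names
pex_binary(
    name="bin",
    entry_point="main.py",  # module executed when the pex runs
)

docker_image(
    name="img",
    dependencies=[":bin"],  # the built pex lands in the build context
)
```

In the Dockerfile, the packaged pex is then available in the build context under its address path with `/` replaced by `.`, e.g. `COPY projects.app/bin.pex /bin/app.pex`.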
b
If I build the binaries, do I also have to change the import statements to refer to the correct path? Currently my `PYTHONPATH` is set to the monorepo root, and in my project `from libs.liba.pyfile import someClass` works. Should this also work with binaries?
s
You don't need to set `PYTHONPATH` manually, nor do you need to copy the sources into the Docker container; pex will do it all for you.
b
No success with pex binaries 😕. I followed the same steps mentioned in the blog post shared above and created my `BUILD` file as follows:
```
python_sources(
    dependencies=["hunch/libs"]
)

python_requirements(
    name="reqs",
)

pex_binary(
    name="fp_binary",
    layout="loose",
    execution_mode="venv",
    dependencies=["hunch/libs:libs_binary"],
    include_sources=True,
    include_requirements=True,
    include_tools=True,
    venv_site_packages_copies=True
)

docker_image(
    name="docker",
    instructions=[
        "FROM python:3.11.9-slim as deps",
        "COPY hunch.libs/libs_binary.pex /libs_binary.pex",
        "RUN PEX_TOOLS=1 /usr/local/bin/python /libs_binary.pex venv --scope=deps --compile /bin/app",

        "FROM python:3.11.9-slim as srcs",
        "COPY hunch.projects.followed_users_post/fp_binary.pex /fp_binary.pex",
        "RUN PEX_TOOLS=1 /usr/local/bin/python /fp_binary.pex venv --scope=srcs --compile /bin/app",

        "FROM python:3.11.9-slim",
        "COPY --from=deps /bin/app /bin/app",
        "COPY --from=srcs /bin/app /bin/app",
    ]
)
```
but when I run the Docker image, it throws a `ModuleNotFoundError` for the packages mentioned in my `requirements.txt`. Also, when I inspect `fp_binary.pex` in the `dist/` folder, I don't see any source files or third-party packages, just the built-in libraries.
s
b
I am creating two binaries, but the binary `libs_binary.pex` lives in another place, i.e. I am creating it in a separate `BUILD` file. Should I create all the binary targets of the dependencies in the same `BUILD` file?
s
No, it's not required, but it's hard to debug the problem with only part of the setup. Could you create a small repository to demonstrate the issue?
b
I understand @square-psychiatrist-19087, and I apologise for the confusion. That was bad on my part. Honestly, I feel like many of the issues I am running into are probably because I am not doing things the "mono-repo" way. I am used to the poly-repo setup, and much of my thinking about "how stuff should work" is influenced by that.
I have created a sample repo and also highlighted the issues I am running into in this issue.
s
> Is it required to set entry_point to pass in the files?
A pex binary is designed to behave like a binary. That means 1) it has to be built, and 2) the result of the build process is a single executable file. Since the binary is executable, it has to run some code when executed, so you have to specify an entry point to run it.
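For example, a `pex_binary` with an explicit entry point might look like this (the file path here is hypothetical):

```
pex_binary(
    name="fp_binary",
    # Module run when the pex is executed; Pants infers and bundles
    # everything this module transitively imports.
    entry_point="src/forecast.py",
)
```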
> coming from poly repo background, I am used to creating pyproject.toml
If all your libraries are in the monorepo managed by Pants, then no, you don't need to create any files other than `BUILD` files. When you build pex binaries, Pants will look at the imports and automatically include all the imported code in your binary.
I cloned the repo, but I don't understand what I'm supposed to fix. What are you trying to run in the Docker container? `forecast.py`? The issue says nothing about it.
The important pieces are `root_patterns` and `pex_binary`.
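For reference, source roots are configured in `pants.toml`; assuming the imports are rooted at the repo top level (as the earlier `PYTHONPATH` remark suggests), the relevant piece might look like this (the exact pattern is a guess, not taken from the repo in question):

```
# pants.toml
[source]
# Make the repo root a source root so that
# `from libs.liba.pyfile import someClass` resolves.
root_patterns = ["/"]
```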
b
🙏 Thanks a lot @square-psychiatrist-19087 for looking into the repo. You have been very kind. Your fix did help and now I am able to get it working in my main repo.
I was trying to run `forecast.py` and wanted to see whether all the dependencies (including 3rd-party ones) are actually included in the pex binaries or not.
It is working now; however, I noticed one issue. When I have the same package mentioned in more than one project's `requirements.txt` file, Pants warns that it couldn't correctly identify its owner, and then the dependency is not included in the pex binary. However, if I remove the package and mention it in only one `requirements.txt`, then it is included in the pex binary. I am using the `python-default` lock file. What's the recommended way of adding a 3rd-party dependency when that dependency is required by different projects?
s
> now I am able to get it working in my main repo.
I'm glad it's working, you're welcome!
> What's the recommended way of adding 3rd party dependency, when that third party dependency is required by different projects?
If you're using a single resolve, just create a single requirements.txt with all the dependencies.
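That might look like a single shared requirements file at the repo root (the layout is illustrative), with dependency inference wiring each project only to the requirements it actually imports:

```
# BUILD at the repo root -- illustrative
python_requirements(
    name="reqs",
    source="requirements.txt",  # one file shared by all projects
)
```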
b
If two projects depend on same package but different version? What should I do then? I am not facing this issue now, but I think it may happen in future, for example I start the project A today and it is using
pandas==2.2.2
, after few months some other team member starts another project which is using the latest version of pandas (could be
pandas=2.9.0
), I think the single lock file will throw an error then. In such scenarios, should I be creating separate lock files?
s
If you need two different pandas versions, then you need separate resolves:
```
# project 1
python_requirements(name="project1-reqs", resolve="project1")

# project 2
python_requirements(name="project2-reqs", resolve="project2")
```
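The separate resolves also have to be declared in `pants.toml`, after which `pants generate-lockfiles` produces one lockfile per resolve (the lockfile paths here are illustrative):

```
# pants.toml
[python]
enable_resolves = true

[python.resolves]
project1 = "locks/project1.lock"
project2 = "locks/project2.lock"
```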