# general
c
Hey, I’m new to pants and have never worked with a monorepo tool before. I am starting a monorepo that currently has the folders `src/` and `projects/`, where `src/` contains source code and `projects/` contains individual projects that may or may not source from `src/`. I was thinking that each project either builds an image from the Dockerfile within the project directory, or from the root directory base `Dockerfile` that will likely be compatible across multiple projects. What are some steps I should take to integrate pants with python builds / services that are currently deployed in production using docker, so I can share (versioned) source code across projects?
Example: say I update the shared code `~/src/utils.py` from version 0.1 to 0.2, but I don’t want to have to worry about maintaining project_1 once it’s been deployed (e.g. if `utils` version 0.2 breaks project_1, then project_1 requires `utils` version 0.1). For example:
project_1:
  run.py <- sources from /app/src/utils.py version 0.1
  Dockerfile
  requirements.txt
  ...

project_2:
  run.py <- requires /app/src/utils.py version 0.2 since utils.py was updated
  Dockerfile
  requirements.txt
  ...
h
I would recommend maintaining a single resolve (i.e. `requirements.txt`) for the entire repo and not trying to bifurcate for each project.
The structure you've proposed above looks a lot like what poetry tries to encourage. It's rather limiting in terms of what you can do, and sharing code across different resolves (i.e. different requirements files) becomes a challenge at the BUILD file level imo.
With that said, there's no reason you can't do `src/` and `projects/` as you've described!
It can be a great way to manage development of the repo: keep shared things in `src` and specialize/use them in `projects`.
c
I’m not sure I understand how one requirements.txt file would work. Project_1 requires python version 3.9.15 with a set of requirements, and project_2 requires python version 3.10.12 with a new set of requirements
h
well that's a different story
requirements.txt doesn't even need to come into play for that issue
c
what i’ve been doing is put a Dockerfile for each individual project that deviated from the base Dockerfile
h
I might not be familiar with enough of the settings, but I'm not sure pants lets you work on different portions of a repo with different interpreter versions.
So presumably you could put `__defaults__` in the right place at the top level of each project.
That's still not a deal breaker for having a single requirements file.
There are packages that could be valid for both 3.9 and 3.10 use
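As a rough illustration of the `__defaults__` idea above, a project-level BUILD file could set the resolve and interpreter constraints once for every target beneath it. This is a sketch, not a tested setup: the path, the resolve name `project_1`, and the constraint string are assumptions, and the resolve would also need to be declared under `[python.resolves]` in `pants.toml` with `enable_resolves = true`.

```python
# projects/project_1/BUILD  (hypothetical path)

# Apply these field values to every target in this directory and below
# that supports them; targets without these fields are left alone.
__defaults__(
    all=dict(
        resolve="project_1",                  # assumed resolve name
        interpreter_constraints=["==3.9.*"],  # assumed Python version for this project
    )
)

# Picks up the defaults above, so the fields are not repeated per target.
python_sources()
```

With something like this in place, `projects/project_2/BUILD` could carry its own `__defaults__` pointing at a different resolve and a 3.10 constraint.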
c
for sure a single requirements file would break my code
i have tried it because project_1 requires conflicting packages from project_2 and project_3
h
Then you're forced to use different resolves, and your challenge will come when you try to share source code from `src`.
You'll have to use the `parametrize` functionality to indicate that your source code can be used with any one of multiple resolves.
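A minimal sketch of what that could look like, assuming two resolves named `project_1` and `project_2` (hypothetical names) are declared in `pants.toml`. The shared code is parametrized over both resolves, and each project's targets name the single resolve they build against:

```python
# src/BUILD  (hypothetical path)
# Generates one variant of each shared source file per resolve.
python_sources(
    name="lib",
    resolve=parametrize("project_1", "project_2"),
)

# projects/project_1/BUILD  (hypothetical path)
python_sources(resolve="project_1")

pex_binary(
    name="project_1",
    entry_point="run.py",
    resolve="project_1",  # dependencies on src/ should use the matching variant
)
```

If the resolves also target different interpreter versions, the shared code's interpreter constraints would presumably need to be compatible with each of them.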
c
hmm ok thanks i’ll read through these
h
I'm not sure I follow the example in the original post. It kind of sounds like you're thinking about all of these things as separate entities, which isn't really the point of a monorepo.
One of the huge values of a monorepo is updating everything together. You're not using an older portion of your monorepo elsewhere in your monorepo. You're using what's tracked in source control all together. If you want to version portions of the monorepo, that sounds more like what a polyrepo setup is intended to enable.
And if you did want to do that, there's nothing stopping you from using pants in each smaller repo in your polyrepo setup. It just happens to be a common use case for monorepos because the problems pants solves become exacerbated in a monorepo setup.
c
I was exploring a monorepo primarily because of sharing code across projects. It’s a headache to maintain 5 different versions of the same class / function. Putting those in a shared `src` folder was my initial reason for diving into a monorepo.
The reason I asked about different versions is that I updated one of the functions in `src` and it broke project_1. I don’t want to have to worry about maintaining 50+ different projects once they’ve been deployed to production.
h
I was imagining that all the things in `src` would be one repo, `project_1` would be another repo, and `project_2` would be another still.
c
you can look at it that way, i just ended up putting projects as its own directory
(not inside src)
h
"I don’t want to have to worry about maintaining 50+ different projects"
This is why it kind of sounds like a monorepo isn't the right solution for you. Maintaining everything together is precisely one of the things a monorepo architecture aims for: it provides immediate visibility when you break compatibility with downstream consumers.
c
when i was reading into it I thought you can version control the source? Something like
python_binary(
    name='project_1',
    source='run.py',
    dependencies=[
        '//src:shared_utils==0.1',
        # Add other project-specific dependencies if any.
    ],
)
and then
python_binary(
    name='project_2',
    source='run.py',
    dependencies=[
        '//src:shared_utils==0.2',
        # Add other project-specific dependencies if any.
    ],
)
I thought a monorepo sounded like a great solution for what i’m looking for. I’m looking to consolidate my code into one repo. The separate projects will almost always source code from the root `src` directory. Once deployed to production, doesn’t pants store the version of each py library, sort of like git? If utils==0.1 is set, then shouldn’t it source the old version of utils?
h
I haven't seen anything like that. My experience is that you want what's currently in `src` to be compatible with what's currently in `projects/*`.
But perhaps Pants has some capability to do what you're suggesting that I'm not familiar with.
c
interesting, ok so let’s say I update `numpy` and now the function `np.bool` is no longer supported / throws an error. Say a line of code will error out in projects/project_1 because it calls `np.bool`. Will pants detect that?
or do i have to test the code myself some other way to find out that `np.bool` is no longer supported?
h
If you have unit tests with coverage in the areas where `np.bool` is called, then your `pants test` commands would fail.
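To make that concrete, here is a small self-contained sketch of why test coverage is what surfaces the breakage. It does not use numpy itself: `lib_old` and `lib_new` are hypothetical stand-ins for two releases of a library, one with a `bool` alias and one without, and `to_flags` stands in for project code that calls the alias.

```python
import types

# Stand-ins for two releases of a third-party library (hypothetical;
# they only mimic an alias that exists in one release and not the next).
lib_old = types.SimpleNamespace(bool=bool)
lib_new = types.SimpleNamespace()


def to_flags(values, lib):
    """Project code that depends on the alias `lib.bool`."""
    return [lib.bool(v) for v in values]


def check_to_flags(lib):
    """The kind of assertion a unit test for `to_flags` would make."""
    assert to_flags([0, 1, 2], lib) == [False, True, True]


def breaks_under(lib):
    """Return True if the check fails because the alias is missing."""
    try:
        check_to_flags(lib)
    except AttributeError:
        return True
    return False
```

Only because the check actually executes the line that touches the alias does the missing attribute show up; code paths with no test coverage would stay silent until they run in production.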
a
There is a bit of documentation about managing third-party dependencies with multiple resolves here: https://www.pantsbuild.org/docs/python-third-party-dependencies. If you do end up maintaining an individual resolve file for each of your projects, then you could, at least in theory, have some projects use a pinned version (that you push upstream on commit), and have other projects import from the codebase directly, always using the latest version. I'm not sure it would be pretty or easy to reason about, but I think it would be possible.
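One way this could be sketched in BUILD files, under several assumptions: the shared code is published as a package (here called `shared-utils`, a hypothetical name) to an index the projects can install from, and `project_1` is a resolve declared in `pants.toml`.

```python
# src/BUILD  (hypothetical path)
python_sources(name="lib")

# Packages the shared code as a versioned distribution that can be published.
python_distribution(
    name="dist",
    dependencies=[":lib"],
    provides=python_artifact(
        name="shared-utils",  # hypothetical package name
        version="0.2",
    ),
)

# projects/project_1/BUILD  (hypothetical path)
# project_1 consumes an older published release instead of the in-repo code.
python_requirement(
    name="shared-utils",
    requirements=["shared-utils==0.1"],
    resolve="project_1",
)
```

Projects that should track the latest code would simply import from `src` as first-party code. Because the same modules would then exist both in the repo and in the pinned package, dependency inference may need explicit `dependencies` to disambiguate.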
c
interesting, thinking maybe I can try something like this for packages not pushed to pip
dependencies = [
    # PEP 508 direct reference, pinned to a specific commit
    "utils @ git+http://<utils url>.git@commit",
]
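In Pants terms, a pin like this would usually be declared as a third-party requirement rather than listed directly in a target's `dependencies`. A hedged sketch, assuming the repository at that URL is an installable Python package and that `project_1` is a declared resolve (both assumptions):

```python
# projects/project_1/BUILD  (hypothetical path)
python_requirement(
    name="utils",
    # Direct reference to a VCS URL; <utils url> and <commit> are placeholders.
    requirements=["utils @ git+http://<utils url>.git@<commit>"],
    resolve="project_1",
)
```

After changing the pinned commit, the lockfile for that resolve would need to be regenerated.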