# general
s
Hi! Does anyone here have a good structure/pattern for a Python + Docker + GCP monorepo architecture!? Curious to see how others have done this 🙂
a
following.
h
I don't think we have a public example. Those monorepos tend to be private. But it makes sense to create one.
What kinds of questions would it help answer?
s
How the build generalizes when adding more python services. For example, I'd like to create a monorepo with the structure:
```
monorepo/services/python/service1
monorepo/services/python/service2
monorepo/services/python/service3
monorepo/services/python/service4
monorepo/services/python/service5
```
Each service would have independent requirements, and maybe some shared code (like a database client class: how would that work, where would it live, etc.?). Lastly, I'd like to write tests, dockerize the services, and run a docker compose at the top level to spin up all services at once. I think an example repo that achieves those goals, or at least shows the file structure, would be a HUGE win for any new developer using Pants!
w
the nesting in there is a little unclear (i’ll go edit), but it describes a few different layouts.
s
Yes, I still have a couple of questions:
1. What if there are multiple projects under python? Does everything stay the same otherwise?
2. Do I just stick a Dockerfile in each of the projects as if it were a single repo all by itself?
w
@shy-match-55049: re: 1) are you referring to a particular pattern from that page?
re: 2) yes: your Dockerfile is probably associated with a particular `pex_binary`
s
I was thinking the `src/<lang>` setup pattern
w
the relevant bit of the `src/<lang>` pattern is the `src` token, which you don’t have in your example above
h
One way to do this is to have a single source root (`src/python` if you think you might have other languages in the future, or just `src/`, or even the repo root itself if not). Then you can have all the code live under `<srcroot>/myrootpkg/`, with shared code in, say, `<srcroot>/myrootpkg/util`, `<srcroot>/myrootpkg/db`, and so on.
And then the services live in `<srcroot>/myrootpkg/service1`, `<srcroot>/myrootpkg/service2`, etc.
Each of those could contain the service-specific code and the BUILD file stuff necessary to build and deploy them.
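A sketch of that layout (the package and service names are hypothetical, just to make it concrete):
```
monorepo/
├── pants.toml
├── requirements.txt
└── src/
    └── python/
        └── myrootpkg/
            ├── util/        # shared helpers
            ├── db/          # shared database client
            ├── service1/
            │   ├── BUILD    # pex_binary target for this service
            │   ├── Dockerfile
            │   └── main.py
            └── service2/
                ├── BUILD
                ├── Dockerfile
                └── main.py
```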
For tests, I like to have tests live in the same directory as the code they test, so `<srcroot>/myrootpkg/util/foo_test.py` next to `<srcroot>/myrootpkg/util/foo.py`, and so on
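To make the colocated-test idea concrete, here's a minimal sketch (the module and function names are hypothetical): the code file and the test file that would sit right next to it, shown together as one runnable snippet.

```python
# src/python/myrootpkg/util/foo.py (hypothetical module)
def normalize(name: str) -> str:
    """Lowercase and strip a raw service name."""
    return name.strip().lower()


# src/python/myrootpkg/util/foo_test.py lives in the same directory,
# so the tests sit right next to the code they cover.
def test_normalize():
    assert normalize("  Service1 ") == "service1"


test_normalize()
```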
Pants will do the right thing
s
I see! I'll tinker around and come back with questions 🙂
Super cool, thanks
h
but you can also have a parallel test tree if you prefer
s
Is that not difficult with Python, even for the IDE to recognize? The parallel tree, I mean
h
Other tools push you towards a parallel tree because they don't know how to exclude the tests from the package when you package and deploy
but Pants is good about that sort of thing
👍 1
BTW what @witty-crayon-22786 was referring to was that Pants automatically recognizes that `src/<lang>` is a source root, just as a convention. You can omit the `src/` part, but then you have to manually tell Pants about the source root.
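If you do drop the `src/` prefix, telling Pants about the source root is a small config change. A hedged sketch for the layout from the original question (check the source-roots docs for your Pants version):
```toml
# pants.toml
[source]
# Treat this directory as a source root instead of relying on
# the src/<lang> convention.
root_patterns = ["/services/python"]
```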
And re: external dependencies, we encourage a single `requirements.txt` for the repo, since Pants knows how to take just the subset of deps you actually need for a given binary, and having a single one ensures that you don't end up with version conflicts
s
Oh, interesting. I was trying to see if I could do multiple requirements, but that makes a lot of sense
h
So the `requirements.txt` represents the universe from which dependencies are chosen as needed, but only the actually needed ones are used.
So there's no need to manually worry about that
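In Pants 2.x that usually means one `requirements.txt` plus a `python_requirements()` target generator in a nearby BUILD file. A sketch (the path is illustrative; check the docs for your Pants version):
```python
# 3rdparty/python/BUILD
# Generates one requirement target per line of requirements.txt;
# each binary then pulls in only the subset it transitively imports.
python_requirements(source="requirements.txt")
```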
s
Amazing
h
You can have multiple `requirements.txt` files, even conflicting ones
But then there is more manual work in avoiding conflicts
s
In terms of building Dockerfiles then, would it be better to keep them separate from the actual service roots, since they need to read the `requirements.txt`? How would that work in a clean, organized way?
a
afaik, the Dockerfile doesn't need to see the `requirements.txt`. Pants builds the `.pex` file, and then you bung that into a Dockerfile with `ENTRYPOINT ["foo.pex"]`
👍 2
w
Pants uses PEX by default, which is a self-contained python artifact (essentially a relocatable virtualenv)
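A hedged sketch of that flow from the CLI (the target address and the exact `dist/` output path are illustrative; they depend on your BUILD files and Pants version):
```
# Build a self-contained PEX for one service; output lands under dist/
./pants package src/python/myrootpkg/service1:bin

# Run it like any executable (needs only a compatible Python on the host)
./dist/src.python.myrootpkg.service1/bin.pex
```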
s
Oh, interesting. How do I know where it will output? Is it just in the directory with the code itself?
a
in fact, the whole docker thing becomes a bit of a joke, because it's probably just
```
FROM python:3.8
COPY artifact.pex /bin/
ENTRYPOINT ["/bin/artifact.pex"]
CMD ["--run-stuff"]
```
💯 1
w
@shy-match-55049: it goes into `dist`: see https://www.pantsbuild.org/docs/python-package-goal
s
I see. Still having a little bit of trouble seeing how this generalizes to multiple docker images, for example I have:
```
monorepo/services/python/service1/Dockerfile
monorepo/services/python/service2/Dockerfile
monorepo/services/python/service3/Dockerfile
monorepo/services/python/service4/Dockerfile
monorepo/3rdparty/python/requirements.txt
```
Would each of those `Dockerfile`s just be
```
FROM python:3.9.6-slim
COPY service{n}.pex /bin/
ENTRYPOINT ["/bin/service{n}.pex"]
CMD ["--run-stuff"]
```
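And the top-level `docker compose up` from the original question could then just build each of those images. A sketch (hypothetical paths; it assumes you've already run `./pants package` and copied each `service{n}.pex` next to its Dockerfile so the `COPY` can find it):
```yaml
# monorepo/docker-compose.yml
services:
  service1:
    build: services/python/service1
  service2:
    build: services/python/service2
  service3:
    build: services/python/service3
```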
w
correct
s
That's pretty neat
a
now you just need to write a plugin that takes `pants containers` and runs `docker build`
😅 2
(which is on my list of 'maybe I should do that', but I'm still trying to find time to work out how to make `pants publish-packages` run twine)
w
we have a very venerable ticket for docker support: https://github.com/pantsbuild/pants/issues/2648
@ambitious-actor-36781: it sounds like you might want something a little different: perhaps because your host environment is similar enough to the docker environment?
a
Unsure. It looks like that ticket would be relevant. There are two (three?) different things:
1. a way to publish the output of `pants package` to a repository (currently working on this)
2. a way to build Pants targets into a docker image (e.g. `docker_image(name='a_service', dockerfile='src/python/a_service/Dockerfile', dependencies=[':a_service_binary'])`)
3. (?) take the output of #2 and run it through #1 so it pushes the images to a container registry.
w
yea, absolutely.
the complexity of 2 depends on whether your build environment is similar to your deploy environment: if it isn’t, then the entire build (rather than just the final `pex` construction) might need to take place inside the docker container.
but yea, those three points are all things that we’d like to support on HEAD
@ambitious-actor-36781: maybe worth a new thread for further discussion of the docker aspect.
a
you've got to say "not our problem" at some point, right? You could just say "ok... we've put a `pants` image on the Docker registry, and you can just do your entire build inside there"
w
true!
yea, lots of potential angles to what “supporting docker” means.
s
Found a docker integration
a
That'll be for 1.x
h
AFAIK there are no public 2.x plugins for docker, but several private ones
we'd like to have docker support in the Pants core, as Stu mentions. The main issue has always been that it means different things to different people, and so it's not clear what to implement.
c
Specifying a base image and the executable target to run in it?!
w
@clean-night-52582: pants runs a lot of executables. so running only the root process (e.g. the one that creates the pex) is not likely to be sufficient: the resolve(s) would happen as a separate step. see the most recent comment on that ticket