I'll try to explain what I expect to do with Pants...
# general
p
I'll try to explain what I expect to do with Pants, as I am having some issues getting started (but maybe my approach is the issue). We are moving a bunch of Python (and Cython) projects to a monorepo structure. I read some articles about different approaches, and the one we want to go for involves having multiple "project roots" in subfolders. Each project root has its own pyproject.toml, poetry.lock and set of dependencies. Does Pants support this setup? I couldn't find any examples or documentation to answer this question, maybe I'm just blind though 😄. I tried this very quickly, so maybe I'm missing some obvious stuff,
./pants dependencies projects/example
doesn't return any dependency, I tried both
poetry_requirements
and
python_requirements
. If Pants can indeed support this setup, do you have any examples of open source monorepos?
i.e. this example repo has the structure we'd like to use, all goals and targets managed with pants in the repo root
w
hey @plain-fireman-49959! welcome.
the https://www.pantsbuild.org/docs/source-roots page explains a few layouts: in this case, it sounds like you’d be going with the “Multiple top-level projects” setup.
what you want to see is
./pants roots
reporting each of your project directories, likely by configuring marker files in this case (as in the “Multiple top-level projects” example)
p
Yes, I did run
./pants roots
and I get all the roots I expect (some are "double nested" like
one.two.project_1
,
one.two.project_2
etc)
w
hm… not sure what you mean by double nested. but the longest root will match: so if you see an inaccurate nested root, you’d want to adjust your markers
once you get the roots right, you should see source dependencies at the very least reported by
./pants dependencies
. for the purposes of smoketesting though, i’d recommend running
./pants dependencies ${some_python_file}
, as then you can compare the dependencies to the imports more easily
once source dependencies look right, can revisit thirdparty dependencies
p
ok thank you. the roots look right to me, I'll set up a test project so I don't leak company stuff and I'll try this
👍 2
Ok, I set up a quick project. These are the folder structure, the a.py file and
./pants deps
output, I should see that a.py depends on
one.b
correct? I ran
./pants tailor
before checking for deps
w
what does
./pants roots
report in this case?
p
I opened the root in vscode and it resolves
one.b
as I'd like it to.
.
one/a
one/b
one/c
w
all of the roots below
one
are problematic
if you want to import
one.a
, etc, you’d want only
.
in that list
p
the a, b and c folders are project roots, each one has a pyproject.toml etc
w
because the effect of
one/a
is that
one/a
will be stripped from the import path
leaving only
import A
p
I had
marker_filenames = ["pyproject.toml"]
in
pants.toml
(removed it for this screenshot)
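For context, here is a rough sketch of why a `pyproject.toml` marker in every project folder produces roots like `one/a`, `one/b` and `one/c`. It is not Pants's implementation: `find_marker_roots`, `demo_roots` and the temporary layout are made up for illustration, and only the `marker_filenames` idea comes from the thread.

```python
import os
import tempfile


def find_marker_roots(repo_root, marker_filenames):
    """Return repo-relative directories that contain any of the marker files."""
    roots = []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        if any(name in filenames for name in marker_filenames):
            rel = os.path.relpath(dirpath, repo_root)
            roots.append(rel.replace(os.sep, "/"))
    return sorted(roots)


def demo_roots():
    """Build a throwaway repo (hypothetical layout) with a marker per project."""
    with tempfile.TemporaryDirectory() as repo:
        for project in ("one/a", "one/b", "one/c"):
            os.makedirs(os.path.join(repo, project))
            # An empty pyproject.toml is enough to act as a marker file here.
            open(os.path.join(repo, project, "pyproject.toml"), "w").close()
        return find_marker_roots(repo, ["pyproject.toml"])


print(demo_roots())  # ['one/a', 'one/b', 'one/c']
```

Every directory holding a marker becomes its own root, which is how the nested roots in the `./pants roots` output came about.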
w
@plain-fireman-49959: i think that maybe you’re halfway between “Multiple top-level projects” and the traditional monorepo
src/<lang>
layout
the former has lots of sourceroots, and strips the project directory off the front of the import path… the latter has maybe one sourceroot, and strips that off.
@plain-fireman-49959: does that make sense? if you want
from one.a import A
, then you don’t want
one/a
to be a sourceroot, as that will be stripped.
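A minimal sketch of the stripping rule described here, assuming the longest matching root wins as mentioned earlier in the thread. `module_for` is a hypothetical helper, not a Pants API, and `one/b/lib.py` is an assumed file path used only for illustration.

```python
def module_for(path, source_roots):
    """Strip the longest matching source root from `path` and return a dotted module name."""
    best = ""  # no prefix stripped, which is what a root of "." amounts to
    for root in source_roots:
        prefix = "" if root == "." else root.rstrip("/") + "/"
        if path.startswith(prefix) and len(prefix) >= len(best):
            best = prefix
    relative = path[len(best):]
    if relative.endswith(".py"):
        relative = relative[:-3]
    return relative.replace("/", ".")


# With the roots reported in this thread, the project directory is stripped:
print(module_for("one/b/lib.py", [".", "one/a", "one/b", "one/c"]))  # lib

# With only the repo root as a source root, the org namespace survives:
print(module_for("one/b/lib.py", ["."]))  # one.b.lib
```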
p
oh ok
and can
a
be a project with its own pyproject and set of deps?
The main reason is that I'd like to maintain our org namespace, as we have custom modules with generic names like "logging"
w
and can 
a
 be a project with its own pyproject and set of deps?
mostly: when dependencies overlap between projects, Pants will report them as “ambiguous”, and you’ll need to declare them in
BUILD
files. so it will work, but IMO it’s more of a transitional state on the way to having fewer sets of thirdparty deps declared, and only when different parts of the repository need different deps
(…and by dependencies here, i mean either “firstparty declared packages/symbols” or thirdparty dependencies)
for your logging case: if two projects declare a
logging
module, that will also be reported as ambiguity, and so those projects will need to explicitly declare those dependencies
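To make the ambiguity concrete, here is a small sketch of the idea, not of how Pants does it internally. `ambiguous_modules` and the file paths are hypothetical; the point is only that one module name with two providers cannot be resolved from the import alone.

```python
from collections import defaultdict


def ambiguous_modules(provided):
    """Given (module_name, file_path) pairs, return the modules with more than one provider."""
    providers = defaultdict(list)
    for module, path in provided:
        providers[module].append(path)
    return {module: sorted(paths) for module, paths in providers.items() if len(paths) > 1}


# Hypothetical layout: two projects each ship a top-level `logging` module.
provided = [
    ("logging", "one/a/logging.py"),
    ("logging", "one/b/logging.py"),
    ("lib", "one/b/lib.py"),
]
print(ambiguous_modules(provided))
# {'logging': ['one/a/logging.py', 'one/b/logging.py']}
```

An `import logging` in this layout could refer to either file, which is why the dependency would have to be spelled out in a `BUILD` file.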
p
ok I think I am somewhat confused: I imagine that by discovering dependencies by looking at imports as pants does, I can have a "root" pyproject and run goals for each subproject by "downloading" only its set of dependencies, I can package and distribute them independently with only their dependencies. The main problem I'm having though is that we mostly run all these projects from one monolithic entry point/service. With the multi-repo setup we currently have, we have to deploy new versions of the monolith every time we release a dependency. I know this can be managed as a huge standalone project, but we like the idea of running specific developer workflows quickly for each subproject and having the option of releasing independent versions if we need to use them in random microservices
I'll try to understand my use case a bit better and try some different structures before I go ahead and propose the changes, thank you very much for your help @witty-crayon-22786, everything is now clearer!
w
I imagine that by discovering dependencies by looking at imports as pants does, I can have a “root” pyproject and run goals for each subproject by “downloading” only its set of dependencies, I can package and distribute them independently with only their dependencies. The main problem I’m having though is that we mostly run all these projects from one monolithic entry point/service.
yea, all of this is correct. you will be able to run tests using only the relevant subset of the dependencies, and only when those dependencies have changed. it’s fine to then have a smaller number of
pex_binary
targets: i.e., things that are actually deployed
so
./pants test one/a/some/specific/file.py
will run with the minimum dependencies of that file, while packaging your monolithic deploy binary will pull everything in.
but note that one of the advantages is that you mostly don’t need to package/distribute your libraries: if your primary goal is to consume them in the monolith, then you don’t need to publish them anywhere. the
pex_binary
for your monolith will depend on them, and always be up to date
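The difference between testing one file and packaging the monolith can be pictured as a walk over a dependency graph. This is a sketch under assumptions: the graph is hand-written, `monolith/main.py`, `one/b/lib.py`, `requests` and `numpy` are placeholder names, and `transitive_deps` is not a Pants function.

```python
def transitive_deps(target, graph):
    """Return every dependency reachable from `target` in `graph`."""
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


# Hypothetical dependency graph: file -> things it imports.
graph = {
    "one/a/some/specific/file.py": ["requests"],
    "one/b/lib.py": ["numpy"],
    "monolith/main.py": ["one/a/some/specific/file.py", "one/b/lib.py"],
}

# Testing a single file only needs that file's own closure.
print(sorted(transitive_deps("one/a/some/specific/file.py", graph)))  # ['requests']

# The deployable entry point pulls in everything it reaches.
print(sorted(transitive_deps("monolith/main.py", graph)))
```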
p
yes packaging and distributing would be a small and infrequent use case for us!
I'm more and more convinced that this is the correct approach for us, I just need to find a project structure that makes sense for us in terms of splitting code/concerns up. again thank you!
w
sure thing!
h
To follow up on Stu's comments about third-party deps and multiple
pyproject.toml
, check out https://www.pantsbuild.org/docs/python-third-party-dependencies for Pants's conceptual model of third-party dependencies and how it creates a single "universe" of dependencies that any file can pull from
w
mm, i also realized that i missed that your layout was
one/a/a/a.py
… if you want to import as
from a.a import A
, then you would need a source root at
one/a
(
one/a/a/a.py
has the source root chopped off to get
a/a.py
, which would be imported as
from a import a
or
from a.a import A
). but anyway: just think of the source roots as the prefix to chop off before calculating the import path.
👍 1
p
@hundreds-father-404 yes I had a good look at that part of the docs, I now understand why you would want to define dependencies "globally", thanks!
❤️ 1
just think of the source roots as the prefix to chop off before calculating the import path.
This explains it perfectly! So if I want to keep the org's namespace I could do something like
src/org/a/main.py
src/org/b/lib.py
,
src
would be a source root and I would be able to
from org.b.lib import lib_b
. There will be one
pyproject.toml
at root level and each target (
a
and
b
) can have
BUILD
files for testing, packaging, linting, etc. I think it makes sense, I'll try this setup today, thank you all!
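As a quick sanity check of this layout in plain Python, the sketch below treats `src` like a `sys.path` entry, which is a simplification of what a source root means. The body of `lib_b` and the helper `import_from_source_root` are invented for illustration; only the `src/org/b/lib.py` path and the `from org.b.lib import lib_b` import come from the message above.

```python
import importlib
import os
import sys
import tempfile


def import_from_source_root():
    """Create src/org/b/lib.py in a temp repo and import it with `src` as the root."""
    with tempfile.TemporaryDirectory() as repo:
        pkg_dir = os.path.join(repo, "src", "org", "b")
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "lib.py"), "w") as f:
            # Placeholder implementation, assumed for this example.
            f.write("def lib_b():\n    return 'hello from org.b.lib'\n")

        # Putting `src` on sys.path plays the role of the source root here.
        sys.path.insert(0, os.path.join(repo, "src"))
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("org.b.lib")
            return module.lib_b()
        finally:
            # Clean up so the temporary package does not leak into the session.
            sys.path.pop(0)
            for name in ("org.b.lib", "org.b", "org"):
                sys.modules.pop(name, None)


print(import_from_source_root())  # hello from org.b.lib
```

The import path keeps the `org` prefix because only `src` is chopped off, so a module named `logging` under `org` would not collide with the standard library one.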
👍 1
❤️ 1
h
Great! If you haven't already, we recommend running ./pants tailor to generate those BUILD files
👍 1