# general
l
Hi everyone 🙂 love the tool and am looking to migrate our current Python projects to a monorepo structure. I'm looking for good example repositories and ideas on "best practices" when designing a monorepo (esp. with `pantsbuild` in mind). I found a couple of example repos, e.g. https://github.com/pantsbuild/example-python and https://github.com/pmatev/demo-python-monorepo
But I have a couple of other questions that I'll ask here. If someone can advise, that'd be awesome 🙂
Is it ok to have inter-project source code dependencies, or is it better to extract all shared source code to some `/lib/` projects?
Is it better to keep projects in `/root/projects/` or just lay them all out in `/root/`?
How well does PyCharm work with a monorepo project structure? Is it simple enough to resolve import paths, virtualenvs per project, etc.?
f
Is it ok to have inter-project source code dependencies, or is it better to extract all shared source code to some `/lib/` projects?
It's gonna be up to you and your team(s) to decide what works best. If you're primarily building mostly unrelated applications, then putting shared code in `/lib` could make sense, but you could also choose other ways to go about it, depending on the character of the code you want to manage here and how you want your concept of code ownership or stewardship to work. I used a tree like:
/$company/algorithms/...  # Classic CV algorithms and wrappers
/$company/applications/...  # CLI applications
/$company/backends/... # Backends for externally provided features
/$company/blueprints/...  # Microservice blueprints for shared routes
/$company/launchers/...  # Code to help applications launch in containers and dev environments
/$company/models/...  # CV models and associated code
/$company/services/... # Named microservices
/$company/tools/...  # tools used to work with code
/$company/types/...  # Shared type names for communicating across layers
We didn't have any particular dependency graph enforcement in place, but typically `services`, `applications`, and `tools` only depended on the other parts, which were mostly independent-ish library elements, and `types` was at the bottom, providing some common vocabulary and functions to encourage a more uniform way of communicating data between components. It was set up this way to encourage greater code re-use. So I guess rather than marking shared code as `lib`, we assumed library code and explicitly marked app/service code. Another way to divide a monorepo is project-first, which just puts each project in its own directory. It's also a valid choice. All this is going to come down to what you think will work best for your team and environment, and what kinds of behaviors you want to encourage.
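If you did want to enforce a layering like the one described above, a minimal sketch could look like the following. Everything here is an assumption for illustration: the `company.*` module names, the `LAYER_RANK` numbers (only "`types` at the bottom" and "`services`/`applications`/`tools` on top" come from the description; the middle ranks are a guess), and the idea that you already have a module-to-dependencies mapping from some tool.

```python
from typing import Dict, List, Tuple

# Hypothetical ranks: a package may depend on its own layer or a lower one.
LAYER_RANK: Dict[str, int] = {
    "types": 0,
    "algorithms": 1, "backends": 1, "blueprints": 1, "launchers": 1, "models": 1,
    "services": 2, "applications": 2, "tools": 2,
}


def top_level(module: str) -> str:
    """Map 'company.services.api' to 'services' (second dotted component)."""
    parts = module.split(".")
    return parts[1] if len(parts) > 1 else parts[0]


def layer_violations(deps: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (source, dependency) pairs where a lower layer depends on a higher one."""
    violations = []
    for src, targets in deps.items():
        for dst in targets:
            src_rank = LAYER_RANK.get(top_level(src))
            dst_rank = LAYER_RANK.get(top_level(dst))
            # Skip anything outside the known layers (e.g. third-party packages).
            if src_rank is None or dst_rank is None:
                continue
            if dst_rank > src_rank:
                violations.append((src, dst))
    return violations


# Hypothetical input: module -> modules it imports.
example_deps = {
    "company.services.api": ["company.models.detector", "company.types.image"],
    "company.types.image": ["company.services.api"],  # points "upward"
}
print(layer_violations(example_deps))
```

A check like this could run in CI and fail the build when the list is non-empty; whether same-layer dependencies should be allowed is a team decision.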
h
Welcome, and glad you're liking Pants! I completely agree with Josh on this. I think to promote code re-use it's often good to assume that all code is "lib" code unless it's under an `apps/` directory or something similar.
You do want to try and avoid circular dependencies though. Pants will work with them, but they are not ideal.
f
There is probably a way to script up some cycle detection if you want to enforce avoiding them.
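As a rough sketch of what such a script might look like, here is a depth-first search that reports one cycle in a dependency graph. This is not a Pants API; it assumes you have already collected the graph into a plain dict (for example by parsing the output of a dependency-listing tool), and the target names in the example are made up.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we closed a loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical graph: target -> targets it depends on.
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}
acyclic = {"a": ["b", "c"], "b": ["c"], "c": []}
print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

The recursive form is fine for a sketch; a very deep graph would need an explicit stack to stay under Python's recursion limit.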
l
Guys, thank you so much for the advice! @flat-zoo-31952 that perfectly answered my question
We were considering going project-first like you said (since we currently have multirepo, it would make migrating easier). The team are concerned about versioning, e.g. if a project needs some urgent update which ends up breaking some other project in the process. At the moment we can version stuff and deploy to PyPI. It's another element where I don't understand best practice (keep it on a separate branch? In some `v1`/`v2` subdirectory?)
f
The team are concerned about versioning e.g. if a project needs some urgent update which ends up breaking some other project in the process.  At the moment we can version stuff and deploy to pypi.  It's another element where I don't understand best practice (keep it on a separate branch?  In some `v1`/`v2` subdirectory?)
The short answer is to avoid this situation as best you can. Monorepos are really at their best when they work in service of maintaining a single source of truth in a single place. If you start long-term maintenance of branches, then you're just moving multirepo problems into a single place. Consider gating features with flags rather than branch-based solutions. Namespace solutions like `v1`/`v2` are best used when you need to rewrite a whole feature or library in a way that violates fundamental assumptions built into the previous version. I'd say you can use branching for ephemeral hotfix-kinda stuff or as part of a release cycle plan (as Pants itself does), but you don't want to get into a situation where you're maintaining some forked version of your code long-term, as this will come around to bite you.
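To make the flag idea concrete, here is a minimal sketch of gating a behavior change in shared code behind an environment variable, so both behaviors live on the same branch. The `FEATURE_` prefix, the flag name, and the `normalize_name` function are all hypothetical; a real setup might read flags from a config service instead.

```python
import os


def flag_enabled(name: str, default: bool = False) -> bool:
    """Read a boolean feature flag from the environment (FEATURE_<NAME>)."""
    raw = os.environ.get(f"FEATURE_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def normalize_name(name: str) -> str:
    """Shared library function whose new behavior is opt-in per deployment."""
    cleaned = name.strip()
    if flag_enabled("lowercase_names"):
        # New behavior: only projects that set the flag are affected.
        return cleaned.lower()
    # Existing behavior stays the default for everyone else.
    return cleaned
```

A project that needs the urgent change sets `FEATURE_LOWERCASE_NAMES=1` in its own deployment; other projects keep the old behavior until they opt in, and the flag can be deleted once everyone has migrated.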