Hey Team First of all thanks for developing Pants ...
# general
b
Hey Team, first of all, thanks for developing Pants 🙌! We are moving to a monorepo and think Pants would solve quite a few of our issues, but I'm having a bit of a hard time making it work for our structure. I have the structure below:
-- pants.toml
-- helloworld1
  -- src
    -- functions
      -- BUILD
      -- plus.py
-- helloworld2
  -- src
    -- functions
      -- BUILD
      -- plus.py
-- helloworld3
  -- src
    -- functions
      -- BUILD
      -- plus.py
      -- multiply.py
    -- function_tests
      -- BUILD
      -- test_plus.py
In
helloworld3/src/function_tests/test_plus.py
I have
from src.functions.plus import foo
I need this import to be from the
helloworld3/src
path. When I run the test file test_plus.py I get this error:
ModuleNotFoundError: No module named 'src.functions'
I'm also getting warnings like:
The target helloworld3/src/function_tests/test_plus:tests imports helloworld3.src.functions.plus.foo, but Pants cannot safely infer a dependency because more than one target owns this module, so it is ambiguous which to use: ['helloworld3/src/functions/core/plus.py', 'helloworld2/src/functions/core/plus.py', 'helloworld1/src/functions/core/plus.py'].
I have a pants.toml in the root folder which has
[source]
root_patterns = [
  '/helloworld1/',
  '/helloworld2/',
  '/helloworld3/',
]
Is this the correct way of setting things up? What am I doing wrong? I think my understanding of Pants targets and sources hasn't 'clicked' yet; hopefully you can also help to make that right 🙂 Thanks in advance! Obviously I've replaced the real filenames with fake names. If that is not enough, please do shout and I'll look into creating a stripped-down example monorepo with this error on GitHub.
e
To start, I'm taking your names at face value, and there is a problem with them. I'll point it out, and you can tell me whether it is a real problem or just an artifact of renaming things for this report. Notice you have 3 projects that all provide the exact same Python module,
src.functions.plus
- that's a problem generally before even involving Pants. Does that naming duplication actually reflect what you're trying to achieve? Python does support namespace packages where a package can have sub-packages and modules added to it from many locations on disk (
src.functions
is one of those here), but there is no such thing as a namespace module. Languages like C# have partial classes that allow something like this, but Python does not.
The warnings you're getting are effectively warning you about what I said above.
But the problem is deeper than targets or Pants. It's a straight-up Python problem: your projects provide duplicate modules.
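A minimal sketch of the duplicate-module problem in plain Python, with no Pants involved. It assumes two throwaway roots in a temporary directory that mirror the layout from the question; the helper names `make_root` and `call_foo` are made up for illustration. Whichever root comes first on `sys.path` supplies `src.functions.plus`, and the other copy is silently ignored.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_root(base: Path, name: str, value: int) -> Path:
    """Create <base>/<name>/src/functions/plus.py whose foo() returns `value`."""
    root = base / name
    package_dir = root / "src" / "functions"
    package_dir.mkdir(parents=True)
    (package_dir / "plus.py").write_text(f"def foo():\n    return {value}\n")
    return root


def _forget_src() -> None:
    """Drop cached `src` modules so the next import is resolved from scratch."""
    for mod in [m for m in sys.modules if m == "src" or m.startswith("src.")]:
        del sys.modules[mod]


def call_foo(roots: list[Path]) -> int:
    """Put `roots` first on sys.path, import src.functions.plus, and call foo()."""
    saved_path = list(sys.path)
    _forget_src()
    sys.path[:0] = [str(root) for root in roots]
    importlib.invalidate_caches()
    try:
        return importlib.import_module("src.functions.plus").foo()
    finally:
        # Restore the interpreter state so the demo has no side effects.
        sys.path[:] = saved_path
        _forget_src()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        one = make_root(Path(tmp), "helloworld1", 1)
        three = make_root(Path(tmp), "helloworld3", 3)
        print(call_foo([one, three]))  # helloworld1's copy is used
        print(call_foo([three, one]))  # helloworld3's copy is used
```

The same import statement yields different code depending only on path order, which is why a tool cannot safely guess which `plus.py` is meant.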
b
Thank you for your reply. It does indeed reflect the actual structure. For more context, since these folders (helloworld1, helloworld2, helloworld3) were separate repos before, we had the same structure replicated across our repos so that we always consistently import from src everywhere. Are you suggesting that this breaks down when you move to a monorepo structure, even when helloworld3 doesn't use any code from helloworld2 and helloworld1? Can Pants not assume that the src being referred to is the parent src (and not another src from a different folder)? Perhaps I am misunderstanding the whole point of monorepos, in which case please point that out too 🙂
h
Yes, that is correct. In a monorepo, you can import code from anywhere in the entire repository. That might not make much intuitive sense at first, but it is key to the benefit of a monorepo: it is much easier to share internal code. https://blog.pantsbuild.org/the-monorepo-approach-to-code-management/
👍 1
b
Ah, so until we've removed that restriction of not being able to import from other folders, am I right in saying that we will likely not get much benefit from using Pants?
e
Well, the problem here has nothing really to do with imports; it's having the same module in multiple locations. Is
src/functions/core/plus.py
identical in all 3 locations?
If so, the common technique is to extract it to its own single top-level "common" project. Naming may vary.
Then all 3 projects depend on that.
If not, and they all have different contents:
+ If unintentional, that's bad, and Pants can help you here by more or less forcing you to factor that module out into a single shared space.
+ If intentional, that certainly will give Pants problems.
👍 1
h
You can have the same packages (i.e., directories) multiple times, by using namespace packages, but you cannot have the same module (src.functions.core.plus) multiple times. As John says, this isn't a Pants thing but a Python thing. Even if you have this in separate repos they can still collide with each other if ever loaded in the same interpreter. So regardless of monorepos and regardless of Pants, you probably want to think about careful namespacing...
👍 1
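A small sketch of the namespace-package behaviour described above, assuming two temporary roots with no `__init__.py` files anywhere (the PEP 420 case). The helper names and the `VALUE` constants are hypothetical. Both roots contribute to the same `src.functions` package, and this works because the module names (`plus` and `multiply`) differ.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_module(root: Path, dotted: str, body: str) -> None:
    """Write `body` as module `dotted` under `root`, creating no __init__.py."""
    path = root.joinpath(*dotted.split(".")).with_suffix(".py")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body)


def _forget_src() -> None:
    """Drop cached `src` modules so imports are resolved from scratch."""
    for mod in [m for m in sys.modules if m == "src" or m.startswith("src.")]:
        del sys.modules[mod]


def import_values(roots: list[Path], names: list[str]) -> dict[str, str]:
    """Import each module in `names` with `roots` on sys.path; return its VALUE."""
    saved_path = list(sys.path)
    _forget_src()
    sys.path[:0] = [str(root) for root in roots]
    importlib.invalidate_caches()
    try:
        return {name: importlib.import_module(name).VALUE for name in names}
    finally:
        sys.path[:] = saved_path
        _forget_src()


def demo() -> dict[str, str]:
    """Build two roots that share the package src.functions but not a module."""
    with tempfile.TemporaryDirectory() as tmp:
        root_a = Path(tmp) / "helloworld1"
        root_b = Path(tmp) / "helloworld3"
        write_module(root_a, "src.functions.plus", "VALUE = 'from helloworld1'\n")
        write_module(root_b, "src.functions.multiply", "VALUE = 'from helloworld3'\n")
        return import_values(
            [root_a, root_b], ["src.functions.plus", "src.functions.multiply"]
        )


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, "->", value)
```

If both roots instead provided `src/functions/plus.py`, only the first one on the path would ever be imported.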
r
So I ran into this issue when having a
tests
package in multiple Python packages inside a monorepo. I have tests at the same level of the hierarchy for 3 packages, and inside the tests folder I have some test resources which I am importing like
from tests.testdata import some_data
It's not necessarily the case that
testdata
exists inside every
tests
folder of these 3 packages, though. What would be the Pants way to handle this? I am having weird issues where Pants can't find some packages etc., but I'm not sure whether this is connected.
h
It sounds like you have a naming collision, with the top-level
tests
package defined in multiple source roots?
💯 1
So you probably want the conflicting packages to be "namespace packages" (https://peps.python.org/pep-0420/) and you need to make sure the module names (
some_data
in your example) don't conflict
r
By the way, how does Pants infer which
tests
I am talking about? Yeah, every
source_root
has its own
tests
directory
h
Pants infers at the file/module level
so as long as the full module path
tests.testdata.some_data
is unique, you should be OK
👍 1
if it's not unique, you have a Python problem, not just a Pants problem
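The uniqueness rule above can be approximated with a short script. This is only a rough sketch and not how Pants itself computes ownership: it assumes you pass in the source root directories yourself, and the function names and the `pkg_a`/`pkg_b`/`pkg_c` layout are made up for illustration. It maps every `.py` file to its dotted module path relative to its source root and reports any path claimed by more than one file.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def module_name(source_root: Path, py_file: Path) -> str:
    """Return the dotted module path of `py_file` relative to `source_root`."""
    parts = list(py_file.relative_to(source_root).with_suffix("").parts)
    if parts and parts[-1] == "__init__":
        parts.pop()  # a package's __init__.py is imported as the package itself
    return ".".join(parts)


def find_duplicate_modules(source_roots: list[Path]) -> dict[str, list[Path]]:
    """Map each module path owned by more than one file to the owning files."""
    owners: dict[str, list[Path]] = defaultdict(list)
    for root in source_roots:
        for py_file in sorted(root.rglob("*.py")):
            name = module_name(root, py_file)
            if name:
                owners[name].append(py_file)
    return {name: files for name, files in owners.items() if len(files) > 1}


def demo() -> list[str]:
    """Three hypothetical packages; two of them define tests.testdata.some_data."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = [Path(tmp) / name for name in ("pkg_a", "pkg_b", "pkg_c")]
        layout = {
            roots[0]: ["tests/testdata/some_data.py", "tests/test_a.py"],
            roots[1]: ["tests/testdata/some_data.py", "tests/test_b.py"],
            roots[2]: ["tests/test_c.py"],
        }
        for root, files in layout.items():
            for rel in files:
                target = root / rel
                target.parent.mkdir(parents=True, exist_ok=True)
                target.write_text("")
        return sorted(find_duplicate_modules(roots))


if __name__ == "__main__":
    print(demo())
```

Sharing the `tests` and `tests.testdata` directories across roots is not flagged; only the repeated `some_data` module is.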