Hi, I have encountered mypy error. ```Duplicate mo...
# general
q
Hi, I have encountered a mypy error:
Duplicate module named "src"
Let me share my situation in the thread.
File structure
- projects/
|- prj1/
  |- src/__init__.py
  |- BUILD
|- prj2/
  |- src/__init__.py
  |- BUILD
- pants.toml
pants.toml (part)
[GLOBAL]
pants_version = "2.9.0"

[source]
root_patterns = [
  '/projects/*',
]

[mypy]
version = "mypy==0.910"
pyproject.toml (part)
[tool.mypy]
python_version = "3.9"
ignore_missing_imports = true
disallow_untyped_defs = true
pants roots
projects/prj1
projects/prj2
BUILD
python_sources(
    name="pkg",
    sources=["src/**/*.py"],
)
h
Hello! What are your Python imports like? If you want to import `projects/prj1/src/app.py`, what do you use with the `import` statement?
q
Hello! Our design is a monorepo, and the projects never share code with each other.
Inside a project, we import src as follows:
import src.hoge.hige
In a child directory (e.g. src/hoge):
from .fuga import Fuga
Did I answer your question?
h
Yes, thank you! How are you running MyPy currently?
Also, are your `__init__.py` files empty?
q
If I specify the package directly:
./pants check projects/prj1:
it works without error, but if I run:
./pants check ::
the error I mentioned above is raised.
Also are your `__init__.py` files empty?
Yes, they are empty. They contain no imports.
e
That's the problem. You have multiple instances of the `src` package at different `sys.path` entries (source roots). In order to do that in Python in general you need a namespace package. There are 3 ways to create one:
1. If you're Python 3 only, just remove the empty `__init__.py` files.
2. Declare a namespace package in the `__init__.py` files using `pkgutil` from the Python standard library.
3. Declare a namespace package in the `__init__.py` files using `pkg_resources` from the `setuptools` project.
For details on 1 see https://www.python.org/dev/peps/pep-0420
For examples of 2 & 3 see: https://www.python.org/dev/peps/pep-0420/#namespace-packages-today
With one of those fixes in place, you then need to tell MyPy that you're using namespace packages. Pants itself does, and the critical configuration is highlighted here: https://github.com/pantsbuild/pants/blob/2d9e7b5e0cfd5b288d5ed8fd8388574171f23d9c/pyproject.toml#L29-L32
q
Thanks, John, for sharing the links. This error is a common mypy problem when mypy is run from the top-level directory. Since Pants is described as usable for monorepos, I expected mypy to be executed individually in each source root when I run `./pants check ::`. Thanks to your answer, I now understand that Pants runs mypy from the parent repository.
e
@quiet-painter-18838 I will point out that having a monorepo structure is orthogonal to having duplicated packages. Whether in a monorepo or in many repos, if you have the same named package living at multiple paths, you really need to make it a namespace package. Python and the Python ecosystem expect this.
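For cases where the `__init__.py` files have to stay, here is a minimal sketch of the `pkgutil`-style declaration mentioned earlier in the thread, using the two-line form shown in PEP 420. It is an illustration only: the layout mirrors the prj1/prj2 example, the module names `app1` and `app2` are hypothetical, and nothing here involves Pants or MyPy.

```python
from __future__ import annotations

import importlib
import sys
import tempfile
from pathlib import Path

# Contents written to every src/__init__.py: the pkgutil-style declaration.
PKGUTIL_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def pkgutil_namespace_demo() -> list[str]:
    """Import src.app1 and src.app2 from two separate source roots."""
    with tempfile.TemporaryDirectory() as tmp:
        roots = []
        # app1 / app2 are hypothetical module names for this sketch.
        for project, module in (("prj1", "app1"), ("prj2", "app2")):
            pkg = Path(tmp) / project / "src"
            pkg.mkdir(parents=True)
            (pkg / "__init__.py").write_text(PKGUTIL_INIT)
            (pkg / f"{module}.py").write_text(f"NAME = {project!r}\n")
            roots.append(str(pkg.parent))
        saved_path = list(sys.path)
        sys.path[:0] = roots
        importlib.invalidate_caches()
        try:
            # extend_path adds every src/ directory found on sys.path to
            # src.__path__, so modules from both roots become importable.
            return [
                importlib.import_module(f"src.{module}").NAME
                for module in ("app1", "app2")
            ]
        finally:
            # Restore global import state.
            sys.path[:] = saved_path
            for name in [m for m in sys.modules if m == "src" or m.startswith("src.")]:
                del sys.modules[name]
            importlib.invalidate_caches()


if __name__ == "__main__":
    print(pkgutil_namespace_demo())
```

Every copy of `src/__init__.py` needs the same declaration, since whichever copy is found first on `sys.path` is the one that runs.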
Out of curiosity, is `src` the actual shared package name or is that just an example placeholder you're using to describe your problem case (much like `foo`, `bar` & `baz` are often used as placeholders)? I ask because, generally, seeing `src` as a package when someone is configuring Pants for the 1st time points to a configuration issue setting up proper source roots. Usually it would be the 1st directory under a `src` directory that is the root Python package.
q
Our design is
- projects/
|- prj1/
  |- src/__init__.py
  |- BUILD
  |- pyproject.toml
|- prj2/
  |- src/__init__.py
  |- BUILD
  |- pyproject.toml
- pkg/
|- pkg1/
  |- pkg1/__init__.py
  |- BUILD
  |- pyproject.toml
|- pkg2/
  |- pkg2/__init__.py
  |- BUILD
  |- pyproject.toml
- pants.toml
`pkg` holds shared packages (e.g. logger), and they can be installed by pip or poetry.
`projects` holds web APIs or workers. They install packages from `pkg`, but they do not install other projects, because we would like to keep the projects independent.
Packages in `pkg` need to be unique, so we intentionally give each a unique namespace. But `projects` don't need unique namespaces because they are independent.
e
OK. Thanks for the details. You are definitely correct that it's perfectly fine, and even cleaner from a pure Python point of view, to not use namespace packages for these duplicate packages, since they are in binaries that will not be published or otherwise depended on by other code. But you are also asking for a bit of trouble by re-using a package name and not making it a namespace package: the sort of trouble you ran into here with Pants and how it groups files (today - could change tomorrow!) when it runs MyPy. You might even run into this sort of problem when Pants runs some other tool.
I don't think we've ever considered anything but performance when choosing how to group files when we pass them to Python tools. I also think you're the first user to point out this sort of problem. In other words - this is not a common way to name things. It's safe to say we like to be able to accommodate ~any repo structure, but your decisions here on naming and layout and not using ns-packages are challenging!
q
Now I’m leaning towards the idea that it might be better to use a unique namespace for each project. BTW, we were inspired by this article when designing our repo structure.
Thanks again. I will reconsider our structure.
e
Now I’m leaning towards the idea that it might be better to use a unique namespace for each project.
Probably. For one, in an IDE when you go to import a type in prj1, say `src.util`, you are very likely to select the wrong `src.util` since it's probably the case that several other projects, say prj3 and prj16, also have a `src.util`.
You're welcome.