I am running into a similar issue as described in...
# general
t
I am running into a similar issue as described in https://github.com/pantsbuild/pants/issues/18913:
vscode ➜ /repo (develop) $ pants --tag=ecr package jobs::
09:23:39.10 [INFO] Completed: Building dockerfile_parser.pex from <resource://pants.backend.docker.subsystems/dockerfile.lock>
09:23:42.57 [INFO] Completed: Building local_dists.pex
09:23:42.70 [ERROR] 1 Exception encountered:

Engine traceback:
  in `package` goal

IntrinsicError: Can only merge Directories with no duplicates, but found 2 duplicate entries in :

`__init__.py`: 1.) file digest=a494b1efc1b9c85f9cfc2bc06f9978f7a826ea8e4f9d913cf09587ddc598c74a size=34:

"""Comment in first init file."""


`__init__.py`: 2.) file digest=025adab8373820594f2a6fd5c29bcdb30a93dac3e90a9bcf577b5394de66cd57 size=37:

"""Comment in second init file."""
It should be noted that I just added those init files and there was no problem previously. What is going on here and how can this be resolved?
b
What’s the file structure for those files? What’s your pants.toml? And what’s the target definition of what’s being packaged?
t
pants.toml:
[GLOBAL]
# info: change version to update pants
pants_version = "2.16.0"
colors = true
# increasing memory usage for better in-memory caching and fewer restarts
pantsd_max_memory_usage = "4GiB"
# storing pants log and pants file in one place (persistent because of container volume mount)
pants_workdir = ".cache_and_logs/.pants.d"
pants_subprocessdir = ".cache_and_logs/.pids"
local_store_dir = ".cache_and_logs/.pants_cache"
unmatched_build_file_globs = "error"

# <https://github.com/pantsbuild/pants/issues/17851>
named_caches_dir = "/tmp/named_caches"

build_file_prelude_globs = ["pants_plugins/macros.py"]
# ignore nefino cli tool and top level dockerfile for now
# also ignoring pgadmin because no read permissions for pgadmin/config
pants_ignore = [
  "/.cache_and_logs/",
  ".*/", 
  "/dist/", 
  "/cli/third/md-click",
  "/Dockerfile",
  "!/django/locale/**",
]
# uncomment to stop using .gitignore by default for file visibility
# pants_ignore_use_gitignore = false

backend_packages = [
  'pants.backend.python',
  "pants.backend.awslambda.python",
  'pants.backend.docker',
  'pants.backend.docker.lint.hadolint',
  'pants.backend.python.typecheck.mypy',
  'pants.backend.experimental.python.lint.ruff',
  'pants.backend.python.lint.yapf',
]

[tailor]
ignore_adding_targets = [
  "django:manage",
  "infra/apps/resources/django:docker",
  "infra/apps/resources/services:docker",
  "infra/apps/resources/replibyte:docker",
]
ignore_paths = ["airflow/dags/**"]

[python]
interpreter_constraints = ["CPython>=3.8.1,<3.9"]
enable_resolves = true
default_resolve = "default"
pip_version = "22.3"

[python-bootstrap]
search_path = ["/usr/local/bin/python"]

[python-infer]
#inits = true
init_files = "always"
string_imports = true
string_imports_min_dots = 0
# This setting suppresses dependency-inference warnings, since there is a module `api` in Django
# as well as one with the same name in all service directories.
ambiguity_resolution = "by_source_root"

[python.resolves]
default = "lockfiles/default.lock"
tools = "lockfiles/tools.lock"

[source]
root_patterns = [
  "/services/*",
  "/libs/*",
  "/django",
  "/airflow",
  "/pants_plugins",
  "/cli",
  "/",
  "/jobs/*",
]

[docker]
build_args = [
  "VARIANT",
  "PYTHON_MAJOR_VERSION",
  "ENVIRONMENT",
  "IMAGE_TAG",
]
run_args.add = ["--rm"]
env_vars = [
  "DOCKER_BUILDKIT=0"
]
[docker.registries.ecr]
address = "localhost:4510"
default = true

[pytest]
install_from_resolve = "tools"
requirements = [
  "//:tools#pytest",
  "//:tools#pytest-icdiff",
  "//:tools#mixer",
  "//:tools#pytest-xdist",
  "//:tools#pytest-cov",
]
xdist_enabled = true
config_discovery = true
# db id env var set by pants for using multiple databases for parallel tests
execution_slot_var = "PANTS_EXECUTION_SLOT"

[test]
# set batch size smaller, so that caching is more efficient
batch_size=32

[environments-preview.names]
qgis_docker = "//services/qgis:local_qgis"
local = "//:local-environment"

[yapf]
config = "pyproject.toml"
install_from_resolve = "tools"
requirements = ["//:tools#yapf"]

[mypy]
config = "pyproject.toml"
install_from_resolve = "tools"
requirements = ["//:tools#mypy"]

[ruff]
config = "pyproject.toml"
install_from_resolve = "tools"
requirements = ["//:tools#ruff"]

[anonymous-telemetry]
# do not send telemetry data
enabled = false
BUILD file:
pex_binary(
    name="sentinel_image_layer",
    entry_point="sentinel_image_layer.py",
)

docker_image(
    name="sentinel-image-layer-image-ecr",
    image_tags=["{build_args.IMAGE_TAG}"],
    tags=["ecr"],
    repository="{build_args.SENTINEL_IMAGE_LAYER_ECR_REPO_NAME}",
    description='["jobSentinelImageLayer"].["tag"]',  # path to the image tag in pulumi config, used in update-tag-workflow.yml
)

docker_image(
    name="sentinel-image-layer-image-ghcr",
    image_tags=[
        "{build_args.IMAGE_TAG}",
        "{build_args.ENVIRONMENT}-{build_args.IMAGE_TAG}",
    ],
    tags=["ghcr"],
    repository="sentinel_image_layer",
    registries=[
        "@ghcr",
    ],
)

python_sources(name="sources")
File structure. Both init files you can see here have been added.
b
Okay, so I think the `root_patterns` configuration is what's causing this. A root pattern indicates a parent directory, and Pants expects the contents of that directory to be the entry point to importable modules. So `jobs/*` says that `jobs/sentinel_image_layer` is a source root, and an import statement like `import sameProjection` would pick something out of `jobs/sentinel_image_layer/sameProjection`... But, based on you adding the `__init__.py` files, maybe you're expecting `sentinel_image_layer/` itself to be the importable module? I.e. `import sentinel_image_layer.sameProjection` would be how one gets to that folder. With the root patterns as they are, identical relative paths (such as two `__init__.py` files) in different source roots end up mapped to the same path when loaded into a pex binary, hence the conflict. If that's the case, maybe try removing the globs from `root_patterns`? (Source roots are a bit subtle.)
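To make the collision concrete, here is a minimal Python sketch of source-root stripping. It is not Pants' implementation: `source_root_for` and `strip_source_root` are hypothetical helpers, the two `__init__.py` paths are assumed for illustration (the actual file structure is not shown in the thread), and "the deepest matching root wins" is a simplifying assumption.

```python
def source_root_for(path, roots):
    """Return the deepest root that is an ancestor of `path` ("" is the repo root)."""
    best = None
    for root in roots:
        is_ancestor = root == "" or path.startswith(root + "/")
        if is_ancestor and (best is None or len(root) > len(best)):
            best = root
    return best


def strip_source_root(path, roots):
    """Return `path` relative to its source root, i.e. the path seen inside the pex."""
    root = source_root_for(path, roots)
    if not root:  # no match, or the repo root itself
        return path
    return path[len(root) + 1:]


# Assumed file locations, for illustration only.
files = [
    "jobs/sentinel_image_layer/__init__.py",
    "jobs/spatial_import/__init__.py",
]

# Roughly what "/jobs/*" expands to: every subdirectory of jobs is its own root.
per_dir_roots = ["", "jobs/sentinel_image_layer", "jobs/spatial_import"]
# What "/jobs" gives: a single root, with the subdirectories as packages.
single_root = ["", "jobs"]

stripped_per_dir = [strip_source_root(f, per_dir_roots) for f in files]
stripped_single = [strip_source_root(f, single_root) for f in files]

print(stripped_per_dir)  # both entries are '__init__.py', so they collide
print(stripped_single)   # distinct paths, one per package
```

With per-directory roots both files strip down to the same relative path, which matches the "duplicate entries" error above; with a single `jobs` root each file keeps its package directory.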
t
I tried removing it and the error is gone, but now I get a few warnings like this:
If you do not expect an import to be inferrable, add `# pants: no-infer-dep` to the import line. Otherwise, see <https://www.pantsbuild.org/v2.16/docs/troubleshooting#import-errors-and-missing-dependencies> for common problems.
11:24:20.55 [WARN] Pants cannot infer owners for the following imports in the target jobs/spatial_import/landuse_spatial_import.py:

  * spatialImporter.SpatialImporter (line: 4)
How can I resolve those?
b
Now the "packages" are the direct subdirectories of `jobs`, e.g. `spatial_import` is the name of that package, so for an import like `from spatialImporter import SpatialImporter` you'll need to use either:
• a relative import (only works when things are in the same package): `from .spatialImporter import SpatialImporter` (note the extra `.`)
• the full new package path: `from spatial_import.spatialImporter import SpatialImporter`
t
Still getting a warning even when I specify the full package path:
12:18:22.90 [WARN] Pants cannot infer owners for the following imports in the target jobs/spatial_import/landuse_spatial_import.py:

  * spatial_import.spatialImporter.SpatialImporter (line: 4)
The relative import seems to work, though.
e
@thousands-plumber-33255 what does `pants roots` say?
It should include `jobs`. If it does not, and you removed the `/jobs/*` line from your config completely instead of just removing the trailing `*`, then that's the issue.
t
vscode ➜ /repo(develop) $ pants roots
.
airflow
cli
django
libs/common
libs/statistics
libs/windtools
pants_plugins
services/downscaling
services/elevation
services/globalsolaratlas
services/highcharts
services/interpolation
services/mva
services/newa
services/qgis
services/qgis-error
services/roughness
services/weasyprint
services/weibull
e
Yeah, you removed lines instead of removing *s
t
I removed /jobs/* completely
e
Yeah, no bueno.
So add a `/jobs` entry and all should be good. For Python, source roots should == PYTHONPATH entries.
In other words, by "try removing the globs", @broad-processor-92400 meant removing the trailing `*` and not the whole line.
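The "source roots == PYTHONPATH entries" rule can be sketched in a few lines of Python. This is only an illustration, not how Pants resolves imports internally: `candidate_files` and `resolve` are hypothetical helpers, and the file name `spatialImporter.py` is assumed from the import statement in the warning.

```python
def candidate_files(module, roots):
    """Map a dotted module name to one candidate file per source root."""
    rel = module.replace(".", "/") + ".py"
    return [f"{root}/{rel}" if root else rel for root in roots]


def resolve(module, roots, repo_files):
    """Return the candidates that exist in the repo, like a PYTHONPATH lookup."""
    return [c for c in candidate_files(module, roots) if c in repo_files]


# Assumed repo contents, for illustration only.
repo_files = {
    "jobs/spatial_import/spatialImporter.py",
    "jobs/spatial_import/landuse_spatial_import.py",
}

# A subset of the roots printed by `pants roots` ("" stands for "."), without jobs.
roots_without_jobs = ["", "django", "cli"]
# The same roots after adding a "/jobs" entry to root_patterns.
roots_with_jobs = roots_without_jobs + ["jobs"]

missing = resolve("spatial_import.spatialImporter", roots_without_jobs, repo_files)
found = resolve("spatial_import.spatialImporter", roots_with_jobs, repo_files)

print(missing)  # nothing owns the module, hence the "cannot infer owners" warning
print(found)    # resolves once jobs is a source root
```

Without a `jobs` root no candidate path exists for `spatial_import.spatialImporter`, which is consistent with the warning persisting even with the full package path.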
@thousands-plumber-33255 IIUC nefino.de has been using Pants for a while. If so, it's surprising a source root issue would only be flushed out this late in the game (getting them right is generally a 1st day or 1st week thing at the latest). Is this a new repo in the org?
t
No, it's the same repo 😮
Looking good now, thank you!
But to clarify: we have been preparing this for a while and only just started using it, so I guess some issues will come up now. I might have another one that could be related.