# general
f
👋 New to Pants and experimenting with its adoption in a large python-based monorepo. Is a BUILD file required underneath every source root? 🧵
I started with a single BUILD file at the top-level that's just:
python_sources()

python_requirements(
    name="reqs",
    source="requirements.txt",
)
I've confirmed via
pants roots
that it is properly inferring all the source roots via my
marker_filenames
config, but
pants dependents path/to/some/py_file
is not yielding any output.
It's a very large code base so I was trying to avoid
pants tailor
(for now).
e
you can use
pants tailor src/python/somemodule/something::
to only tailor a piece of your repo at a time, if that's helpful?
f
Sure! I guess I was hoping for a concrete answer to:
Is a BUILD file required underneath every source root?
It sounds like yes...
e
Yes. Every target (file, and a few other things) must be represented in a build file. It is technically possible to put
python_sources(sources=["**/*.py"])
in a build file, and then you just need one build file ever, but it's not particularly friendly for maintenance in the future. The general recommendation is to have one BUILD file per directory. I know it's a bit of a hurdle when first migrating, but in my experience it fades into the background and stays out of your way fairly well.
f
Got it. Thanks! And yeah, I appreciate all the work that went into the design to keep the boilerplate in those files at a minimum.
e
Good luck with the adoption
h
Yeah, to clarify - you don’t need a BUILD file in every dir.
tailor
creates these by default, but you could also have one top-level BUILD file per source root that uses
**/*.py
globs or similar.
However each source root does need its own BUILD files (at least one).
The reason
tailor
does one BUILD file per dir is that in the past if you had a coarse-grained top-level BUILD file and then you wanted to change the metadata of just one file or dir, you had to then split up that BUILD file into multiple (one per subdir at least), and it was quite a hassle to do so.
However, now that we have
overrides=
, where you can tweak the metadata of just one file in an existing target, that might not be as necessary any more
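For reference, a minimal sketch of what an `overrides=` entry might look like in a BUILD file. The file name `util.py` and the `//:reqs#requests` address are hypothetical, chosen only for illustration:

```python
# BUILD (sketch; file name and dependency address are hypothetical)
python_sources(
    overrides={
        # Keys are file names relative to this BUILD file;
        # values set fields for just that file's generated target.
        "util.py": {"dependencies": ["//:reqs#requests"]},
    },
)
```

Every other file owned by the target keeps the default metadata, so the BUILD file does not need to be split up.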
f
Thanks Benjy! That makes sense. I would like to cut down on the number of BUILD files if possible.
h
So you could create those one-per-source-root BUILD files manually, depending on how many source roots you have
note that you wouldn’t be able to rely on the default
sources=
for the various target types, as they are not recursive globs, so you’d have to be specific about those
and make sure that, for example, the sources of
python_sources()
and
python_tests()
don’t overlap
so you’d probably want to set the sources to be similar to the defaults for those targets (which are documented) but with
**/
prefixes
if you do have a lot of source roots, and are creating new ones semi-regularly, then maybe you want to modify
tailor
to handle this
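A rough sketch of a one-per-source-root BUILD file along those lines. The target names are hypothetical, and the `python_tests` patterns are an assumption based on the documented defaults with a `**/` prefix added, so check them against the docs for your Pants version:

```python
# BUILD at a source root (sketch; target names are hypothetical)
python_sources(
    name="lib",
    sources=[
        "**/*.py",
        "**/*.pyi",
        # Exclude everything the test targets should own.
        "!**/test_*.py",
        "!**/*_test.py",
        "!**/tests.py",
        "!**/conftest.py",
    ],
)

python_tests(
    name="tests",
    # Assumed: the default test patterns, made recursive.
    sources=["**/test_*.py", "**/*_test.py", "**/tests.py"],
)
```

Note that `conftest.py` is excluded from both targets here, so it would need its own target if your tests rely on it.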
f
Overriding the defaults at the top-level BUILD file to be recursive worked:
__defaults__(
    all=dict(
        sources=(
            "**/*.py",
            "**/*.pyi",
            "!**/test_*.py",
            "!**/*_test.py",
            "!**/tests.py",
            "!**/conftest.py",
            "!**/test_*.pyi",
            "!**/*_test.pyi",
            "!**/tests.pyi",
        )
    )
)
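One way to sanity-check a pattern set like this before relying on it is a small script that confirms no file would be claimed by both the sources patterns and the test patterns. This is only a sketch: it matches on basenames, which is a simplification of Pants's real glob semantics, and the `TESTS` tuple and the sample paths are assumptions, not taken from the repo above.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

SOURCES = (
    "**/*.py", "**/*.pyi",
    "!**/test_*.py", "!**/*_test.py", "!**/tests.py", "!**/conftest.py",
    "!**/test_*.pyi", "!**/*_test.pyi", "!**/tests.pyi",
)
# Assumed test patterns (not from the thread): recursive test-file globs.
TESTS = ("**/test_*.py", "**/*_test.py", "**/tests.py")


def _basename_pattern(glob):
    # Every pattern here is "**/<name pattern>", so a recursive match
    # reduces to matching the file's basename.
    return glob.lstrip("!").split("/")[-1]


def matches(path, globs):
    """True if path is included by some glob and excluded by none."""
    name = PurePosixPath(path).name
    included = any(
        fnmatchcase(name, _basename_pattern(g)) for g in globs if not g.startswith("!")
    )
    excluded = any(
        fnmatchcase(name, _basename_pattern(g)) for g in globs if g.startswith("!")
    )
    return included and not excluded


# Hypothetical sample paths.
paths = ["foo/base.py", "foo/sub/test_bar.py", "foo/sub/bar_test.py", "foo/conftest.py"]
overlap = [p for p in paths if matches(p, SOURCES) and matches(p, TESTS)]
print("owned by both:", overlap)
```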
h
And this top-level BUILD file is above the source roots? Or is this one-per-source-root?
f
Above all the source roots
I'm already concerned this approach won't work for us though. I'm getting a ton of inference warnings because of ambiguous resolution -- even with
ambiguity_resolution = "by_source_root"
. We have quite a few duplicate module names across the repo -- e.g.
base.py
,
util.py
, etc. I think Pants might be having trouble with ownership identification using these recursive source rules instead of putting a BUILD file in every subfolder. Trying to confirm this theory now.
Example: It's common to have
foo/a/base.py
and
foo/b/base.py
, and then for a module like
foo/a/bar.py
to have a statement
import base
(to import the base module from the local subfolder) instead of
import foo.a.base
.
In other words there are a LOT of places where we don't use fully-qualified import paths.
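To make the ambiguity concrete, here is a small sketch of how a file path turns into a module name relative to its source root. It assumes, hypothetically, that `foo/a` and `foo/b` are each source roots (which is what makes a bare `import base` resolvable at all); it is an illustration of the naming rule, not Pants's actual implementation.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def module_name(path, source_roots):
    """Dotted module name of path, relative to its deepest enclosing source root."""
    p = PurePosixPath(path)
    best = None
    for root in map(PurePosixPath, source_roots):
        try:
            rel = p.relative_to(root)
        except ValueError:
            continue  # this root does not contain the file
        # The deepest root leaves the shortest relative path.
        if best is None or len(rel.parts) < len(best.parts):
            best = rel
    if best is None:
        raise ValueError(f"{path} is not under any source root")
    return ".".join(best.with_suffix("").parts)


# Hypothetical layout: each subfolder is its own source root.
roots = ["foo/a", "foo/b"]
files = ["foo/a/base.py", "foo/b/base.py", "foo/a/bar.py"]

owners = defaultdict(list)
for f in files:
    owners[module_name(f, roots)].append(f)

for module, paths in owners.items():
    print(module, "->", paths)
```

Both `base.py` files end up providing the same module name `base`, so an `import base` has two candidate owners unless something breaks the tie. With the repo root as the only source root, the same files would instead be `foo.a.base` and `foo.b.base`.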
Hmm. Nope. Still getting ambiguity errors. 😕
11:01:55.46 [WARN] The target foo/reports_summary.py imports `base`, but Pants cannot safely infer a dependency because more than one target owns this module, so it is ambiguous which to use: ['foo/base.py', 'foo/etest/test_foo/base.py', 'foo/etest/test_foo/test_bar/base.py', 'foo/etest/test_foo/test_qux/base.py', 'foo/sub/a_configs/base.py', 'foo/sub/b_configs/base.py', 'foo/sub/c_configs/base.py', 'foo/test/test_sub/test_c_configs/base.py'].
(sorry for having to semi-anonymize the filenames) So basically,
foo/reports_summary.py
has a line
import base
which intends to import
foo/base.py
. But there are a bunch of other modules named base.py underneath the
foo/
subfolder hierarchy seemingly causing the issue.
by_source_root: Choose the provider with the closest common ancestor to the consumer's source root.
^ I am confused though because shouldn't
foo/base.py
be the "closest common ancestor" to
foo/reports_summary.py
?
This is probably hard to follow. I'll keep poking at it and if I can't understand the behavior I'll try to create a sample repository to demonstrate.
h
Hmm, Pants should be disambiguating by source root
but I wonder if that is an example of things that don’t work properly when the BUILD file is above the source root
(I wrote that disambiguation feature but it was a while back, and IIRC it depends on the BUILD file belonging to the source root in question)
f
Good news! I have it all working as I expected now. The disambiguation for the dupe module names is working after I backfilled a BUILD file into every subfolder.
I'm not sure what happened earlier. At one point I did an
rm -rf .pants.d
, maybe I was in an inconsistent state? 🤷‍♂️ Thanks so much for the help!
h
Great!
I think one BUILD file per source root would also work (or is that what you meant by “every subfolder”?) It’s just when the BUILD file is above the source root that I wouldn’t expect it to work
FWIW
rm -rf .pants.d
shouldn’t affect results at all, just performance
But who knows…