# general
h
Hello! One of the most common gotchas with new users is confusion with
python_library
, e.g. thinking it's more meaningful than it really is, such as assuming it corresponds to a Python distribution (setup.py). Here's a proposal to fix this by renaming some targets: https://docs.google.com/document/d/1afm8NhCuE19YyNZHJCrig9EnaV0eJrSsIVWXL7eDflA/edit#. tl;dr: -
python_library
->
python_sources
-
protobuf_library
->
protobuf_sources
-
python_requirement_library
->
python_requirement
Ack that this is disruptive; we'd provide a snippet to fix this automatically, like when we renamed
python_binary
->
pex_binary
. Feedback appreciated!
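For illustration, a rename snippet along these lines could be a plain text rewrite over BUILD files. This is only a sketch, not the actual snippet Pants would ship: the `RENAMES` table mirrors the tl;dr above, while `rename_targets` and the sample BUILD content are made up for the example.

```python
import re

# Old target name -> proposed new name, taken from the tl;dr above.
RENAMES = {
    "python_library": "python_sources",
    "protobuf_library": "protobuf_sources",
    "python_requirement_library": "python_requirement",
}

# Match an old name only when it is a whole identifier followed by "(",
# so names such as "my_python_library" are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, RENAMES)) + r")(?=\s*\()"
)


def rename_targets(build_file_content: str) -> str:
    """Return BUILD file text with the old target names replaced."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], build_file_content)


# Hypothetical BUILD file content, for demonstration only.
example = 'python_library(name="lib")\npython_requirement_library(name="reqs")\n'
print(rename_targets(example))
```

A regex pass like this would also touch matching text inside comments or strings, so a real fixer would probably want to parse the BUILD file instead.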
🙌 2
🤔 3
f
will that change scale to other languages? does it matter? i think python developers might struggle with this because normally only notion of building in python is linked with packaging steps (like in setup.py)
👍🏽 1
h
will that change scale to other languages?
Good question. I'll add a section sketching what Java might look like with this naming.
does it matter?
Does which part matter?
🙌 1
f
does it matter that you might have
python_sources
vs
java_library
or
java_classgroup
or whatever the case may be there
h
Ah, I think it does matter. We want a consistent design language. This is why renaming
protobuf_library
matters too imo. A big insight I've had recently is a separation between targets describing 1) your own code, 2) third-party reqs, and 3) artifacts. We used to muddy that, like
provides=setup_py()
being defined on a
python_library
target, or
pex_binary()
having a
sources
field. Now, we have targets whose sole purpose is describing source files vs. targets for abstract metadata on how to build something. For that first category, I'm thinking
lang_sources
and
lang_tests
should scale well
f
okay makes sense
j
Is the idea that
python_sources
contains the meta data for a group of python files that form a coherent "thing" (now known as a library)? I don't see how changing library to sources reduces the confusion. A python developer still needs to know that another target is needed for making a pex (
pex_binary
) and a wheel/zip (
python_distribution
). Are you trying to create the dichotomy between sources and 3rd party requirements? Requirements come from the
pip
ecosystem. If you are going this route, why wouldn't you also use the name
python_pex
instead of
pex_binary
?
Is the idea to have the general pattern be
lang_TargetMeaningfulToThisLanguage
?
h
Is the idea that python_sources contains the meta data for a group of python files that form a coherent "thing" (now known as a library)
That's the thing, there need not be any coherence at all. It could be a single source file, or all your source files, or 5 of them etc. All it is is metadata over n number of Python files, which all share the same explicit metadata like
interpreter_constraints
. Right now, we incorrectly suggest they must be coherent
j
Will this change the 111 pattern?
I guess I understand the reason for the change.
h
No, you can still use it. This is solely proposing a rename to better express the semantics of the target type: metadata over n number of source files. Rather than right now suggesting "a logical grouping of code that together forms a library"
j
I am just looking forward to when these kinds of changes have stopped because it is not fun to keep updating
BUILD
files in this way, even with tooling to help.
👍 1
I guess when our repo has started to take advantage of reducing the size of
BUILD
files through inference, then it won't matter as much.
What about
pex_binary
? Does that make sense given it defines metadata for a python only target?
h
Agreed that there's been a lot of churn recently - dependency inference + "file targets" hugely changed the paradigms for a lot of things, which kept leading to yet more new insights. For example, in a v1 world, it was sensible for
provides=setup_py()
to live on a
python_library()
. That became illegal to do once we added "file targets"; it would break most of the time. Which led to
python_distribution
, which led to this insight that there's a distinction between "metadata describing first party code" vs. "metadata describing something you want to build" -- FYI the major remaining churn we anticipate is wanting to solve the problem of it being hard to add explicit metadata to a granular part of your first party code, without needing lots of granular targets. A common pattern (now) is to have one
python_library
target describing >20 files thanks to
**
rglobs. When you add an explicit dep like a database that Pants can't infer, then now all 20 files get that even if only 1 file actually needs it. 111 mitigates this problem, but is extra boilerplate. We're envisioning a way where you can somehow merge metadata, like say "these 200 files use these interpreter constraints; these 5 files have a database dep". Without needing to have distinct targets for everything. The trick is how do we do that in a way that isn't majorly disruptive... It's another big paradigm change we couldn't envision before dep inference because we were blinded by the way every monorepo Build Tool™️ has done things since the start.
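To make the "merge metadata" idea a bit more concrete, here is one possible reading of it as a toy Python sketch. Nothing here is a Pants API or a settled design: `LAYERS`, `metadata_for`, the file patterns, and the `3rdparty:psycopg2` address are all hypothetical. Each layer attaches fields to every file matching a pattern, and a file's metadata is the union of all layers that match it.

```python
from fnmatch import fnmatch

# Hypothetical layers: (file pattern, metadata applied to matching files).
LAYERS = [
    ("src/app/*.py", {"interpreter_constraints": [">=3.7"]}),
    ("src/app/db_*.py", {"dependencies": ["3rdparty:psycopg2"]}),
]


def metadata_for(path, layers=LAYERS):
    """Merge the fields of every layer whose pattern matches `path`."""
    merged = {}
    for pattern, fields in layers:
        if fnmatch(path, pattern):
            for key, values in fields.items():
                # Copy into a fresh list so the layer definitions are not mutated.
                merged.setdefault(key, []).extend(values)
    return merged


print(metadata_for("src/app/views.py"))
print(metadata_for("src/app/db_models.py"))
```

Only the `db_*.py` files pick up the database dep here, while every file shares the interpreter constraints, without a separate target per file.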
What about pex_binary?
Eh, yeah, possibly it should have been
python_pex
. But I think we're extremely unlikely to change it one more time. That's too disruptive. I think
python_awslambda
makes sense because there are multiple ways you can create an AWSLambda. Personally, I wish we called
python_distribution
a
setup_py_dist
or
setup_py_binary
. But too late to change
👍🏽 1
h
I do think the naming changes make sense. And we can support the old names in parallel for a while, so things don't break when you upgrade.
I think
python_requirement
is too confusing with the existing
python_requirements
macro though
h
Agreed it's deceptively similar looking. I wonder if we should think bigger for the rename. Maybe
python_dependency
?
python_3rdparty
?
h
And the singular name is also technically incorrect, since a
python_requirement_library
can encompass multiple requirements.
1
h
Maybe python_dependency?
This sounds the most natural to me, but it's confusing with our
dependencies
field meaning either 1st or 3rd party deps.
python_3rdparty_deps
could make sense. I think I'm not super concerned about verbosity for this target type because it's not used frequently
f
i think your reasoning behind the name change makes sense, and could even be justified, but the cost of churn is pretty high. I understand that the cost of even leaving deprecated aliases is also high in terms of code complexity, etc
but i do urge some caution with churn, especially for renaming things...the cost of that churn is often borne by the biggest advocates for pants within orgs that use it; you shift the costs onto your biggest fans
4
h
The Target API's design means that it's actually not very painful to support the deprecated alias - maybe that's the path forward. We're trying to find a balance of making things easier for new folks to onboard, while still giving a good experience for current users. The thing I care more about is
./pants help
and our docs being able to consistently use the new names. It's fine to have hidden names for the same thing.
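As a rough illustration of the deprecated-alias idea (not how the Target API actually implements it), old names could resolve to the new ones with a warning, so only the new names need to show up in help and docs. `DEPRECATED_ALIASES` and `resolve_target_type` are hypothetical names for this sketch.

```python
import warnings

# Hypothetical alias table: deprecated name -> name shown in help and docs.
DEPRECATED_ALIASES = {
    "python_library": "python_sources",
    "protobuf_library": "protobuf_sources",
}


def resolve_target_type(name: str) -> str:
    """Return the canonical target name, warning when a deprecated alias is used."""
    new_name = DEPRECATED_ALIASES.get(name)
    if new_name is None:
        return name
    warnings.warn(
        f"`{name}` is deprecated; use `{new_name}` instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_name
```

Old BUILD files keep working under this scheme, at the cost that readers of those files still have to learn the mapping.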
f
there's not a really good name for the concept you describe: "a set of source files that share some common metadata", so i get where you're going with it
"fine to have hidden names for the same thing" until the same newbie comes along some old BUILD files and has to learn that there are aliases and make the mapping between those aliases
👍 2
there's a cost to both... ¯\_(ツ)_/¯
h
there's not a really good name for the concept you describe: "a set of source files that share some common metadata",
Agreed, and that speaks to the bigger thing that we're realizing we want to do, but don't yet know how, particularly in a way that we don't screw over current users: somehow replace the idea of targets for first-party code with this merging-of-metadata idea. Targets seem to work well when describing third party requirements + artifacts you want to build. They map very nicely 1-1 with the thing you're describing. Targets get really clunky when describing first party code, though.
f
i agree that
python_library
may be confusing for newbies, but i'd suggest that changing names doesn't universally solve that problem, since a newbie approaching an existing codebase that uses it (either due to people pinning their versions back or backward-compatible aliases) will still have to make the mapping between that confusing term and whatever new term you use in docs or in
./pants help
it only helps newbies on greenfield projects or super up-to-date mature ones
e
Is naming the real confusion or is it existence? I know Pants fairly well and am continually bamboozled by the fact I need to lay down an empty
python_library()
declaration for my
pex_binary(...actual metadata I wrote down...)
to work. Whether the - what looks like a no-op -
python_library()
was a
foo()
or a
bar()
I'd still be mystified by the need. I can go read excellent docs that explain this to me, but at that point the newbie game is up and I can learn what
python_library()
the word means.
h
Indeed, its existence is the bigger issue. https://pantsbuild.slack.com/archives/C046T6T9U/p1610655122036800?thread_ts=1610638933.024700&cid=C046T6T9U Renaming is meant to make things more clear, but the better solution is to redesign "metadata for first party code" from first principles
e
OK, great. That's felt like a hard sell in the past but it seems we're all on board with the long term goal.
j
Pyproject uses the term packages for things to include (modules, packages, directories of python code): https://python-poetry.org/docs/pyproject/ python-packaging (setup.py) uses
install_requires: https://python-packaging.readthedocs.io/en/latest/dependencies.html
setup.py
uses packages to name the code that is around the file and not explicitly named within the file.
h
The key nuance for me is that targets are broken for first party code. I do think they work well for third party requirements + artifacts you want to build
e
Well - by definition you need metadata to even name third party code.
j
I think this shows that there is no community standard that we can lean on that will make Pants' naming 100% understandable without some initiation.
👍 1
e
Agreed.
Eric's suggestion is about the only language neutral one I can imagine (sources). Library seemed similarly language neutral to the inventors but is apparently not.
🤔 1
👍 2
j
If we continue to have good documentation and good enough names, I think the particular names we choose for targets will not be a big hurdle to the adoption of pants.
👍 3
e
+1
j
I like
python_packages
or
python_requirements
but the latter has a macro namespace collision that makes it not work in pants-land.
👍 1
h
True. One concrete thing we can improve is a small "warning" tooltip clarifying this misconception people have with
python_library
. We've had 4 users the past few months be tripped up by this iirc, I do think it's a real problem. But the better solution is to fix targets for first party code, so this rename introduces possibly unnecessary churn.
✔️ 2
e
I think the latter is artificial. If the implementation of
python_requirements
were changed to a target, a rule could take an empty python_requirements and slurp in a sibling requirements.txt; otherwise, it could take either a file= or *reqs args.
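A toy version of that resolution logic, just to pin down the three cases. This is a sketch under assumptions, not Pants code: `resolve_requirements` and its `build_dir` parameter are hypothetical, and only `file=` and `*reqs` come from the message above.

```python
from pathlib import Path


def resolve_requirements(build_dir, *reqs, file=None):
    """Return requirement strings for a hypothetical python_requirements target.

    - explicit *reqs win when given;
    - otherwise read `file` if set;
    - otherwise fall back to a sibling requirements.txt.
    """
    if reqs and file is not None:
        raise ValueError("Pass either explicit requirements or file=, not both.")
    if reqs:
        return list(reqs)
    req_file = Path(build_dir) / (file or "requirements.txt")
    stripped = (line.strip() for line in req_file.read_text().splitlines())
    # Skip blank lines and comments; a real parser would handle more syntax.
    return [line for line in stripped if line and not line.startswith("#")]
```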
👍🏽 1
j
I also like the way pyproject lets one refine the source of the package/module: via pip, as
sdist
, a module directory, etc. Letting a particular
BUILD
define all its dependencies independent of the monorepo would make some of our over-entangled projects easier to refactor and upgrade.