Followup to `AddressSpec` discussion. I think the ...
# development
h
Followup to
AddressSpec
discussion. I think the other type of top-level
Spec
should be a
FilesystemSpec
, which may either be a
FileSpec
,
DirectorySpec
, and some still-unknown type for globs. Rules would then request
AddressSpec
or
FilesystemSpec
It wasn’t clear to me until recently that
FileSpec
and
DirectorySpec
need to have the same superclass so that rules can request
FilesystemSpecs
without caring whether users specifying them as files or directories, just as they don’t care how
AddressSpecs
were specified on the command line. Sound good?
a
could easily imagine git urls or urls to files/tgzs as new
Spec
types here
i like this foundation a lot!
h
Yes! Benjy proposed that as a possible future development in https://github.com/pantsbuild/pants/issues/8593 For now, though, I’m trying to stay focused on supporting
FilesystemSpec
while keeping it generalizable enough to allow future specs
a
100% agree with that approach!
w
ftr:
FileSpec
vs
DirectorySpec
requires filesystem access to determine... i don't think it can be determined via parsing, as with the rest of these.
👍 1
src/com/blah/thing
may be either a file or a directory
h
Specifically a file without an extension, right?
w
Yes
Or with an extension, for that matter.
1
Directories can definitely still contain punctuation.
👍 1
h
Hm, tricky
w
somewhat...?
the "address" specs represent a user intent that you convert into some concrete matches (Addresses)
👍 1
directories are less of a thing than "files" in pants. Snapshots capture both, but mostly because it was cheap to, and kept the API simpler
👍 1
rules have the
PathGlobs->Snapshot
API to capture Snapshots
but... typing all of this out, i think it would be good to talk more about where you are trying to go with this.
if the idea is to unify
--owner-of
with other root selection, then you generally would not actually expose the raw parsed inputs
but if you did expose the raw parsed inputs, the parser would need to be able to differentiate
SingleAddress
from a file or a directory... which it can't really do without filesystem access
@aloof-angle-91616 did work in this area around
--query
... i think in one of the lead-up commits.
(where he unified OwnersRequest with Specs, iirc.)
a
that’s correct
it could be split out into its own PR
without difficulty
i should have done that long ago actually
w
not to worry. most important thing here would be to help Eric reason though what he's trying to do, as it may not be shaped exactly the same way (maybe, who knows)
a
just went home because i taught a class today and it was exciting but stressful but after i lie down a bit we can slack or hangouts to determine the answer to that @hundreds-father-404
h
i think it would be good to talk more about where you are trying to go with this.
The goal is to go from
FilesystemSpec
to something like
Snapshot
(or
PathGlobs
, unclear). There would be an optional rule to go from
FilesystemSpec -> Address
(or maybe
Snapshot -> Address
?), but generally a
FilesystemSpec
in no way needs to correspond to an address
. The path does need to exist on the FS, but it doesn’t need to be linked to a BUILD file Motivating use case for this:
./pants buildgen
run on a brand new repo won’t have any BUILD files defined yet
w
and what is the intended argument to
./pants buildgen
...?
h
./pants buildgen src/python
would generate BUILD files for all of
src/python
w
it feels like it could consume raw Specs (rather than *Targets) to generate things.
a
any reason not
src/python::
?
w
and then you would
./pants buildgen src/python::
👍 1
h
Another example:
./pants cloc src/python/pants/backend/lint/isort/rules.py
should only run on
rules.py
, not be expanded to the whole target (the whole folder)
🔥 1
a
src/python::
is still a valid spec (which just matches 0 targets) if the dir doesn’t exist
and if we don’t create a target with the default folder name, then
src/python
suddenly points to a target now which doesn’t exist
yeah, we could actually maybe remove the
owner-of
option entirely and just use file paths ^ re eric above
w
the cloc example is a good one. but i think that it is probably rare compared to
./pants test $filename
probably wanting to run the tests for the target owning the file, possibly with the test running filtering to the file in some runner-specific way.
cc @happy-kitchen-89482: ^
a
idk what others feel about this but i personally don’t like that we accept trailing slashes in target specs atm and i would love if we could choose exactly one of the forms
src/python::
vs
src/python/::
👍 1
i prefer
src/python::
personally but users at twitter are mostly conditioned to use the
/::
style i’ve noticed and both seem fine
w
the trailing slash is a "command line affordance" to allow for tab complete in a shell
👍 1
a
@witty-crayon-22786
the cloc example is a good one. but i think that it is probably rare compared to
./pants test $filename
probably wanting to run the tests for the target owning the file, possibly with the test running filtering to the file in some runner-specific way.
the runner-specific part is what we could potentially cut out here by allowing direct paths to files in the command line, i was thinking
w
it's not allowed in BUILD files (or isn't supposed to be)
a
it’s not allowed in BUILD files! and ok, thank you for that context!
w
@aloof-angle-91616: the issue is that you still need the target-specific information to determine the dependencies
a
yes, but i was thinking that we could interpret a file path on the command line as being an `--``owner-of`
so the resulting command is still unambiguous — expanding to all targets that own the file
i’ll scroll up to see if i’m missing any context
w
right: owner of is what i'm suggesting. but that is not what (i think) Eric was suggesting with the cloc example.
a
random idea but i've been thinking about it in other contexts recently (like vcfs):
but if you did expose the raw parsed inputs, the parser would need to be able to differentiate
SingleAddress
from a file or a directory... which it can't really do without filesystem access
true, but if we can trust pantsd being alive, we would be able to trust that we have an up-to-date view of the filesystem in the engine available at all times that we can query very quickly if pantsd is up
right: owner of is what i'm suggesting. but that is not what (i think) Eric was suggesting with the cloc example.
oh! sorry! i missed this part!
yes, that would be a different interpretation of it. thanks.
hmmmmm
h
Benjy and I were discussing that rules decide what they want: 1. conventional targets, then request
AddressSpecs
(or
TransitiveHydratedTargets
,
BuildFileAddresses
, etc) 2. raw files without any target information, then request
FilesystemSpecs
3. allow file system specs, but ultimately the rule needs targets to work, e.g. `./pants test`; use the rule that converts
FilesystemSpecs -> Address
, which is the rule implementation as
--owner-of
My main point is that we should allow this second use case. Most goals do need target information to work properly, like test, but some don’t need it at all, like cloc and buildgen
a
how do rules ask for
AddressSpecs
vs
FilesystemSpecs
?
and where is the parsing happening?
i would love it if we could avoid parsing the command-line arguments (or at least the positional args) at all until a rule requests it specifically, and i would love it if a rule could selectively parse different parts of the argv differently i.e. maybe i expect an
AddressSpec
first, and then a
FilesystemSpec
h
Today, in the rule signature, you can request
Specs
(soon to be
AddressSpecs
). That will still be supported. We will allow instead requesting in the rule signature
FilesystemSpecs
, which you could convert into something like
PathGlobs
or
Snapshot
. I’m not sure yet what exactly the rule signature would look like to request
"either FilesystemSpec or AddressSpec is fine"
a
possibly related: https://github.com/pantsbuild/pants/pull/8542 which allows
@union
classes to be used as arguments directly
or no, that work is actually in another PR i think. but that's the idea (where the
@union
class is expanded into a version of the rule for each of its union members, at rule indexing time)
w
we currently parse outside the engine. parsing inside the engine would be possible.
does my edit make it clearer?
👍 1
a
yes!
and is what i was thinking
and honestly that vision doesn't seem to impose too many constraints on how we do things now
h
that vision
which vision? My proposed 3 use cases or the vision of parsing in the engine?
a
if we can get access to the "raw" cli args (or maybe just the positional ones for now) in a rule we should be able to build whatever on top of that
☝️ 1
parsing args with `@rule`s and never writing a custom command line parser ever again
currently Specs are parsed outside the engine, and injected via a RootRule
injecting something more raw, and then parsing it inside the engine would make FS access ... more natural, perhaps.
👏 1
a
https://github.com/pantsbuild/pants/pull/8917 <= calculating target roots in v2, broken out from the
--query
PR
h
My thinking is a little more abstract: A pants cmd line invocation consists of
./pants <some flags> goal <positional args to the goal>
(ignoring multiple goals, for simplification, but same principle applies). Currently the positional args are always interpreted as address specs (and are in fact validated as such pretty early on - i.e., they cannot reference nonexistent targets, and must refer to at least one existing target). At the other extreme we could say that positional args can be anything at all, and it's up to the goal to interpret them. E.g., if the goal knows that it expects addresses, it can request BuildFileAddresses and if the positional args cannot be converted to valid addresses, then and only then do we error.
a
that's what i was thinking
h
BUT - Stu made a point that this is too general, and I can get behind that.
a
we don't need to go 100% for either approach right now though
h
However I do still think that it is useful to expand the universe of possible positional args from just addresses to also include file paths, existent or not.
So I'm thinking that the cmd-line arg parser (which today produces Specs) should instead produce a preliminary type, PositionalArgs. And we have a rule that can turn those into AddressSpecs, and a rule that can turn those into FilesystemSpecs. A goal can ask for either, and the appropriate rule will be invoked in the usual way.
👌 1
But the parser needn't know how the positional args are going to be interpreted.
👍 1
That is up to the rule
👖 1
Our initial use case for this is buildgen - there are no targets to address
But cloc is another example, and linting could be another
there are several cases where we don't actually need target information, just the content of the source file.
a
i personally would prefer the
./pants buildgen src/python::
syntax but i have no issue starting the positional args conversation earlier
h
And this opens the door to things like
./pants test src/python/foo/bar_test.py
a
there are several cases where we don't actually need target information, just the content of the source file.
i hadn't considered this! thanks
h
Which does need target information (dependencies, interpreter constraints, whatever)
But the user's intent is obvious: Run the tests in this file. And we can do that!
🔥 1
Not initially
a
the user's intent is obvious:
great to identify cases like this where the current pants model holds us back
h
But we can eventually. And this is a better user experience. Targets are an extra conceptual level that we currently impose on users, and it would be nice not to when we don't have to.
💯 3
a
cannot agree more
h
If by "current pants model" you mean the v1 task/target model then yes, that holds us back. But the new engine is so freakin' powerful that it can support this with very little disruption!
a
one more thought that's probably more relevant in a large org: if we can expand the interpretation of pants command line args and make
./pants run
reliable and fast, individual teams can ship tools just as targets in our monorepo instead of having to supply them elsewhere
h
sure
a
the v2 python pipeline development is integral to this
just a random thought though
h
Anyway, that's my handwavy thinking on why there need not be targets involved at all in some pants invocations.
✍️ 2