I m stuck at how to differentiate between filesystem specs v Pants #development

I’m stuck at how to differentiate between filesyst...

hundreds-father-404

01/17/2020, 11:48 PM

I’m stuck at how to differentiate between filesystem specs vs. address specs in

cmd_line_spec_parser.py

. Several ambiguities: 1. does

src/python/pants/util

mean

src/python/pants/util:util

or the directory

src/python/pants/util

- For now, filesystem specs don’t work with directories (

PathGlobs

doesn’t), we’ll stick with expanding to

src/python/pants/util:util

2. Not all files have endings, e.g. the file

./pants

I believe it is safe for the parser to use

Path.is_file()

. We already do

Path.exists()

arg_splitter.py

. So, I’m thinking these heuristics: -

Path.is_file()

-> filesystem spec -

in the arg -> filesystem spec because it’s a glob - otherwise, address spec

hundreds-father-404

01/17/2020, 11:48 PM

cc @aloof-angle-91616, I think you’d find this a fun problem

hundreds-father-404

01/17/2020, 11:49 PM

Note that the options parser will already have disambiguated between

specs

vs. goals for us, fortunately. That is, it already clarified whether

cache.java

refers to the goal

cache.java

or the file `cache.java`; and whether

haskell

refers to directory

haskell

vs. the goal

haskell

aloof-angle-91616

01/17/2020, 11:50 PM

does src/python/pants/util mean src/python/pants/util:util or the directory src/python/pants/util

so right now, for a lot of cases, it seems like the desired behavior when selecting a directory would be to just operate on all the targets within that directory

👍 1

aloof-angle-91616

01/17/2020, 11:50 PM

which is the same as using

::

hundreds-father-404

01/17/2020, 11:51 PM

Absolutely agreed

hundreds-father-404

01/17/2020, 11:51 PM

Due to backwards compatibility, we can’t change those semantics yet, though. V1 will not support filesystem specs, only traditional address specs and

--owner-of

aloof-angle-91616

01/17/2020, 11:51 PM

and if we're talking about a directory used as an input or output to a task or something that would obviously be its own separate thing

aloof-angle-91616

01/17/2020, 11:52 PM

well i was thinking we wouldn't need to change any semantics

aloof-angle-91616

01/17/2020, 11:52 PM

unless there's a use case for a positional directory command-line arg that isn't covered by

::

hundreds-father-404

01/17/2020, 11:53 PM

well i was thinking we wouldn’t need to change any semantics

It’s confusing to me for

src/python/pants/util

to sometimes mean

util/

and sometimes

util:util

. It’s confusing because it’s not transparent. We’d need to first drop support for

src/python/pants/util

expanding to

src/python/pants/util:util

, which I’d be in favor of (in a followup - not close to being ready yet)

aloof-angle-91616

01/17/2020, 11:55 PM

to sometimes mean util/ and sometimes util:util.

well, until now, it has always just meant

util:util

. the proposed new behavior i believe you're describing is to have it now sometimes mean

util/

as well, which you've described as confusing and not transparent.

hundreds-father-404

01/17/2020, 11:55 PM

Cool. It is safe to use

Path.exists()

. We use in the same file

os.path.realpath()

(==

Path.resolve()

)

aloof-angle-91616

01/17/2020, 11:55 PM

i was asking above about a use case not covered by

::

because if not, then we may not need to introduce the

util/

interpretation at all

aloof-angle-91616

01/17/2020, 11:55 PM

yes but beware of perf regressions

👍 1

hundreds-father-404

01/17/2020, 11:56 PM

To clarify, I meant to punt on the directory ambiguity problem. For now, goal is resolving files and globs

aloof-angle-91616

01/18/2020, 12:00 AM

yes great

aloof-angle-91616

01/18/2020, 12:00 AM

i think this won't be incredibly difficult to solve eventually

👍 1

aloof-angle-91616

01/18/2020, 12:01 AM

either we introduce some signifier to disambiguate them (like a protocol schema) or we like screw around with other cli arguments somewhere else

aloof-angle-91616

01/18/2020, 12:01 AM

or something

witty-crayon-22786

01/18/2020, 1:18 AM

I believe it is safe for the parser to use
Path.is_file()
. We already do
Path.exists()
in
arg_splitter.py
. So, I’m thinking these heuristics:

this isn't safe without relativizing to the buildroot. but it is in general a sandbox escape

👍 1

witty-crayon-22786

01/18/2020, 1:19 AM

if that path goes from being a file to being a directory, the rule will still be memoized, because it did not use the engine's APIs to access the file.

witty-crayon-22786

01/18/2020, 1:19 AM

so would need to capture a Snapshot to differentiate the type.

hundreds-father-404

01/18/2020, 1:19 AM

Note that this parsing happens before the engine ever gets involved. It’s in

base/cmd_line_spec_parser.py

witty-crayon-22786

01/18/2020, 1:20 AM

...oh. but you're talking about in ``cmd_line_spec_parser`` ... sorry. that's ok unless it is actually moved into a rule.

witty-crayon-22786

01/18/2020, 1:20 AM

yep. got it, sorry.

average-vr-56795

01/20/2020, 12:16 PM

I don’t love that creating a

BUILD

file will change the meaning of a command line… That feels like a bad property of the UI…

average-vr-56795

01/21/2020, 3:23 PM

@hundreds-father-404 Not sure if you saw ^^

hundreds-father-404

01/21/2020, 5:15 PM

I don’t love that creating a BUILD file will change the meaning of a command line… That feels like a bad property of the UI…

What do you mean by this? Is this in reference to disambiguating between directories vs. targets? To clarify, I 100% want to punt on this decision (But ftr, I really like Benjy’s suggestion to add a global option that allows users to specify which semantics they want)

aloof-angle-91616

01/21/2020, 5:16 PM

+1 to both daniel and eric regarding punting

aloof-angle-91616

01/21/2020, 5:17 PM

a global option that changes the semantics of the command line feels weird to me, i’d somewhat like to be able to parse a command line in my head without spooky action at a distance

average-vr-56795

01/21/2020, 5:37 PM

So the question is what does

./pants cloc foo

mean in the following directory layouts: 1. foo is a file 2. foo is a directory 3. foo is a directory and

foo/BUILD

exists I don’t think a flag should change that meaning, and ideally that meaning wouldn’t change even if filesystem changes are made to change which one of 1/2/3 is the case. Otherwise when I write my script, and I write it slightly sloppily because I don’t understand the edge cases, my script may start doing the wrong thing when faced with a different filesystem than I tested it on.

hundreds-father-404

01/21/2020, 5:38 PM

Right now: 1) If

foo

is a file, treat it as a file 2) Otherwise, always try to expand to

foo:foo

, even if there is no BUILD file or directory

aloof-angle-91616

01/21/2020, 5:39 PM

i don't understand when we would want to pass in directories as command-line arguments. what use case is this for?

hundreds-father-404

01/21/2020, 5:40 PM

Our original motivating use case for filesystem specs was wanting to be able to run

./pants buildgen src/python 3rdparty/python

on a brand new Pants repo with no BUILD files

aloof-angle-91616

01/21/2020, 5:40 PM

how would you feel about

./pants buildgen src/python:: 3rdparty/python::

aloof-angle-91616

01/21/2020, 5:41 PM

that's really what you want with buildgen right? not just the directory but recursive subdirectories?

hundreds-father-404

01/21/2020, 5:41 PM

./pants buildgen src/python:: 3rdparty/python::

means you are using address specs, but there are no BUILD files yet so this won’t actually match against anything

aloof-angle-91616

01/21/2020, 5:42 PM

right now that would fail to expand globs and cause an error, but we can choose to instead interpret it as a glob for the purposes of the

buildgen

task

hundreds-father-404

01/21/2020, 5:42 PM

With today’s implementation of filesystem specs, you could do

./pants buildgen 'src/python/**/*' '3rdparty/python/**/*'

aloof-angle-91616

01/21/2020, 5:42 PM

ok, but we're creating pants targets right? why is

/**/*

better than

::

average-vr-56795

01/21/2020, 5:43 PM

ok, but we’re creating pants targets right? why is
/**/*
better than
::
?

Because it’s a really weird special case, both in the user interface and in the engine

aloof-angle-91616

01/21/2020, 5:43 PM

unless we're literally like downloading a .tar.gz and literally extracting it into some directory, i feel like using target specs makes more sense

aloof-angle-91616

01/21/2020, 5:44 PM

Because it’s a really weird special case

i think we can just avoid erroring when

::

doesn't match any targets though? the reason i like this is because right after you run

buildgen

, you can immediately change the command-line goal, and not have to change any address specs, so you can run

./pants list src/python:: 3rdparty/python::

immediately. that seems SUPER intuitive to me.

hundreds-father-404

01/21/2020, 5:45 PM

ok, but we’re creating pants targets right? why is /**/* better than ::?

cmd_line_spec_parser.py

needs some way to know “they want filesystem specs” vs. “they want address specs”. The engine may only pass one type of Specs to the goal, otherwise we get graph ambiguity that there are two ways to compute things like

BuildFileAddresses

. So, we must reserve

::

to always mean address specs and never try to co-opt it for filesystem specs. This means we need some sort of glob for filesystem specs to complement

::

is a natural glob that is already supported by

PathGlobs

and is natural to CLI users

aloof-angle-91616

01/21/2020, 5:45 PM

filesystem specs are great, i have no issue with them. for the case of buildgen, i feel like they're inappropriate compared to

::

because you then have to change the pants command line specs to operate on the targets you just generated.

hundreds-father-404

01/21/2020, 5:49 PM

for the case of buildgen, i feel like they’re inappropriate.

Buildgen is meant to work on a brand new repo, though, with no targets. So,

./pants buildgen ::

won’t resolve anything on a first run. We could hack the AddressSpecs

::

to be converted into filesystem specs, which is what Toolchain currently does, but that’s not great to change the semantics. After that first run, you can use

./pants buildgen ::

of course.

average-vr-56795

01/21/2020, 5:49 PM

Do we have any cases where we actually really want file specs for anything other than buildgen?

hundreds-father-404

01/21/2020, 5:50 PM

Do we have any cases where we actually really want file specs for anything other than buildgen?

Well, to clarify, we want to support either filesystem specs or address specs for goals. Another example:

./pants test tests/python/pants_test/util/test_strutil.py

aloof-angle-91616

01/21/2020, 5:55 PM

We could hack the AddressSpecs :: to be converted into filesystem specs, which is what Toolchain currently does, but that’s not great to change the semantics.

that's not what i'm thinking of. i'm thinking of them representing target globs. the

buildgen

task would read the

Specs

as target globs, convert each

Spec

into whatever type of input

buildgen

needs, then run it. i'm not thinking of making

::

into filesystem globs.

hundreds-father-404

01/21/2020, 5:56 PM

Why do we want to allow

./pants test tests/python/pants_test/util/test_strutil.py

? Leveraging how users already interact with most tools. For example, you’d run Pytest as

pytest tests/python/pants_test/util/test_strutil.py

. Same with running MyPy, isort, and Flake8. Users can now use tab completion on specs, which currently only works when specifying the directory Finally, you no longer need to lookup what the target name is that corresponds to that file (a huge pain for me doing the Py3 migration)

aloof-angle-91616

01/21/2020, 5:57 PM

Finally, you no longer need to lookup what the target name is that corresponds to that file

./pants --owner-of=tests/python/pants_test/util/test_strutil.py test

doesn't work?

aloof-angle-91616

01/21/2020, 5:57 PM

to be clear, i strongly support the use of filesystem specs for the case of running targets that own a filesystem spec.

hundreds-father-404

01/21/2020, 5:58 PM

I’m thinking of them representing target globs. the buildgen task would read the Specs as target globs,

Please checkout https://docs.google.com/document/d/15xphZcFnowytF0Qu2sO1PG7wHAWoyLof7hK0jVa5T8E/edit#heading=h.dn7v2rykapob if you haven’t recently. It was rewritten this past week to account for a) a new understanding of the different types of use cases we have. There are 4 types of goals, not the original 2 I thought. b) having had spent the past two weeks in this spec code, a new understanding of how this can actually be implemented. For example, we dropped the idea of a top level

Specs

average-vr-56795

01/21/2020, 6:04 PM

Please checkout https://docs.google.com/document/d/15xphZcFnowytF0Qu2sO1PG7wHAWoyLof7hK0jVa5T8E/edit#heading=h.dn7v2rykapob if you haven’t recently. It was rewritten this past week to account for

I will have a read 🙂

❤️ 1

aloof-angle-91616

01/21/2020, 6:14 PM

left some comments, back to work

happy-kitchen-89482

01/21/2020, 8:26 PM

Just to provide more general context: Forcing end users to think about targets is an imposition. It hinders adoption. Users already understand files. Letting them do

./pants test src/python/foo/bar_test.py

is intuitive,

./pants test src/python/foo:bar

is less so. It requires reasoning about a new conceptual domain. Targets may still need to exist (e.g., to express deps and interpreter constraints), but the user should only have to think about them when necessary.

❤️ 1

🔥 2

👖 1

happy-kitchen-89482

01/21/2020, 8:26 PM

Fortunately, the v2 engine makes this straightforward!

aloof-angle-91616

01/21/2020, 8:27 PM

ok, that’s a very interesting approach i hadn’t considered before. i love it

Open in Slack

Previous Next