https://pantsbuild.org/ logo
h

hundreds-father-404

10/06/2021, 3:38 PM
Thoughts on changing
Paths.dirs
and
Snapshot.dirs
to only include the dir names where there was a match? if you have only the
file some_dir/subdir/f.go
and the glob
**/*.go
, it will include 
some_dir
when I only want 
some_dir/subdir
If you do want "all dirs", that seems much easier to post-compute than asking for "only dirs with a match"
h

happy-kitchen-89482

10/06/2021, 3:39 PM
Why?
What's the benefit of this? I fear unintended consequences, since we use those in a bunch of places.
But if it can be done cleanly...
h

hundreds-father-404

10/06/2021, 3:44 PM
Replace this target gen code with
paths.dirs
Copy code
dir_to_filenames = group_by_dir(go_paths.files)
    dirs_with_go_files = []
    for dir, filenames in dir_to_filenames.items():
        if any(filename.endswith(".go") for filename in filenames):
            dirs_with_go_files.append(dir)
We're duplicating it in Terraform target gen too
w

witty-crayon-22786

10/06/2021, 5:20 PM
the semantics are currently that only dirs that actually match the glob are included.
it’s a bit strange, but it makes source roots easier, because you can “find all directories like…”
having said that, i think that if we start moving toward “matching a directory means recursively everything in that directory”, we’ll want to change our path globs as well.
h

hundreds-father-404

10/06/2021, 5:30 PM
That doesn't seem the case? The result includes some_dir and I only want some_dir/subdir
w

witty-crayon-22786

10/06/2021, 5:34 PM
but probably even more reason to consider replacing capturing dirs with recursive behavior.
h

hundreds-father-404

10/06/2021, 5:37 PM
It would be extremely useful if it was "only dirs with a match".
even more reason to consider replacing capturing dirs with recursive behavior.
Why change our Rust code to do that? Wouldn't the calling code simply use
**/*
for the glob? I'm talking about the behavior of the Rust intrinsic
w

witty-crayon-22786

10/06/2021, 5:37 PM
because if
./pants test $dir
is going to be recursive, aligning our path/source globs could make sense semantically
h

hundreds-father-404

10/06/2021, 5:40 PM
I'm not seeing that. In that case, the caller rule could convert
dir
into
dir/**
. We need to retain the ability to have a glob only for the directory and not recurse,
*.go
vs
**/*.go
Maybe I'm misunderstanding your suggestion?
w

witty-crayon-22786

10/06/2021, 5:40 PM
@hundreds-father-404: you’re thinking of the rules, but i’m thinking of end user source globs, and making “matching a dir is recursive” consistent across Specs and sources
basically, if matching a dir is recursive, i don’t think we’d still have a
Paths.dirs
list, because the reason you would have matched would be for recursion
(which would have the same motivation as for `Specs`: consistency with git)
h

hundreds-father-404

10/06/2021, 5:43 PM
you’re thinking of the rules
Indeed, I'm solely talking about the behavior of our intrisinics
Paths
and
Snapshot
, how their
.dirs
field works We can change our user semantics to have
dir
translate to
dir/**
without changing our engine code
because the reason you would have matched would be for recursion
For those two callers perhaps, but there are other cases where it's helpful to have the list of dirs with a match, like my Go target gen idea
w

witty-crayon-22786

10/06/2021, 5:45 PM
only include the dir names where there was a match? if you have only the 
file some_dir/subdir/f.go
 and the glob 
**/*.go
, it will include 
some_dir
 when I only want 
some_dir/subdir
i don’t know how these semantics would work:
some_dir
doesn’t actually match the glob you specified
**/somedir
would,
***/**
would, etc
to get what you want, i think you want to
os.path.dirname
the matches…?
h

hundreds-father-404

10/06/2021, 5:49 PM
This is how the current behavior works. When you have the glob
some_dir/subdir/f.go
,
.dirs
will include
some_dir
and
some_dir/subdir
. It includes all directories seen in the resolved files I'm proposing to instead only have the directories that contain a matched file. There is a matched file in
some_dir/subdir/f.py
, so include
some_dir/subdir
. There is no matched file in
some_dir
, so leave it off. It's not that the directory itself was in the glob, it's that the dir has a matched file
w

witty-crayon-22786

10/06/2021, 5:49 PM
That is
dirname
h

hundreds-father-404

10/06/2021, 5:49 PM
I think that's why we're talking about different things. I am not proposing any changes to the input to
PathGlobs
. Only
Paths.dirs
That is dirname
It is, indeed! Which is what this snippet does:
Copy code
dir_to_filenames = group_by_dir(go_paths.files)
  dirs_with_go_files = []
  for dir, filenames in dir_to_filenames.items():
     if any(filename.endswith(".go") for filename in filenames):
         dirs_with_go_files.append(dir)
But I'm proposing it would be useful for the engine to calculate that for you already, rather than all this Python support code
If you want the current behavior of "all directory paths", you could still get that with my proposed change thanks to extra Python code similar to above. Both semantics are 100% achievable to calculate currently via extra Python code. But which is more useful to have pre-computed, vs. requiring the extra steps?