Re <https://github.com/pantsbuild/pants/issues/137...
# development
a
Re https://github.com/pantsbuild/pants/issues/13741 — I feel that our JVM dependency inference parsers may benefit from knowing what (partial) packages Pants knows about. Motivation in thread
If I have the following (ignoring that
java.util.function
is always an implicit dependency):
Copy code
return java.util.function.Function.identity().apply("Hello, World!");
then JavaParser thinks that
java.util
and
java.util.function
and
java.util.function.Function
are all “field access” expressions, so it’s not immediately clear at which point an actual type is accessed. I could theoretically make it output
java
,
java.util
,
java.util.function
and
java.util.function.Function
as consumed types, but it’s not clear to me if this is wasteful. (cc @fast-nail-55400 @witty-crayon-22786)
w
@ancient-vegetable-10556: on the scala side, i think that the approach has been to expose all declared* symbols
a
HANDY!
w
…rather than only types
a
I’ll do that and we’ll see what happens 🙂
the main thing is that we don’t know what’s a type and what’s a symbol
w
on the consuming side, right
a
right
w
on the producing side, the scala extractor doesn’t differentiate (so far?) because the consumer can’t know anyway
a
Hang on, I think I confused
our java parser would need to output all symbols, because it can’t tell the difference between symbols and types
w
all declared symbols, yes
a
the consuming code in Pants is more informed about what’s a type or a package
w
(…public, presumably)
sorry, by “consuming side” i meant the consumer of symbols… the file that is importing or referencing the symbol
a
OK. Yes, when we parse consuming code, we can’t tell which is a package and which is a type, so we’d have to say everything is a
consumedType
, and Pants reconciles that with what it knows about the universe of packages and types
w
yep!
a
which includes for
java.util.function.Function
:
java
java.util
java.util.function
java.util.function.Function
w
hm, not quite? it’s declared symbols, right?
a
What do you mean?
w
are you talking about the file that declares
Function
?
a
No.
I am talking about writing code that consumes
Function
as a FQTN
w
ok. the code that consumes function will have a
consumed_type
of just
java.util.function.Function
(ideally)
a
Right. We don’t have the information to definitively infer that.
w
we’d then look it up in the firstparty mapping (with full symbol matches) and the thirdparty mapping with package/fuzzy mapping
@ancient-vegetable-10556: ah. you mean that
consumed_type
would need to become
consumed_symbol
instead…?
a
Yes. We only have enough information to say that
java
is a name, and
java.util
.
java.util.function
,
java.util.function.Function
are _field_s of their respective parent scopes
w
that, and it’s maybe partially qualified, as handled in https://github.com/pantsbuild/pants/pull/13557
a
so we have concepts of scopes, rather than types, packages, or classes
w
and 
java.util
java.util.function
 , 
java.util.function.Function
 are _field_s of their respective parent scopes
yea, sure. but calling the whole thing (
java.util.function.Function
) a symbol is accurate i think. and that’s what we do for scala
as is
java.util.function.Function.apply
or whatever other methods
a
java.util.function.Function.identity().apply("Hello, World!");
You can get to
java.util.Function
and say that’s a consumed type, probably. If it were inside a nested type, things get more complicated
w
well, you can: or you can extract the symbol
java.util.function.Function.identity
a
we don’t need to
w
Tom’s in a meeting, but he’ll probably be able to more accurately discuss this… i’ll let him 😃
a
because
java.util.function.Function.identity
doesn’t describe anything we can infer. But
j.u.f.F
is something we can infer. If there were a nested class in there, like your enum case, things get considerably more complicated
because we can’t tell the difference between a field of a class or a field of a package
w
a
that doesn’t help on the consumption side though?
w
depends what you can extract on the consumption side… maybe. i don’t know. in scala we’re able to extract it on the consumption side, afaik.
f
stu has an idea of looking at capitalization as a heuristic to infer what is a symbol to consider
w
i guess that between 13741, 13745, and 13750, the difference is that the latter two
import
types and symbols (respectively), while the first one is about FQT+method without an import statement.
a
It’s all the same — the issue is about determining what’s a relevant field access to output
f
so with
java.util.function.Function.identity
, split on
.
and then work in reverse and strip off components until reach the first one that starts with a capital
a
@fast-nail-55400 that is my plan, it will only work where classes aren’t nested
f
hmm, in that case, you’d have multiple components that starting with capitals.
w
wait wait wait. if you manage to extract
java.util.function.Function.identity
, then you’re golden, right?
that is a symbol. so if you are also producing all declared symbols, you have that precise thing in your mapping
f
so with
org.pantsbuild.example.Outer.Inner.apply
, maybe generate
org.pantsbuild.example.Outer.Inner
and
org.pantsbuild.example.Outer
as symbols
a
@witty-crayon-22786 we can only produce declared symbols for first party code
f
basically all prefixes where the last component of the prefix starts with a capital
a
which is next to useless
w
@ancient-vegetable-10556: we only need to… thirdparty mappings are wildcarded into packages
a
BUT WE DON’T KNOW WHAT IS A PACKAGE
w
org.pantsbuild.**
matches
org.pantsbuild.example.Outer.Inner.apply
so you don’t need to know.
a
I suspect this is going to break things for partially qualified types/static accesses, but I think I have enough to go off from Tom
w
what’s the plan then…?
a
w
that’s the opposite of what we do on the scala side, which is a bit odd… on the scala side we produce more symbols, and then the consumer consumes the precise symbol
but if you two are in agreement, then ok.
f
or we could just do that more symbols thing. frankly my brain is not recalling exactly what we do.
a
I think this one is a smaller change to get a win, whereas exporting more symbols will require a complete rethink of the parser
f
whatever works though!
w
keep in mind also that with:
with 
org.pantsbuild.example.Outer.Inner.apply
, maybe generate 
org.pantsbuild.example.Outer.Inner
  and 
org.pantsbuild.example.Outer
 as symbols
…you’re end up multiplying by the number of wildcard imports slash the possibility that something isn’t fully qualified
a
at the moment we only export “Top level types”. Exporting more symbols will deliver more wins with nested types and such, but it’s complicated
f
at the moment we only export “Top level types”. Exporting more symbols will deliver more wins with nested types and such, but it’s complicated
whereas with Scala, I ended up writing it so nested types are exported as “provided symbols” — something that I don’t think I did with Java.
☝️ 1
w
mm, one more point: that won’t fix https://github.com/pantsbuild/pants/issues/13750 unless we also adjust scala to use a truncating heuristic (whereas exposing symbols consistently in both parsers would fix all three)
f
(mostly because I was learning what to do with the problem as I went along)
a
So yes, I understand what scala is doing, it’s just a big pile of java implementation for something we consider 90%
I’m happy to do it, but the heursistic Tom suggests fits in well with the existing structure
w
then we should consider punting on the whole cluster perhaps… each of these three tickets was a “one instance” case
and we can mash them together into a single ticket to do later
because meanwhile we do have things that we can tackle for M2
a
OK, that makes sense. I’ll create a parent ticket
https://github.com/pantsbuild/pants/issues/13765 <--- Here’s that ticket. I figured one or two new cases that would benefit from symbol exports, which is handy.
👍 1
@witty-crayon-22786 Is the suggestion here that I should start looking through M2 items?
w
yea, i think so? we can revisit these if we’re seeing them even more in our upcoming test repos
sorry, was in a meeting
a
Makes sense. None of the dependencies there are particularly perverse, so it makes sense to do them at some point, but I would like more repro cases to emerge first
👍 1