<@U053HNE90>: hey! what was the theory on falling ...
# development
w
is that based on some aspect of how imports are usually resolved that i’m missing? because it seems like the impact is that if we don’t find an exact type match, we’ll match ~any file that exposes a package… and that’s pretty fuzzy
cc @fast-nail-55400
@fast-nail-55400: hm, yea: i suspect that we should stop doing this. you commented to that effect in https://github.com/pantsbuild/pants/pull/13242, but in larger repos it ends up pulling in the wrong things.
b
The type -> package fallback is at least partially to support wildcard imports
The idea was: • If you find an exact type match, use that as a source-level dep. • If not, but you see an exact package match, depend on all sources that export that package • Do NOT treat packages hierarchical, i.e. don't depend on all suffixes of a given package. Package names are treated as being strictly opaque strings.
It was also designed without FQT reference matching in mind, i.e. it assumed you knew for sure the FQT you were matching, as opposed to checking if maybe an unqualified reference is from the same package, or a wildcard import, etc.
f
whereas the code I wrote for unqualified types tries to see if the types originate in a particular package
b
Yeah. Now, that data structure was designed with this in mind, I just didn't add an interface for it
Or rather, IIRC, I did but then I deleted it before pushing because it was unused
But you can easily expose an interface to check the underlying type map for a match, and those matches will always be exact
w
Do NOT treat packages hierarchical, i.e. don’t depend on all suffixes of a given package.  Package names are treated as being strictly opaque strings.
isn’t this the impact of recursively walking upward though?
b
That's only on the symbol being interpreted; the exported symbols (and packages) have no notion of hierarchy
w
ok, i think i see where this is going.
but i don’t think it ever makes sense to do this for first-party symbols, because we know their exact declared types.
so first party code should only ever do an exact type match.
but we might want to do it for 3rdparty code, since we will likely define our 3rdparty module mapping to only record/expose packages, rather than types
@fast-nail-55400: so should i remove the package search in #13319, or should we make it dependent on how something is added to the mapping whether it is included as a package, or as a precise type?
(…i think that i already know the answer to that.)
f
my current thinking for the third-party package mapping is just dictionary of package to artifact group/name (w/o version), similar to what foursquare’s buildgen plugin did in v1
then pants can suggest adding a jvm_artifact if none exists or infer an existing jvm_artifact that matches
w
yea. so i think that the mapping should allow for adding packages, but only explicitly, rather than implicitly when a precise type is added.
f
the mapping would initially be packages only and would be explicit
w
then pants can suggest adding a jvm_artifact if none exists or infer an existing jvm_artifact that matches
definitely the latter bit… we don’t do the former for Python, i think because we can’t precisely determine which things are part of the stdlib.
👍 1
@fast-nail-55400: ok. i’ll adjust https://github.com/pantsbuild/pants/pull/13319 to make package use explicit when things are added.
f
the foursquare mapping data looks like a trie and let’s certain package paths be skipped
(a similar feature might help with stdlib inference since we could skip the relevant parts of javax that are stdlib for example)
w
@fast-nail-55400: so, regarding `addresses_for_exact_symbol`: i think that we do actually want to match packages there, as long as packages have only been declared for code where we don’t have exact symbol data
f
like for a wildcard import from a first-party package?
w
no
for thirdparty code
f
the mapping structure is (currently) only for first-party symbols though
w
shouldn’t need to be
one sec
the PR edit:
This change also adjusts
PackageRootedDependencyMap
so that `package`s are only added explicitly, as an alternative to precise types. Firstparty code should always add precise types, while thirdparty code will likely only add packages.
so then looking up a precise type would only ever encounter a package if there was thirdparty code for that package
f
I thought we rejected fetching jars though. So there is no third-party analysis to fill in.
that is a mapping from package to address… so the same thing
f
yes as a constant like in
default_module_mapping.py
w
sure. and the result is a mapping from package to Address
f
and it wouldn’t be carried in
PackageRootedDependencyMap
w
why not?
f
it is a mapping trie from package path components to the Coordinate to use.
the derived data structure is going to be a ThirdPartyArtifactMapping
no need to overload PackageRootedDependencyMap
which is incorporated into FirstPartyJavaPackageMapping as the sole member
w
how does that connect to jvm_artifact though?
how this works on python is that each artifact declares its package mapping
f
the derived data structure is going to be a ThirdPartyArtifactMapping
that’s exactly what will be derived
w
and the default mapping is what is used when it isn’t specified
f
this is judgment call and I choose separate data structure
then the third-party inference rule will not depend on any of the first-party code
w
you always have to compute both
*shrug
f
then why make any of it configurable (i.e., able to turn off first-party inference)?
w
alright. so you’re saying that i should remove packages from
PackageRootedDependencyMap
because you’re going to move that to a thirdparty only class?
@fast-nail-55400: it’s not first party vs third party inference. it’s extracting consumed* symbols in various places
f
my point is, if you turn off first-party symbol inference, there is no need to calculate exported symbols from first-party code
w
that’s not the granularity it would be turned off at though.
f
but yes, remove packages from PackageRootedDependencyMap and make that data structure first-party only
👍 1
w
k
f
also separating the concerns seems more readable to me since each data structure will do one thing (and hopefully do it well)
I’ve already been a bit confused by figuring out what use cases PackageRootedDependencyMap covered
w
for posterity: these are Python’s options: https://github.com/pantsbuild/pants/blob/b6e51a5aa9250b18b0297cb070189aec044067e1/src/python/pants/backend/python/dependency_inference/rules.py#[…]8 … they’re about where you extract a symbol to look for, rather than where to search for symbols (because not searching for symbols in thirdparty code or firstparty code wouldn’t be useful)
f
so my main motivation is my understanding of how it all works
w
f
👀
b
What’s your solution for first party wildcard imports?
w
@bored-art-40741: we will have precise symbols exposed for the package behind the wildcard
and Tom’s change in https://github.com/pantsbuild/pants/pull/13313 looks up precise symbols for FQTs
b
Ah, and you’ll try to map unqualified symbols in the source to precise symbols?
w
yea
b
Cool
And static/static-wildcard imports?
w
good question… haven’t gotten there yet. but perhaps that is a separate symbol type that is looked up. the challenge with the current approach is that anything that pulls in an entire first party package will be ambiguous because we don’t know which file you’re depending on.
so a wildcard needs special treatment compared to other lookups.
f
I imagine some type solving may help if we can attribute which symbols were actually referenced by the wildcard import
but I’ve been preferring to punt on type solving for now
w
yea.
f
and for now we could recommend in docs that wildcard imports will lead to suboptimal results
at least for v1 of this
w
yep.
b
(These would be overspecified types, and deeply nested precise types, respectively)