< bored art 40741> hey what was the theory on falling back f Pants #development

<@U053HNE90>: hey! what was the theory on falling ...

witty-crayon-22786

10/21/2021, 4:55 AM

@bored-art-40741: hey! what was the theory on falling back from types to packages here: https://github.com/pantsbuild/pants/blob/feae4861a17ff3ce0d9e85ffe808810bbe2b4786/[…]/pants/backend/java/dependency_inference/package_prefix_tree.py

witty-crayon-22786

10/21/2021, 5:00 AM

is that based on some aspect of how imports are usually resolved that i’m missing? because it seems like the impact is that if we don’t find an exact type match, we’ll match ~any file that exposes a package… and that’s pretty fuzzy

witty-crayon-22786

10/21/2021, 5:00 AM

cc @fast-nail-55400

witty-crayon-22786

10/21/2021, 6:51 PM

@fast-nail-55400: hm, yea: i suspect that we should stop doing this. you commented to that effect in https://github.com/pantsbuild/pants/pull/13242, but in larger repos it ends up pulling in the wrong things.

bored-art-40741

10/21/2021, 9:53 PM

The type -> package fallback is at least partially to support wildcard imports

bored-art-40741

10/21/2021, 9:54 PM

The idea was: • If you find an exact type match, use that as a source-level dep. • If not, but you see an exact package match, depend on all sources that export that package • Do NOT treat packages hierarchical, i.e. don't depend on all suffixes of a given package. Package names are treated as being strictly opaque strings.

bored-art-40741

10/21/2021, 9:55 PM

It was also designed without FQT reference matching in mind, i.e. it assumed you knew for sure the FQT you were matching, as opposed to checking if maybe an unqualified reference is from the same package, or a wildcard import, etc.

fast-nail-55400

10/21/2021, 9:56 PM

whereas the code I wrote for unqualified types tries to see if the types originate in a particular package

bored-art-40741

10/21/2021, 9:56 PM

Yeah. Now, that data structure was designed with this in mind, I just didn't add an interface for it

bored-art-40741

10/21/2021, 9:56 PM

Or rather, IIRC, I did but then I deleted it before pushing because it was unused

bored-art-40741

10/21/2021, 9:57 PM

But you can easily expose an interface to check the underlying type map for a match, and those matches will always be exact

witty-crayon-22786

10/21/2021, 10:08 PM

Do NOT treat packages hierarchical, i.e. don’t depend on all suffixes of a given package. Package names are treated as being strictly opaque strings.

isn’t this the impact of recursively walking upward though?

bored-art-40741

10/21/2021, 10:09 PM

That's only on the symbol being interpreted; the exported symbols (and packages) have no notion of hierarchy

witty-crayon-22786

10/21/2021, 10:41 PM

ok, i think i see where this is going.

witty-crayon-22786

10/21/2021, 10:42 PM

but i don’t think it ever makes sense to do this for first-party symbols, because we know their exact declared types.

witty-crayon-22786

10/21/2021, 10:42 PM

so first party code should only ever do an exact type match.

witty-crayon-22786

10/21/2021, 10:43 PM

but we might want to do it for 3rdparty code, since we will likely define our 3rdparty module mapping to only record/expose packages, rather than types

witty-crayon-22786

10/21/2021, 10:43 PM

@fast-nail-55400: so should i remove the package search in #13319, or should we make it dependent on how something is added to the mapping whether it is included as a package, or as a precise type?

witty-crayon-22786

10/21/2021, 10:45 PM

(…i think that i already know the answer to that.)

fast-nail-55400

10/21/2021, 10:45 PM

my current thinking for the third-party package mapping is just dictionary of package to artifact group/name (w/o version), similar to what foursquare’s buildgen plugin did in v1

fast-nail-55400

10/21/2021, 10:45 PM

then pants can suggest adding a jvm_artifact if none exists or infer an existing jvm_artifact that matches

witty-crayon-22786

10/21/2021, 10:45 PM

yea. so i think that the mapping should allow for adding packages, but only explicitly, rather than implicitly when a precise type is added.

fast-nail-55400

10/21/2021, 10:46 PM

the mapping would initially be packages only and would be explicit

witty-crayon-22786

10/21/2021, 10:46 PM

then pants can suggest adding a jvm_artifact if none exists or infer an existing jvm_artifact that matches

definitely the latter bit… we don’t do the former for Python, i think because we can’t precisely determine which things are part of the stdlib.

👍 1

witty-crayon-22786

10/21/2021, 10:47 PM

@fast-nail-55400: ok. i’ll adjust https://github.com/pantsbuild/pants/pull/13319 to make package use explicit when things are added.

fast-nail-55400

10/21/2021, 10:47 PM

the foursquare mapping data looks like a trie and let’s certain package paths be skipped

fast-nail-55400

10/21/2021, 10:48 PM

(a similar feature might help with stdlib inference since we could skip the relevant parts of javax that are stdlib for example)

witty-crayon-22786

10/21/2021, 10:51 PM

@fast-nail-55400: so, regarding `addresses_for_exact_symbol`: i think that we do actually want to match packages there, as long as packages have only been declared for code where we don’t have exact symbol data

fast-nail-55400

10/21/2021, 10:52 PM

like for a wildcard import from a first-party package?

witty-crayon-22786

10/21/2021, 10:52 PM

witty-crayon-22786

10/21/2021, 10:52 PM

for thirdparty code

fast-nail-55400

10/21/2021, 10:52 PM

the mapping structure is (currently) only for first-party symbols though

witty-crayon-22786

10/21/2021, 10:53 PM

shouldn’t need to be

witty-crayon-22786

10/21/2021, 10:53 PM

one sec

witty-crayon-22786

10/21/2021, 10:53 PM

the PR edit:

This change also adjusts
PackageRootedDependencyMap
so that `package`s are only added explicitly, as an alternative to precise types. Firstparty code should always add precise types, while thirdparty code will likely only add packages.

witty-crayon-22786

10/21/2021, 10:54 PM

so then looking up a precise type would only ever encounter a package if there was thirdparty code for that package

fast-nail-55400

10/21/2021, 10:54 PM

I thought we rejected fetching jars though. So there is no third-party analysis to fill in.

witty-crayon-22786

10/21/2021, 10:55 PM

https://pantsbuild.slack.com/archives/C0D7TNJHL/p1634856381033600?thread_ts=1634792153.000700&cid=C0D7TNJHL

witty-crayon-22786

10/21/2021, 10:55 PM

that is a mapping from package to address… so the same thing

fast-nail-55400

10/21/2021, 10:55 PM

yes as a constant like in

default_module_mapping.py

witty-crayon-22786

10/21/2021, 10:55 PM

sure. and the result is a mapping from package to Address

fast-nail-55400

10/21/2021, 10:55 PM

and it wouldn’t be carried in

PackageRootedDependencyMap

witty-crayon-22786

10/21/2021, 10:55 PM

why not?

fast-nail-55400

10/21/2021, 10:56 PM

it is a mapping trie from package path components to the Coordinate to use.

fast-nail-55400

10/21/2021, 10:56 PM

the derived data structure is going to be a ThirdPartyArtifactMapping

fast-nail-55400

10/21/2021, 10:56 PM

no need to overload PackageRootedDependencyMap

fast-nail-55400

10/21/2021, 10:57 PM

which is incorporated into FirstPartyJavaPackageMapping as the sole member

witty-crayon-22786

10/21/2021, 10:57 PM

how does that connect to jvm_artifact though?

witty-crayon-22786

10/21/2021, 10:57 PM

how this works on python is that each artifact declares its package mapping

fast-nail-55400

10/21/2021, 10:57 PM

the derived data structure is going to be a ThirdPartyArtifactMapping

that’s exactly what will be derived

witty-crayon-22786

10/21/2021, 10:57 PM

and the default mapping is what is used when it isn’t specified

fast-nail-55400

10/21/2021, 10:58 PM

this is judgment call and I choose separate data structure

fast-nail-55400

10/21/2021, 10:58 PM

then the third-party inference rule will not depend on any of the first-party code

witty-crayon-22786

10/21/2021, 10:59 PM

you always have to compute both

witty-crayon-22786

10/21/2021, 10:59 PM

*shrug

fast-nail-55400

10/21/2021, 10:59 PM

then why make any of it configurable (i.e., able to turn off first-party inference)?

witty-crayon-22786

10/21/2021, 10:59 PM

alright. so you’re saying that i should remove packages from

PackageRootedDependencyMap

because you’re going to move that to a thirdparty only class?

witty-crayon-22786

10/21/2021, 11:00 PM

@fast-nail-55400: it’s not first party vs third party inference. it’s extracting consumed* symbols in various places

fast-nail-55400

10/21/2021, 11:00 PM

my point is, if you turn off first-party symbol inference, there is no need to calculate exported symbols from first-party code

witty-crayon-22786

10/21/2021, 11:01 PM

that’s not the granularity it would be turned off at though.

fast-nail-55400

10/21/2021, 11:01 PM

but yes, remove packages from PackageRootedDependencyMap and make that data structure first-party only

👍 1

witty-crayon-22786

10/21/2021, 11:01 PM

fast-nail-55400

10/21/2021, 11:02 PM

also separating the concerns seems more readable to me since each data structure will do one thing (and hopefully do it well)

fast-nail-55400

10/21/2021, 11:02 PM

I’ve already been a bit confused by figuring out what use cases PackageRootedDependencyMap covered

witty-crayon-22786

10/21/2021, 11:02 PM

for posterity: these are Python’s options: https://github.com/pantsbuild/pants/blob/b6e51a5aa9250b18b0297cb070189aec044067e1/src/python/pants/backend/python/dependency_inference/rules.py#[…]8 … they’re about where you extract a symbol to look for, rather than where to search for symbols (because not searching for symbols in thirdparty code or firstparty code wouldn’t be useful)

fast-nail-55400

10/21/2021, 11:03 PM

so my main motivation is my understanding of how it all works

witty-crayon-22786

10/21/2021, 11:04 PM

ok. https://github.com/pantsbuild/pants/pull/13319 is ready for review then

fast-nail-55400

10/21/2021, 11:17 PM

👀

bored-art-40741

10/21/2021, 11:20 PM

What’s your solution for first party wildcard imports?

witty-crayon-22786

10/21/2021, 11:21 PM

@bored-art-40741: we will have precise symbols exposed for the package behind the wildcard

witty-crayon-22786

10/21/2021, 11:21 PM

and Tom’s change in https://github.com/pantsbuild/pants/pull/13313 looks up precise symbols for FQTs

bored-art-40741

10/21/2021, 11:21 PM

Ah, and you’ll try to map unqualified symbols in the source to precise symbols?

witty-crayon-22786

10/21/2021, 11:22 PM

yea

bored-art-40741

10/21/2021, 11:22 PM

Cool

bored-art-40741

10/21/2021, 11:22 PM

And static/static-wildcard imports?

witty-crayon-22786

10/21/2021, 11:27 PM

good question… haven’t gotten there yet. but perhaps that is a separate symbol type that is looked up. the challenge with the current approach is that anything that pulls in an entire first party package will be ambiguous because we don’t know which file you’re depending on.

witty-crayon-22786

10/21/2021, 11:28 PM

so a wildcard needs special treatment compared to other lookups.

fast-nail-55400

10/21/2021, 11:29 PM

I imagine some type solving may help if we can attribute which symbols were actually referenced by the wildcard import

fast-nail-55400

10/21/2021, 11:29 PM

but I’ve been preferring to punt on type solving for now

witty-crayon-22786

10/21/2021, 11:29 PM

yea.

fast-nail-55400

10/21/2021, 11:30 PM

and for now we could recommend in docs that wildcard imports will lead to suboptimal results

fast-nail-55400

10/21/2021, 11:30 PM

at least for v1 of this

witty-crayon-22786

10/21/2021, 11:30 PM

yep.

bored-art-40741

10/21/2021, 11:33 PM

(These would be overspecified types, and deeply nested precise types, respectively)

2 Views

Open in Slack

Previous Next