# development
w
@ancient-vegetable-10556: i expect that the native parser in https://github.com/pantsbuild/pants/pull/12890 will be much faster
to your question about matching against known packages on the classpath though: inference cannot consume the classpath
a
Sure, I haven’t gone in and looked at what symbols that parser pulls out for (a) type declarations, and (b) method calls
w
(except for 3rdparty deps, presumably)
a
Knowing that we can’t consume the classpath, I feel like we can, at the very least, maintain a list of known package prefixes
w
yea, that is true.
https://github.com/pantsbuild/pants/pull/12890 relies on that to some degree by extracting `package` statements. i'm about to comment about it.
a
If we have a list of known package prefixes, then matching FQTs against that should be possible to a certain extent
w
possibly. after you’ve actually extracted them you can implement resolution.
but… i’m also 100% fine with those not being inferred in the medium term.
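For concreteness, here's a minimal sketch of what that resolution step could look like, assuming the known prefixes are collected from extracted `package` statements. The class and method names are hypothetical, not Pants APIs:

```java
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch, not Pants code: resolve a candidate fully-qualified
// type against package prefixes extracted from first-party `package` statements.
final class PrefixResolver {
    // known package -> the source file that declares it,
    // e.g. "foo.bar" -> "src/foo/bar/A.java"
    private final TreeMap<String, String> knownPackages = new TreeMap<>();

    void addPackage(String packageName, String sourceFile) {
        knownPackages.put(packageName, sourceFile);
    }

    // Longest-prefix match: "foo.bar.Baz" resolves if "foo.bar" (or "foo") is known.
    Optional<String> resolve(String candidateFqt) {
        String prefix = candidateFqt;
        int dot;
        while ((dot = prefix.lastIndexOf('.')) >= 0) {
            prefix = prefix.substring(0, dot);
            String owner = knownPackages.get(prefix);
            if (owner != null) {
                return Optional.of(owner);
            }
        }
        return Optional.empty(); // unknown prefix: fall back to explicit deps
    }
}
```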
a
I believe that Maven metadata has a list of packages exported by a dependency… but if we have a lockfile in place, then we can resolve that lockfile and map out the packages in each dependency before doing source analysis
w
> I believe that Maven metadata has a list of packages exported by a dependency
yea, JDK9 module mappings do (too?)
which might ease 3rdparty “export” extraction.
cc @bored-art-40741 for later
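For reference, the JDK 9+ module mappings mentioned above live in a module descriptor, whose `exports` directives enumerate exactly which packages an artifact exposes. The module and package names below are made up:

```java
// module-info.java for a hypothetical 3rdparty artifact. A tool could read
// the `exports` directives to map package prefixes back to the owning dep.
module com.example.widgets {
    requires java.sql;
    exports com.example.widgets.api;
    exports com.example.widgets.model;
}
```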
b
Yeah, so I've been thinking about this a bunch lately and have some fragments of opinions
Or at least, a bunch of rakes I'm worried we might step on
Just as an observation from spending a bunch of time trying to get all FQT references out of a single Java source: it's not easy. I believe at some point it's inherently ambiguous if you don't have the real classpath available. Prefix matching seems like it should do the job, but I also observed both Spoon and Javaparser giving me back the root prefix of a package as a potential symbol (when the actual usage was a FQT like `java.util.Date` or similar), which obviously isn't any good for dep inference, but it also isn't trivial to distinguish between that case and a "real" use like `somepackage.Foo`.
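Here's a contrived but compilable illustration of that ambiguity (all names invented): the exact same token sequence can be a type reference or a field-access chain, and only real resolution against the classpath can tell them apart.

```java
class Obscured {
    static class Inner { int Date = 42; }
    static class Holder { Inner util = new Inner(); }
    static Holder java = new Holder(); // a field literally named `java`

    void demo() {
        // Type context: resolves to the JDK class java.util.Date.
        java.util.Date d = new java.util.Date();
        // Expression context: the same tokens resolve to the field chain
        // Obscured.java -> util -> Date, an int. Without symbol resolution,
        // a parser cannot tell these apart.
        int n = java.util.Date;
        System.out.println(d + " " + n);
    }
}
```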
There are also a lot of directions we can go on this in terms of how opinionated we are, what we consider to be the source of truth, whether we assume our dep inference is always the full truth, etc., and the right direction probably depends heavily on the specific types of users we're targeting. My general inclination is that there is way too much code out there doing weird stuff for us to ever claim 100% coverage with our dep analysis, and we're therefore always going to have to maintain an escape hatch in the form of explicitly provided dependencies. So we should also design against that and think about when a particular case is rare enough that it doesn't justify going down a deep rabbit hole, and we instead point the user at explicitly provided deps. I'm kind of leaning right now toward doing that with FQTs
Or possibly we can cover like 95% of FQTs and limit our failures to false negatives, which is even better (but we have to be careful to not generate false positives, which don't have an escape hatch as far as I'm aware)
Another direction we could potentially go is to declare modules as The Future ™️ and just say that's the cut: if you don't use modules, you need to manually manage your Java deps in Pants; if you do, we take care of it implicitly.
I actually think this is a pretty viable approach and it means effort spent toiling on language specific parsing can instead be spent on fighting our common enemy: third party dep resolution
The code that doesn't use modules will shrink over time. That said, I have no real sense of how widely adopted modules are in the Java ecosystem right now
Another thing to keep in mind about dependency analysis that I keep having to remind myself: you always depend on the mapping of exported symbols to targets. That mapping is inherently global, so it needs to be computed quickly and cached aggressively
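As a sketch of that mapping's shape (hypothetical names, not Pants internals; in Pants this would be computed by rules and memoized by the engine):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: the global mapping from exported symbols (or package
// prefixes) to the targets that provide them. Every dep-inference query reads
// it, so it must be built once over all sources and cached aggressively.
final class SymbolToTargetIndex {
    private final Map<String, Set<String>> providers = new HashMap<>();

    void record(String symbol, String targetAddress) {
        providers.computeIfAbsent(symbol, k -> new TreeSet<>()).add(targetAddress);
    }

    Set<String> providersOf(String symbol) {
        return providers.getOrDefault(symbol, Set.of());
    }

    // Two targets exporting the same symbol is an ambiguity to surface to
    // the user rather than guess at.
    boolean isAmbiguous(String symbol) {
        return providersOf(symbol).size() > 1;
    }
}
```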
Also, tests: IIRC, a common pattern in Java is for test source to have the same package as the code being tested. If we don't get down to the type/symbol level of dep analysis, we're going to need to special case the source/target type of test code to make sure it doesn't get yanked into a package hairball
w
Re: tests: You don't think file-level is sufficient for that?
b
Not if you hairball to package level for dependency analysis, right?
Like if you have `A.java`, `B.java`, and `ATest.java`, all with `package foo.bar`, dep analysis with package-prefix level granularity will say that all 3 are in the same coarsened hairball
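Spelled out, that layout looks like this (contents trimmed to the relevant lines):

```java
// src/foo/bar/A.java
package foo.bar;
public class A {}

// src/foo/bar/B.java
package foo.bar;
public class B {}

// src/foo/bar/ATest.java -- test source, same package by common convention
package foo.bar;
public class ATest {}

// Package-prefix granularity maps all three files to "foo.bar", so they land
// in one coarsened unit; file/type-level analysis could instead record only
// ATest -> A.
```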
w
> My general inclination is that there is way too much code out there doing weird stuff for us to ever claim 100% coverage with our dep analysis, and we're therefore always going to have to maintain an escape hatch in the form of explicitly provided dependencies. So we should also design against that and think about when a particular case is rare enough that it doesn't justify going down a deep rabbit hole, and we instead point the user at explicitly provided deps. I'm kind of leaning right now toward doing that with FQTs
Yeah explicitly provided deps are absolutely a valid way to avoid potentially ambiguous situations and we shouldn't be afraid of them.
I'm less sure that modules are a panacea, although we should use them if they're available. If we can, I'd sort of rather generate modules than require them for first-party code.
Having said that, explicitly specifying your exports is good and probably necessary.
Re: tests again: I'm not sure what hairball means as a verb, heh. But I don't know why we would need to coarsen to the package level? Unless there was a legitimate cycle between library and test code...
b
Well, we need to know which sources export which types, and how to properly infer FQTs (not just imports). Which is beyond what I've implemented so far, though the former is likely not too difficult
So far, we only know which package a source declares
w
got it. right.
But yea: investing lots of time in relative imports is a lower priority than figuring out per-file declared types to avoid pulling in entire packages. Not to mention the fact that it should be much easier!