Pants #development

ancient-vegetable-10556

09/20/2021, 6:16 PM

I’ve just had a chat with @fast-nail-55400 about Java dependency inference, and I think we’ve got a way forward on inference for fully-qualified type names (FQTNs); I’ll be documenting in the thread

ancient-vegetable-10556

09/20/2021, 6:34 PM

Problem: FQTNs are easily detectable by regexp, but are ambiguous with long dereference chains, e.g.

public void beep(ToolSetup com) {
  System.out.println(com.toolchain.you.Get.the.picture);
}

You can figure out if `com.toolchain.you.Get` is a real type if you know the real list of classes that are available on the classpath, but Pants’ approach is to not specify the entire dependency list, especially if we’re going to make Fat JARs.

Proposed Solution: When resolving the lockfile for a project, we walk the `zip` index of each dependency JAR we pull down. At this point, we can note the packages that are included with each locked JAR, and save that to a metadata file. At dependency scan time, we can search for strings of the form `x.y…z` and look for prefixes that match a known package name, and include the JARs that provide that package on the classpath.

This will produce an overspecified classpath (if there are package prefixes that are ambiguous with variable names, we may specify more dependency classes than are actually there), but AIUI, overspecifying a classpath is not a correctness issue, but a file size issue.

Any thoughts before I commit this to a GH issue? @witty-crayon-22786 @bored-art-40741

witty-crayon-22786

09/20/2021, 6:46 PM

using either the classpath or JDK 9 modules to do this sounds reasonable, yea. i expect that Patrick had already put some thought into it based on the discussion in https://pantsbuild.slack.com/archives/C0D7TNJHL/p1631728289220300

witty-crayon-22786

09/20/2021, 6:48 PM

as mentioned there: i think that “relative imports” / FQTs are a lower priority than code that actually imports things, and would be totally fine not catching inline imports in a first version if we agree that it encourages good code hygiene to prefer imports

ancient-vegetable-10556

09/20/2021, 6:49 PM

I think the key observation is that we’re probably fine with an overspecified dependency spec as long as it captures everything

witty-crayon-22786

09/20/2021, 6:50 PM

> we’re probably fine with an overspecified dependency spec as long as it captures everything

hmm… what do you mean by that?

witty-crayon-22786

09/20/2021, 6:51 PM

oh, got it. referring to your comment

witty-crayon-22786

09/20/2021, 6:52 PM

when in doubt we should bias toward under specified: although we do have syntax for excluding dependencies that have been inferred, i think that it is much better to get an error from a compiler that something isn’t present, than no warning at all that you are pulling in something that you don’t actually depend on

witty-crayon-22786

09/20/2021, 6:54 PM

i think that it’s really important to minimize magic, and avoiding approaches that have false positives is an important part of that

➕ 1

bored-art-40741

09/20/2021, 10:02 PM

I agree with Stu: false positives are much worse than false negatives

➕ 1

bored-art-40741

09/20/2021, 10:03 PM

Explicit dependencies provide an easy to understand escape hatch for fixing a dep that dep inference missed. There is no such escape hatch for fixing a dep that inference incorrectly inferred, and it's really difficult to even detect such a situation

witty-crayon-22786

09/20/2021, 10:04 PM

> There is no such escape hatch for fixing a dep that inference incorrectly inferred

there is: we support an exclusion syntax in the dependency list. but the other part is true: it’s hard to detect.

ancient-vegetable-10556

09/20/2021, 10:05 PM

Right, the hard to detect bit is definitely a thing.

witty-crayon-22786

09/20/2021, 10:05 PM

but yea: agreed.

bored-art-40741

09/20/2021, 10:05 PM

Also, those false positives in my experience have a tendency to go off into really extreme directions really fast (e.g. pulling in absolutely everything as a dependency), and in that scenario it's actually pretty damaging (best case situation is you just have unnecessarily giant deploy jars)

👍 1

bored-art-40741

09/20/2021, 10:05 PM

In that scenario too, even the exclusion escape hatch is infeasible

bored-art-40741

09/20/2021, 10:06 PM

There are many more dependencies in the world that you might need to explicitly cut than there are dependencies that you might need to explicitly add

👍 1

bored-art-40741

09/20/2021, 10:09 PM

I'm also leaning strongly toward ignoring inline type references that aren't explicitly imported, and potentially heuristically ignoring imports that appear to be "implicit" (e.g. `import Foo.blah` where it looks like it's probably importing from a type already in scope based on the name)

bored-art-40741

09/20/2021, 10:10 PM

There's a bunch of incremental work we could do with phased parse rounds where we add in known immediate dependencies and sources from the same package to get a really strong handle on implicit imports, but I think it's worth seeing how much demand there really is for that (weighed against the difficulty of just adding some explicit deps)

👍 1

ancient-vegetable-10556

09/20/2021, 10:10 PM

So the one concern I have is with the legitimate case of importing two classes with the same name from different packages; I’ll get off this particular horse if we can come up with a good way to find errors and tell people how to fix them

bored-art-40741

09/20/2021, 10:12 PM

Use the FQT inline and add an explicit dep

witty-crayon-22786

09/20/2021, 10:12 PM

right. the compiler will tell you

bored-art-40741

09/20/2021, 10:12 PM

It's the same solution you'd otherwise use, and probably already have used, you just have to help dep inference a little

ancient-vegetable-10556

09/20/2021, 10:13 PM

The compiler will tell you that it’s missing, but it’s the compiler and not Pants that’s giving you the error, while the fix lives in Pants

witty-crayon-22786

09/20/2021, 10:13 PM

yes, but this is a case where we degrade to “the status quo”

witty-crayon-22786

09/20/2021, 10:13 PM

this isn’t a new failure mode: folks who use build systems know this one.

witty-crayon-22786

09/20/2021, 10:14 PM

…which i guess is the whole reasoning behind preferring false negatives: the consequences are relatively low.

bored-art-40741

09/20/2021, 10:15 PM

To be clear, I think there's a solid path forward for narrowly supporting the common case of an inline FQT using the existing infrastructure and just a bit of extra logic. I just think we should draw the line with what we have now because that doesn't seem as high priority as other items

👍 1

ancient-vegetable-10556

09/20/2021, 10:15 PM

fair enough

bored-art-40741

09/20/2021, 10:15 PM

Basically, if you can unambiguously tell that a given reference looks exactly like a FQT (and not just a package) that is exported by another source, then you go ahead and add that dep too

bored-art-40741

09/20/2021, 10:16 PM

That's much more likely now that I have full type export working in the existing code
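A rough sketch of the narrow rule described above, under stated assumptions: `InlineFqtInference`, `inferDeps`, and the `exportedTypes` map (fully-qualified type name to the source file that declares it) are hypothetical and are not the existing Pants infrastructure. A dependency is added only when a prefix of a dotted chain exactly equals an exported type and all such prefixes point at a single source; a match on a package alone adds nothing.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of conservative inline-FQT inference.
final class InlineFqtInference {
    private static final Pattern DOTTED = Pattern.compile(
        "[A-Za-z_$][A-Za-z0-9_$]*(?:\\.[A-Za-z_$][A-Za-z0-9_$]*)+");

    // exportedTypes: fully-qualified type name -> declaring source file (assumed shape).
    static Set<String> inferDeps(String source, Map<String, String> exportedTypes) {
        Set<String> deps = new LinkedHashSet<>();
        Matcher m = DOTTED.matcher(source);
        while (m.find()) {
            String chain = m.group();
            Set<String> owners = new LinkedHashSet<>();
            int end = chain.indexOf('.');
            while (true) {
                // Check each prefix, finishing with the full chain.
                String prefix = (end < 0) ? chain : chain.substring(0, end);
                String owner = exportedTypes.get(prefix);
                if (owner != null) {
                    owners.add(owner);
                }
                if (end < 0) {
                    break;
                }
                end = chain.indexOf('.', end + 1);
            }
            // Bias toward false negatives: skip anything that is not unambiguous.
            if (owners.size() == 1) {
                deps.addAll(owners);
            }
        }
        return deps;
    }
}
```

This is still a text match, so a local variable that shadows the first package segment would produce a false positive, which fits the suggestion of shipping it as an experimental option with instrumentation.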

ancient-vegetable-10556

09/20/2021, 10:17 PM

I think the ambiguity thing is interesting; I feel like most cases where there’ll be ambiguity between an FQT and a package name or dereference chain will tend to be degenerate, given how consistently Java people name things

bored-art-40741

09/20/2021, 10:18 PM

I agree, but degenerate cases have a tendency to ruin your day when the failure mode isn't graceful

bored-art-40741

09/20/2021, 10:18 PM

I think it's doable, just harder, and probably should be introduced as an experimental option with lots of instrumentation

ancient-vegetable-10556

09/20/2021, 10:18 PM

yeah sure

bored-art-40741

09/20/2021, 10:18 PM

For that matter, I'd argue that making sure there's good instrumentation for users of the existing dep inference is a higher priority on its own

bored-art-40741

09/20/2021, 10:19 PM

I'm actually more worried right now about the sort of opposite scenario that I expect to be more common and more difficult to detect: unqualified references to types in the same package

bored-art-40741

09/20/2021, 10:20 PM

The best I have now is: you should still import those for hygiene reasons, and because we need that import for fine grained dep analysis

bored-art-40741

09/20/2021, 10:20 PM

Or perhaps we should have a flag that automatically hairballs all package sources together for people who really want implicit access to types in the same package

ancient-vegetable-10556

09/20/2021, 10:21 PM

Is there a world in which we don’t include all the types in the current level of a given package?

bored-art-40741

09/20/2021, 10:22 PM

Sorry, I don't follow the question

bored-art-40741

09/20/2021, 10:22 PM

"current level"?

ancient-vegetable-10556

09/20/2021, 10:23 PM

Oh. If I have package `com.foo`, I can’t implicitly refer to types in `com.foo.bar` without importing them, right?

bored-art-40741

09/20/2021, 10:23 PM

Correct, it's best to ignore the apparent hierarchy in Java packages. It's a lie

ancient-vegetable-10556

09/20/2021, 10:24 PM

Yeah, I wasn’t sure if “package” was sufficiently descriptive as a term

bored-art-40741

09/20/2021, 10:24 PM

I'm talking about this scenario

ancient-vegetable-10556

09/20/2021, 10:24 PM

yeah, so is there a world in which we don’t pull in an entire package?

bored-art-40741

09/20/2021, 10:24 PM

Yeah, that's what I have written right now

bored-art-40741

09/20/2021, 10:24 PM

What I have now is type-level inference

bored-art-40741

09/20/2021, 10:24 PM

So it'll depend on fine-grained sources within a package

bored-art-40741

09/20/2021, 10:25 PM

Although it also has the analysis available to do package-level dependencies if we want

bored-art-40741

09/20/2021, 10:25 PM

And we're still going to need to special case tests somehow there

bored-art-40741

09/20/2021, 10:25 PM

So, you have `B.java`:

package org.pantsbuild.example;

public class B {}

class C {}

bored-art-40741

09/20/2021, 10:26 PM

and you have `A.java`:

package org.pantsbuild.example;

import org.pantsbuild.example.C;

public class A {
	public static void main(String[] args) throws Exception {
		C c = new C();
	}
}

bored-art-40741

09/20/2021, 10:26 PM

These both compile. Note that A is referring to a package private type in B.java

bored-art-40741

09/20/2021, 10:26 PM

And in fact you don't even need to import C there

ancient-vegetable-10556

09/20/2021, 10:26 PM

Right

bored-art-40741

09/20/2021, 10:26 PM

But if you do, the current analysis does the Right Thing, and A.java depends on B.java but not vice versa
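To make the `A.java` / `B.java` example concrete, here is a small sketch of import-driven, type-level inference. It is an illustration only: `ImportInference`, `inferDeps`, and the `typeToFile` map are hypothetical, and a regex stands in for whatever real parser the analysis uses. Static and wildcard imports are deliberately not matched.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: infer file-level deps from single-type imports.
final class ImportInference {
    // Matches `import a.b.C;` but not `import static ...;` or `import a.b.*;`.
    private static final Pattern IMPORT = Pattern.compile(
        "(?m)^\\s*import\\s+([A-Za-z_$][\\w$]*(?:\\.[A-Za-z_$][\\w$]*)*)\\s*;");

    // typeToFile: fully-qualified type name -> source file declaring it (assumed shape).
    static Set<String> inferDeps(String sourceFile, String sourceText,
                                 Map<String, String> typeToFile) {
        Set<String> deps = new LinkedHashSet<>();
        Matcher m = IMPORT.matcher(sourceText);
        while (m.find()) {
            String provider = typeToFile.get(m.group(1));
            // Unknown types are ignored; a file never depends on itself.
            if (provider != null && !provider.equals(sourceFile)) {
                deps.add(provider);
            }
        }
        return deps;
    }
}
```

Given a map in which both `org.pantsbuild.example.B` and `org.pantsbuild.example.C` resolve to `B.java`, the text of `A.java` yields a dependency on `B.java`, while `B.java` yields none, matching the one-way dependency described above.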

bored-art-40741

09/20/2021, 10:27 PM

I think this mostly solves the test issue organically, if you're willing to import types from the same package in your test

ancient-vegetable-10556

09/20/2021, 10:29 PM

I don’t recall if that’s how I’d normally have written java code

bored-art-40741

09/20/2021, 10:29 PM

I'm pretty sure I've written more Java for analyzing Java source than anything else, so same

bored-art-40741

09/20/2021, 10:30 PM

My first pass at this treated Java packages as hierarchical, and while I got a pretty clean implementation, I realized the false positive failure modes were too bad

bored-art-40741

09/20/2021, 10:31 PM

e.g. if I saw an import for `foo.bar.Baz` but nothing had the package `foo.bar`, I'd end up depending on everything that provided a package with a prefix of `foo. ...`

bored-art-40741

09/20/2021, 10:31 PM

Which, when applied to `com` or `org`, is disastrous
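The failure mode described here can be reproduced with a toy lookup. This is a reconstruction under assumptions, not the first-pass implementation itself: `PackageLookup`, `exact`, and `hierarchical` are hypothetical names, and `sourcesByPackage` maps a declared package to the sources that declare it.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch contrasting exact package lookup with a
// hierarchical fallback that walks up to parent "packages".
final class PackageLookup {
    // Exact lookup: only sources that declare exactly this package.
    static Set<String> exact(String pkg, Map<String, Set<String>> sourcesByPackage) {
        return sourcesByPackage.getOrDefault(pkg, Set.of());
    }

    // Fallback: if nothing matches, retry with the parent prefix and
    // accept everything at or beneath it.
    static Set<String> hierarchical(String pkg, Map<String, Set<String>> sourcesByPackage) {
        String current = pkg;
        while (true) {
            Set<String> found = new TreeSet<>();
            for (Map.Entry<String, Set<String>> e : sourcesByPackage.entrySet()) {
                String declared = e.getKey();
                if (declared.equals(current) || declared.startsWith(current + ".")) {
                    found.addAll(e.getValue());
                }
            }
            int dot = current.lastIndexOf('.');
            if (!found.isEmpty() || dot < 0) {
                return found;
            }
            current = current.substring(0, dot); // widen to the parent prefix
        }
    }
}
```

With sources declared in `com.acme.util` and `com.other.db`, looking up the unknown package `com.missing.pkg` returns nothing from `exact`, but `hierarchical` widens all the way to `com` and returns every source in the map.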

bored-art-40741

09/20/2021, 10:32 PM

The new approach is better, and it allows for the sort of "monolithic package" approach you're suggesting if we want to go that route. In fact, it allows for that approach as a user-configurable option, but I haven't implemented that
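For comparison, a sketch of what the unimplemented "monolithic package" option might compute. Everything here is hypothetical (`PackageHairball`, `siblingDeps`, and the `packageOf` map from source file to declared package); it only illustrates the trade-off against type-level inference.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: every source depends on all other sources
// that declare the same package.
final class PackageHairball {
    // packageOf: source file -> declared package (assumed shape).
    static Map<String, Set<String>> siblingDeps(Map<String, String> packageOf) {
        // Group source files by their declared package.
        Map<String, Set<String>> byPackage = new TreeMap<>();
        packageOf.forEach((file, pkg) ->
            byPackage.computeIfAbsent(pkg, k -> new TreeSet<>()).add(file));

        // Each file depends on its siblings, excluding itself.
        Map<String, Set<String>> deps = new TreeMap<>();
        packageOf.forEach((file, pkg) -> {
            Set<String> siblings = new TreeSet<>(byPackage.get(pkg));
            siblings.remove(file);
            deps.put(file, siblings);
        });
        return deps;
    }
}
```

For `A.java` and `B.java` in `org.pantsbuild.example`, this makes each depend on the other, whereas the type-level analysis described earlier only has `A.java` depend on `B.java`. Packages are compared by exact name, so `com.foo` and `com.foo.bar` are never merged.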
