Pants #development

ancient-vegetable-10556

09/23/2021, 6:21 PM

I found this JAR-managing utility in the Pants 1 tree as I was looking for prior art on fat JARs. Can someone help me understand how it fit into Pants 1? (https://github.com/pantsbuild/pants/tree/1.30.x/src/java/org/pantsbuild/tools/jar)

enough-analyst-54434

09/23/2021, 6:29 PM

IIRC this was for creating fat jars (deploy jars / executable jars). The main trick that was important for large binaries was contributed by Eric Ayers IIRC from Square and is here: https://github.com/pantsbuild/pants/blob/e5987065372b1a617fc13ae53b0cab5ef9bbf098/src/java/org/pantsbuild/tools/jar/JarBuilder.java#L1043-L1045

enough-analyst-54434

09/23/2021, 6:31 PM

The other tool that will be important is the jar shader, which we used for shading tool jars. That, though, used a clone / fork of jarjar which ran on an old version of ASM, so it would need a bit of work to resurrect.

enough-analyst-54434

09/23/2021, 6:33 PM

In particular, tools like the junit runner / scalatest runner, etc. This was needed to avoid classpath conflicts with the underlying code being tested, linted etc. Lots of mismatched deps, a common one being Guava. Tool ran on X, code used Y and there was no passing over API boundaries; so no need for the conflict -> thus shading did the trick.

ancient-vegetable-10556

09/23/2021, 6:33 PM

@enough-analyst-54434 the good news is that I’m actually looking into fat JARs at the moment

enough-analyst-54434

09/23/2021, 6:34 PM

OK. Yeah, we probably can / should resurrect the java tools wholesale. I think there were just 3: junit runner, fatjar maker, shader.

ancient-vegetable-10556

09/23/2021, 6:34 PM

I think our aim was to at least get a tool in place in pants 2, and focus on shading when we see the need

enough-analyst-54434

09/23/2021, 6:34 PM

Although, I did see JUnit 5 fly by, so maybe the junit runner is not needed.

ancient-vegetable-10556

09/23/2021, 6:35 PM

yeah, we’re using JUnit 5's console runner at the moment in P2, but we obviously haven’t tested that in the extreme yet

enough-analyst-54434

09/23/2021, 6:35 PM

IIRC that tool just offered parallelism which we should get from the v2 engine just fine.

ancient-vegetable-10556

09/23/2021, 6:36 PM

The Fat JAR maker is definitely something I don’t see a lot of obvious prior art for, but I haven’t been able to find the corresponding hooks into pants itself (mostly due to not knowing how Pants 1 is laid out/github’s code search being useless on code branches)

enough-analyst-54434

09/23/2021, 6:36 PM

Yeah, I suspect the shading need will be about a couple months after people actually start using things. That was how long it took to hit classpath conflicts in the past.

enough-analyst-54434

09/23/2021, 6:37 PM

Well the hooks probably won't be too relevant since it was just a main to run even back then. So the hooks are the CLI args it supports.

enough-analyst-54434

09/23/2021, 6:38 PM

It's out on Maven Central, so you can run it quick and see. Maybe even use as-is?

ancient-vegetable-10556

09/23/2021, 6:38 PM

oh huh

enough-analyst-54434

09/23/2021, 6:38 PM

Pants dogfooded itself to publish its java tools.

enough-analyst-54434

09/23/2021, 6:40 PM

https://search.maven.org/search?q=a:jar-tool

ancient-vegetable-10556

09/23/2021, 6:40 PM

I see `org.pantsbuild.jarjar`

ancient-vegetable-10556

09/23/2021, 6:40 PM

Oh huh!

ancient-vegetable-10556

09/23/2021, 6:40 PM

well that’s handy

enough-analyst-54434

09/23/2021, 6:42 PM

The jarjar will almost certainly not work since it can only handle X bytecode, but worth a spin I guess.

witty-crayon-22786

09/23/2021, 6:42 PM

also, fwiw: i never saw a side-by-side comparison of the performance difference of the optimization in https://github.com/pantsbuild/pants/blob/e5987065372b1a617fc13ae53b0cab5ef9bbf098/src/java/org/pantsbuild/tools/jar/JarBuilder.java#L1043-L1045 … but apparently `zip` has native support for concatenation:

cat input.zip.* > temp.zip
zip -FF temp.zip --out full.zip

…and it would be nice to do something dumb until we know we need the custom code

enough-analyst-54434

09/23/2021, 6:42 PM

Can't remember the version

ancient-vegetable-10556

09/23/2021, 6:43 PM

oh right

ancient-vegetable-10556

09/23/2021, 6:43 PM

that makes sense

witty-crayon-22786

09/23/2021, 6:43 PM

(an issue with the concatenation approach in the medium term is that it wouldn’t support anything more clever than “last file with a particular name wins”… which is generally a fine default)

ancient-vegetable-10556

09/23/2021, 6:44 PM

thanks @witty-crayon-22786! Yeah, concatenation is a thing, as long as there’s no class name conflicts in there

enough-analyst-54434

09/23/2021, 6:44 PM

Although, to be fair, that was deemed not a fine silent default at Twitter, where a lot of work went into logging duplicate warnings, etc.

witty-crayon-22786

09/23/2021, 6:44 PM

well, even then: last item wins is fine in general. it’s nice to be able to warn/error for it though
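Editor's sketch of the "last item wins, but warn" policy discussed above, using only Python's standard `zipfile` module. The function name `merge_last_wins` and the `(label, bytes)` input shape are hypothetical choices for illustration, not anything from Pants or the `jartool`. Unlike the `cat` plus `zip -FF` approach, this sketch re-reads and recompresses every entry, so it shows the policy rather than the performance trick.

```python
import io
import zipfile


def merge_last_wins(jars):
    """Merge (label, zip_bytes) pairs; a later entry overrides an earlier one.

    Returns (merged_zip_bytes, warnings). Hypothetical helper for illustration.
    """
    winners = {}   # entry name -> (label of the archive that won, payload)
    warnings = []
    for label, data in jars:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if info.is_dir():
                    continue
                if info.filename in winners:
                    previous = winners[info.filename][0]
                    warnings.append(f"{info.filename}: {label} overrides {previous}")
                winners[info.filename] = (label, zf.read(info))

    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as merged:
        for name, (_label, payload) in winners.items():
            merged.writestr(name, payload)
    return out.getvalue(), warnings
```

Turning the returned warnings into errors would give the stricter behaviour mentioned for Twitter; whether that check belongs at merge time or at classpath-resolution time is the open question in the thread.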

ancient-vegetable-10556

09/23/2021, 6:45 PM

Right, I suspect that’s what we’ll do until we support shading

enough-analyst-54434

09/23/2021, 6:45 PM

We never supported shading for this purpose, fat jars, just for tools as a whole.

witty-crayon-22786

09/23/2021, 6:45 PM

but warning/erroring during concatenation doesn’t really make the most sense to me, since the conflict is a potential issue anytime you consume that classpath, not just when building a fat jar.

ancient-vegetable-10556

09/23/2021, 6:46 PM

I’ve been looking around the internets to see what the state of fat jar assembly has been, and there’s been a bunch of “make jars that contain jars” suggestions

ancient-vegetable-10556

09/23/2021, 6:46 PM

so seeing that we have an in-house tool seemed handy

enough-analyst-54434

09/23/2021, 6:47 PM

If the tools we wrote then don't make sense now, we were deluded then ... ~roughly. So a few hours at least working the old tools hard seems in order.

witty-crayon-22786

09/23/2021, 6:48 PM

@enough-analyst-54434: yea, worth trying it out probably. lots of stuff lands without benchmarks.

witty-crayon-22786

09/23/2021, 6:49 PM

i’m not sure whether resuming maintaining our own forks of JVM tools is inevitable, but… it would be great not to if we don’t need to.

enough-analyst-54434

09/23/2021, 6:50 PM

We forked 1 JVM tool - jarjar (it actually died). The other 2 were tools that did not exist except buried in other build tools, like maven.

enough-analyst-54434

09/23/2021, 6:50 PM

But agreed - clearly we all agree I'd hope for all tools in all languages.

enough-analyst-54434

09/23/2021, 7:01 PM

This is the most useful place to look at whys: https://github.com/twitter-archive/commons/blame/4f26f742c997c64758d172aa203873b105d13860/src/java/com/twitter/common/jar/tool/JarBuilder.java That reminded me that CONCAT was a thing for service files, which was common enough. So you can't pick a service file, you have to merge them to get all the registered services provided by your N jars for any code using JDK services for plugins.
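Editor's sketch of why service files need merging rather than picking a winner: each `META-INF/services/<interface>` entry lists provider class names one per line, so a fat JAR needs the union of the lines from every input JAR. The helper name `merge_service_files` and the `(label, bytes)` input shape are hypothetical; this is not the `JarBuilder` CONCAT implementation, only an illustration of the idea with Python's standard `zipfile` module.

```python
import io
import zipfile

SERVICES_PREFIX = "META-INF/services/"


def merge_service_files(jars):
    """Union same-named META-INF/services entries across (label, zip_bytes) pairs.

    Returns {entry_name: merged_text}; each provider is kept once, in
    first-seen order. Hypothetical helper for illustration.
    """
    merged = {}
    for _label, data in jars:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for name in zf.namelist():
                if not name.startswith(SERVICES_PREFIX) or name.endswith("/"):
                    continue
                providers = merged.setdefault(name, [])
                for line in zf.read(name).decode("utf-8").splitlines():
                    # Service files allow '#' comments and blank lines; drop both.
                    provider = line.split("#", 1)[0].strip()
                    if provider and provider not in providers:
                        providers.append(provider)
    return {name: "\n".join(lines) + "\n" for name, lines in merged.items()}
```

With a last-file-wins policy, only the providers registered by the final JAR would be visible to code that looks up plugins through JDK services.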

ancient-vegetable-10556

09/23/2021, 7:02 PM

that makes sense

enough-analyst-54434

09/23/2021, 7:02 PM

Here are the perfs from Eric: https://github.com/twitter-archive/commons/commit/3b9d3d01cf883d09e29e2d82a80d2f12add8c83d

enough-analyst-54434

09/23/2021, 7:02 PM

It's massive.

witty-crayon-22786

09/23/2021, 7:13 PM

Wow, yea. Comparing to `zip -FF` would be the most interesting bit... because if the `-FF` pass is mostly copying and just appending a new index, it could be pretty snappy too.

witty-crayon-22786

09/23/2021, 7:13 PM

Although how soon we need CONCAT is also a factor.

fast-nail-55400

09/23/2021, 7:18 PM

I’m catching up. We should punt on jar shading until some months from now. It doesn’t need to be in v1 of fat jar packaging.

fast-nail-55400

09/23/2021, 7:19 PM

the same with some of the other v1 features: custom manifest files etc. except to the extent needed to be able to use the packaged fat jar

bored-art-40741

09/23/2021, 11:39 PM

Hey, we weren't deluded back then! It's not like any of us were crazy enough to write a custom classloader, right?

🎢 1

ancient-vegetable-10556

09/24/2021, 4:07 PM

NARRATOR: …………

ancient-vegetable-10556

09/24/2021, 4:07 PM

https://giphy.com/gifs/G4rIGiMVtrJ1S

ancient-vegetable-10556

09/24/2021, 8:59 PM

OK, I was able to make the `jartool` successfully produce a fat JAR on the command line for a project with nontrivial dependencies

witty-crayon-22786

09/24/2021, 9:01 PM

nice. if you have time to compare to `zip -FF`, that would be handy. because i can imagine how to cobble this together (even CONCAT) purely with unix tools, and it wouldn’t really be that bad

ancient-vegetable-10556

09/24/2021, 9:07 PM

one moment

ancient-vegetable-10556

09/24/2021, 9:15 PM

@witty-crayon-22786 `cat`ting together all of the java files and then running the unix `zip -FF` tool worked just fine. The `jar` that Gradle popped out didn’t include a `main` attribute in the manifest, so the jar wasn’t runnable, but it was possible to specify the fat jar on the classpath and then invoke the `main` by name
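Editor's sketch of how one might check whether a JAR is directly runnable, assuming the "main attribute" above refers to the manifest's `Main-Class` attribute. The helper name `main_class` is hypothetical, and the parsing is deliberately minimal (it only handles the main section of the manifest).

```python
import io
import zipfile


def main_class(jar_bytes):
    """Return the Main-Class attribute of a JAR's manifest, or None if absent.

    Hypothetical helper for illustration.
    """
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as zf:
        try:
            manifest = zf.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # no manifest at all
    # Long manifest values continue on lines starting with one space; unfold them.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        if not line.strip():
            break  # the main section ends at the first blank line
        key, _, value = line.partition(":")
        if key.strip().lower() == "main-class":
            return value.strip()
    return None
```

When this returns `None`, `java -jar fat.jar` has no entry point to launch, but `java -cp fat.jar com.example.Main` still works, where `com.example.Main` stands in for the project's actual main class.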

ancient-vegetable-10556

09/24/2021, 9:17 PM

For real-world use, we’d want to test for cases where there are filename collisions

ancient-vegetable-10556

09/24/2021, 9:18 PM

(I remember having `zip` files that span multiple floppy disks, so re-using this functionality is mildly amusing)

witty-crayon-22786

09/24/2021, 9:18 PM

Benchmarking them side by side might be good, but might be challenging without nailgun

ancient-vegetable-10556

09/24/2021, 9:20 PM

My guess is that `cat; zip` will be faster, if only because it doesn’t recompress: it’s just reading files and outputting a new, correct index at the end of the file

witty-crayon-22786

09/24/2021, 9:20 PM

the point of the `jartool` optimization at the head of this thread was to do the exact same thing, i think

ancient-vegetable-10556

09/24/2021, 9:21 PM

oh right

ancient-vegetable-10556

09/24/2021, 9:27 PM

So do we have a preference absent benchmarking? The existing `jartool` almost certainly has the better per-file behaviour; `zip -FF` has a quite verbose output which doesn’t seem to detect collisions

witty-crayon-22786

09/24/2021, 9:28 PM

um… i think that my hope is that we don’t have to maintain custom JVM code, although as mentioned above, it may be inevitable.

witty-crayon-22786

09/24/2021, 9:30 PM

i could imagine landing a first version that used `zip -FF`, with a note on switching to `jartool` if needed… @fast-nail-55400 likely has the best sense of what we need in a first version.

ancient-vegetable-10556

09/24/2021, 9:31 PM

reasonable

witty-crayon-22786

09/24/2021, 9:32 PM

(but CONCAT support could be done by literally extracting all copies of the colliding file into a new file, and tacking that on in a single-entry zip to the `zip -FF`)

ancient-vegetable-10556

09/24/2021, 9:32 PM

that makes sense

ancient-vegetable-10556

09/24/2021, 9:33 PM

as long as we have the zip indices in advance to know where the clashes are (which we can do in Python!)
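Editor's sketch of finding clashes from the zip indices alone, using Python's standard `zipfile` module. `namelist()` is populated from each archive's central directory, so no entry is decompressed. The helper name `find_clashes` is hypothetical, and taking each archive as in-memory bytes is a simplifying assumption for illustration.

```python
import io
import zipfile
from collections import defaultdict


def find_clashes(jars):
    """Map each entry name present in more than one archive to those archives.

    `jars` is a list of (label, zip_bytes) pairs. Hypothetical helper for
    illustration.
    """
    owners = defaultdict(list)
    for label, data in jars:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            # Names come from the central directory; entry payloads are not read.
            for name in zf.namelist():
                if not name.endswith("/"):  # directory entries always overlap
                    owners[name].append(label)
    return {name: labels for name, labels in owners.items() if len(labels) > 1}
```

The resulting map is what a CONCAT step would need: for each clashing name, the list of archives to extract that entry from.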

witty-crayon-22786

09/24/2021, 9:34 PM

re: Python: sortof… not without slurping the file into memory. but you could do it in the sandbox with a python process. probably better to do it with `unzip $file $innerfile`

ancient-vegetable-10556

09/24/2021, 9:35 PM

Python’s `zipfile` module doesn’t seek to the end of the file like a normal zip utility?

witty-crayon-22786

09/24/2021, 9:36 PM

`@rule`s don’t have access to loose files on disk… because rather than being loose, they’re in a database.

ancient-vegetable-10556

09/24/2021, 9:48 PM

oh sure, and I presume it’s difficult to seek through those files in a random-access fashion?

ancient-vegetable-10556

09/24/2021, 9:48 PM

(zip indices are pretty easy to spot)

witty-crayon-22786

09/24/2021, 9:51 PM

sortof? stuff will be sequential in the DB. it’s more that `@rule`s are not intended to directly access large files. you can load files into memory with