
ancient-vegetable-10556

09/23/2021, 6:21 PM
I found this JAR-managing utility in the Pants 1 tree as I was looking for prior art on fat JARs. Can someone help me understand how it fit into Pants 1? (https://github.com/pantsbuild/pants/tree/1.30.x/src/java/org/pantsbuild/tools/jar)

enough-analyst-54434

09/23/2021, 6:29 PM
IIRC this was for creating fat jars (deploy jars / executable jars). The main trick that was important for large binaries was contributed by Eric Ayers IIRC from Square and is here: https://github.com/pantsbuild/pants/blob/e5987065372b1a617fc13ae53b0cab5ef9bbf098/src/java/org/pantsbuild/tools/jar/JarBuilder.java#L1043-L1045
The other tool that will be important is the jar shader, which we used for shading tool jars. That, though, used a clone / fork of jarjar which ran on an old version of ASM. So that would need a bit of work to resurrect.
In particular, we shaded tools like the junit runner / scalatest runner, etc. This was needed to avoid classpath conflicts with the underlying code being tested, linted etc. Lots of mismatched deps, a common one being Guava. Tool ran on X, code used Y and nothing was passed over API boundaries; so no need for the conflict -> thus shading did the trick.

ancient-vegetable-10556

09/23/2021, 6:33 PM
@enough-analyst-54434 the good news is that I’m actually looking into fat JARs at the moment

enough-analyst-54434

09/23/2021, 6:34 PM
OK. Yeah, we probably can / should resurrect the Java tools wholesale. I think there were just 3: junit runner, fatjar maker, shader.

ancient-vegetable-10556

09/23/2021, 6:34 PM
I think our aim was to at least get a tool in place in pants 2, and focus on shading when we see the need

enough-analyst-54434

09/23/2021, 6:34 PM
Although, I did see JUnit 5 fly by, so maybe the junit runner is not needed.

ancient-vegetable-10556

09/23/2021, 6:35 PM
yeah, we’re using JUnit 5's console runner at the moment in P2, but we obviously haven’t tested that in the extreme yet

enough-analyst-54434

09/23/2021, 6:35 PM
IIRC that tool just offered parallelism which we should get from the v2 engine just fine.

ancient-vegetable-10556

09/23/2021, 6:36 PM
The Fat JAR maker is definitely something I don’t see a lot of obvious prior art for, but I haven’t been able to find the corresponding hooks into pants itself (mostly due to not knowing how Pants 1 is laid out/github’s code search being useless on code branches)

enough-analyst-54434

09/23/2021, 6:36 PM
Yeah, I suspect the shading need will be about a couple months after people actually start using things. That was how long it took to hit classpath conflicts in the past.
Well the hooks probably won't be too relevant since it was just a main to run even back then. So the hooks are the CLI args it supports.
It's out on Maven Central, so you can run it quick and see. Maybe even use it as-is?

ancient-vegetable-10556

09/23/2021, 6:38 PM
oh huh

enough-analyst-54434

09/23/2021, 6:38 PM
Pants dogfooded itself to publish its java tools.

ancient-vegetable-10556

09/23/2021, 6:40 PM
I see `org.pantsbuild.jarjar`
Oh huh!
well that’s handy

enough-analyst-54434

09/23/2021, 6:42 PM
The jarjar will almost certainly not work since it can only handle X bytecode, but worth a spin I guess.

ancient-vegetable-10556

09/23/2021, 6:42 PM
X?

witty-crayon-22786

09/23/2021, 6:42 PM
also, fwiw: i never saw a side-by-side comparison of the performance difference of the optimization in https://github.com/pantsbuild/pants/blob/e5987065372b1a617fc13ae53b0cab5ef9bbf098/src/java/org/pantsbuild/tools/jar/JarBuilder.java#L1043-L1045 … but apparently `zip` has native support for concatenation:
cat input.zip.* > temp.zip
zip -FF temp.zip --out full.zip
…and it would be nice to do something dumb until we know we need the custom code

enough-analyst-54434

09/23/2021, 6:42 PM
Can't remember the version

ancient-vegetable-10556

09/23/2021, 6:43 PM
oh right
that makes sense

witty-crayon-22786

09/23/2021, 6:43 PM
(an issue with the concatenation approach in the medium term is that it wouldn’t support anything more clever than “last file with a particular name wins”… which is generally a fine default)
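
For reference, the “last file with a particular name wins” policy can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `merge_jars_last_wins` is hypothetical and is not part of Pants or `jartool`, and unlike the `cat` plus `zip -FF` approach it decompresses and rewrites every entry, so it says nothing about the performance question above.

```python
import zipfile


def merge_jars_last_wins(jar_paths, out_path):
    """Merge jars into one archive; for duplicate entry names the last jar wins."""
    # First pass: record, for each entry name, the index of the last jar providing it.
    winners = {}
    for index, path in enumerate(jar_paths):
        with zipfile.ZipFile(path) as jar:
            for name in jar.namelist():
                winners[name] = index

    # Second pass: copy each entry only from the jar that won it.
    with zipfile.ZipFile(out_path, "w") as out:
        for index, path in enumerate(jar_paths):
            with zipfile.ZipFile(path) as jar:
                for info in jar.infolist():
                    if winners[info.filename] == index:
                        # Passing the ZipInfo keeps the entry's original
                        # compression method and timestamp.
                        out.writestr(info, jar.read(info))
    return out_path
```

Calling `merge_jars_last_wins(["a.jar", "b.jar"], "fat.jar")` would keep `b.jar`'s copy of any entry present in both inputs.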

ancient-vegetable-10556

09/23/2021, 6:44 PM
thanks @witty-crayon-22786! Yeah, concatenation is a thing, as long as there are no class name conflicts in there

enough-analyst-54434

09/23/2021, 6:44 PM
Although, to be fair, that was deemed not a fine silent default at Twitter, where a lot of work went into logging duplicate warnings, etc.

witty-crayon-22786

09/23/2021, 6:44 PM
well, even then: last item wins is fine in general. it’s nice to be able to warn/error for it though
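
A warning of that kind could be built on a helper like the following Python sketch. It is a sketch under assumptions: `find_duplicate_entries` is a hypothetical name, not something from Pants or `jartool`, and it relies only on the standard `zipfile` module, which lists entry names from the archive's central directory without decompressing entry data.

```python
import zipfile
from collections import defaultdict


def find_duplicate_entries(jar_paths):
    """Return {entry_name: [jars containing it]} for names found in more than one jar."""
    seen = defaultdict(list)
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            # set() guards against a name repeated inside a single jar.
            for name in set(jar.namelist()):
                if not name.endswith("/"):  # directory entries legitimately repeat
                    seen[name].append(path)
    return {name: paths for name, paths in seen.items() if len(paths) > 1}
```

In practice nearly every jar carries `META-INF/MANIFEST.MF`, so a real check would probably need an allow-list for entries that are expected to collide.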

ancient-vegetable-10556

09/23/2021, 6:45 PM
Right, I suspect that’s what we’ll do until we support shading

enough-analyst-54434

09/23/2021, 6:45 PM
We never supported shading for this purpose, fat jars, just for tools as a whole.

witty-crayon-22786

09/23/2021, 6:45 PM
but warning/erroring during concatenation doesn’t really make the most sense to me, since the conflict is a potential issue anytime you consume that classpath, not just when building a fat jar.

ancient-vegetable-10556

09/23/2021, 6:46 PM
I’ve been looking around the internets to see what the state of fat jar assembly has been, and there’s been a bunch of “make jars that contain jars” suggestions
so seeing that we have an in-house tool seemed handy

enough-analyst-54434

09/23/2021, 6:47 PM
If the tools we wrote then don't make sense now, we were deluded then ... ~roughly. So a few hours at least working the old tools hard seems in order.

witty-crayon-22786

09/23/2021, 6:48 PM
@enough-analyst-54434: yea, worth trying it out probably. lots of stuff lands without benchmarks.
i’m not sure whether resuming maintaining our own forks of JVM tools is inevitable, but… it would be great not to if we don’t need to.

enough-analyst-54434

09/23/2021, 6:50 PM
We forked 1 JVM tool - jarjar (it actually died). The other 2 were tools that did not exist except buried in other build tools, like Maven.
But agreed - I'd hope we all agree on that for all tools in all languages.
This is the most useful place to look at whys: https://github.com/twitter-archive/commons/blame/4f26f742c997c64758d172aa203873b105d13860/src/java/com/twitter/common/jar/tool/JarBuilder.java That reminded me that CONCAT was a thing for service files, which was common enough. So you can't pick a service file, you have to merge them to get all the registered services provided by your N jars for any code using JDK services for plugins.
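
To make the service-file point concrete, here is a rough Python sketch of merging `META-INF/services/` entries across jars. It is not how `JarBuilder`'s CONCAT is implemented; `merge_service_files` is a hypothetical helper, and de-duplicating provider lines is an assumption of this sketch rather than something stated in the thread.

```python
import zipfile

SERVICES_PREFIX = "META-INF/services/"


def merge_service_files(jar_paths):
    """Return {service_entry_name: merged_bytes} combining provider lines from all jars."""
    merged = {}
    for path in jar_paths:
        with zipfile.ZipFile(path) as jar:
            for name in jar.namelist():
                if not name.startswith(SERVICES_PREFIX) or name.endswith("/"):
                    continue
                providers = merged.setdefault(name, [])
                for line in jar.read(name).decode("utf-8").splitlines():
                    # Provider files list one class name per line; '#' starts a comment.
                    provider = line.split("#", 1)[0].strip()
                    if provider and provider not in providers:
                        providers.append(provider)
    return {
        name: ("\n".join(providers) + "\n").encode("utf-8")
        for name, providers in merged.items()
    }
```

The merged bytes would then be written into the fat jar in place of whichever single copy a "last one wins" policy would have kept.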

ancient-vegetable-10556

09/23/2021, 7:02 PM
that makes sense

enough-analyst-54434

09/23/2021, 7:02 PM
It's massive.

witty-crayon-22786

09/23/2021, 7:13 PM
Wow, yea. Comparing to `zip -FF` would be the most interesting bit... because if the -FF pass is mostly copying and just appending a new index, it could be pretty snappy too.
Although how soon we need CONCAT is also a factor.

fast-nail-55400

09/23/2021, 7:18 PM
I’m catching up. We should punt on jar shading until some months from now. It doesn’t need to be in v1 of fat jar packaging.
the same with some of the other v1 features: custom manifest files etc. except to the extent needed to be able to use the packaged fat jar

bored-art-40741

09/23/2021, 11:39 PM
Hey, we weren't deluded back then! It's not like any of us were crazy enough to write a custom classloader, right?

ancient-vegetable-10556

09/24/2021, 4:07 PM
NARRATOR: …………
OK, I was able to make the `jartool` successfully produce a fat JAR on the command line for a project with nontrivial dependencies

witty-crayon-22786

09/24/2021, 9:01 PM
nice. if you have time to compare to `zip -FF`, that would be handy. because i can imagine how to cobble this together (even CONCAT) purely with unix tools, and it wouldn’t really be that bad

ancient-vegetable-10556

09/24/2021, 9:07 PM
one moment
@witty-crayon-22786 `cat`ting together all of the JAR files and then running the unix `zip -FF` tool worked just fine. The `jar` that Gradle popped out didn’t include a `main` attribute in the manifest, so the jar wasn’t runnable, but it was possible to specify the fat jar on the classpath and then invoke the `main` by name
For real-world use, we’d want to test for cases where there are filename collisions
(I remember having `zip` files that span multiple floppy disks, so re-using this functionality is mildly amusing)
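
Whether a fat jar is directly runnable depends on the `Main-Class` attribute in `META-INF/MANIFEST.MF`. The Python sketch below shows one way to check for it; `main_class` is a hypothetical helper and the parsing is deliberately simplified (it only reads the manifest's main section and unfolds continuation lines).

```python
import zipfile


def main_class(jar_path):
    """Return the Main-Class from the jar's manifest, or None if absent."""
    with zipfile.ZipFile(jar_path) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:  # the jar has no manifest at all
            return None
    # Normalize line endings, then unfold: a continuation line starts with one space.
    unfolded = raw.replace("\r\n", "\n").replace("\r", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        if not line.strip():
            break  # the main section ends at the first blank line
        key, _, value = line.partition(":")
        if key.strip().lower() == "main-class":  # attribute names are case-insensitive
            return value.strip()
    return None
```

When this returns `None`, `java -jar` will not find an entry point, but putting the jar on the classpath with `java -cp` and naming the main class explicitly still works, which matches the behaviour described above.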

witty-crayon-22786

09/24/2021, 9:18 PM
Benchmarking them side by side might be good, but might be challenging without nailgun

ancient-vegetable-10556

09/24/2021, 9:20 PM
My guess is that `cat; zip` will be faster, if only because it doesn’t recompress: it’s just reading files and outputting a new, correct index at the end of the file

witty-crayon-22786

09/24/2021, 9:20 PM
the point of the `jartool` optimization at the head of this thread was to do the exact same thing, i think

ancient-vegetable-10556

09/24/2021, 9:21 PM
oh right
So do we have a preference absent benchmarking? The existing `jartool` almost certainly has the better per-file behaviour; `zip -FF` has a quite verbose output which doesn’t seem to detect collisions

witty-crayon-22786

09/24/2021, 9:28 PM
um… i think that my hope is that we don’t have to maintain custom JVM code, although as mentioned above, it may be inevitable.
i could imagine landing a first version that used `zip -FF`, with a note on switching to `jartool` if needed… @fast-nail-55400 likely has the best sense of what we need in a first version.

ancient-vegetable-10556

09/24/2021, 9:31 PM
reasonable

witty-crayon-22786

09/24/2021, 9:32 PM
(but CONCAT support could be done by literally extracting all copies of the colliding file into a new file, and tacking that on in a single-entry zip to the `zip -FF`)

ancient-vegetable-10556

09/24/2021, 9:32 PM
that makes sense
as long as we have the zip indices in advance to know where the clashes are (which we can do in Python!)

witty-crayon-22786

09/24/2021, 9:34 PM
re: Python: sortof… not without slurping the file into memory. but you could do it in the sandbox with a python process. probably better to do it with `unzip $file $innerfile`

ancient-vegetable-10556

09/24/2021, 9:35 PM
Python’s `zipfile` module doesn’t seek to the end of the file like a normal zip utility?

witty-crayon-22786

09/24/2021, 9:36 PM
`@rule`s don’t have access to loose files on disk… because rather than being loose, they’re in a database.

ancient-vegetable-10556

09/24/2021, 9:48 PM
oh sure, and I presume it’s difficult to seek through those files in a random-access fashion?
(zip indices are pretty easy to spot)

witty-crayon-22786

09/24/2021, 9:51 PM
sortof? stuff will be sequential in the DB. it’s more that `@rule`s are not intended to directly access large files. you can load files into memory with `DigestContents`, but for large ones you’d want to put them in a sandbox and then run an external process on them instead.

ancient-vegetable-10556

09/24/2021, 9:53 PM
OK sure

witty-crayon-22786

09/24/2021, 9:53 PM
but there is not an `@rule` API for checking a `Digest` out somewhere on disk, for example… only via a `Process`

ancient-vegetable-10556

09/24/2021, 9:54 PM
ok