I'm working on alternative plugin for deploy_jar, ...
# general
m
I'm working on alternative plugin for deploy_jar, where we merge all first party jars into one jar (and strip timestamp, for reproducibility) and leave the third party as is. How can we tell if
ClasspathEntry
is third party or first? I thought about adding a
third_party
flag to
ClasspathEntry
but it seems that a
ClasspathEntry
can actually hold both first party and third party jars. One alternative is to use some kind of regex to recognize coursier jars, but I'm not a big fan of it. Few reasons for the plugin: 1. Merging all jars can take time 2. we want reproducible jars (no timestamps or metadata), coursier is already reproducible, so we only need to strip first party jars. 3. From our experience, merging jars (fat jar) can actually override resources in un-determinstic way. 4. We would later pack everything with Kaniko and only push the image to registry if digest is different.
h
cc @ancient-vegetable-10556
a
Hi @mysterious-waiter-14207! Let me spend a few minutes re-acquainting myself with the code. One complicating factor is that you can merge two classpath entries, yielding a single classpath entry. I’ll need to remind myself of how we use that
OK. Right now, there’s no way to detect first-party vs third-party entries at consumption time (i.e. when you
await Get(Classpath, …)
). I think the best way to fix it would be to add a flag on
ClasspathEntry
that indicates whether each entry came from a compile process or from Coursier. At that point,
ClasspathEntry
should do the right thing —
.args()
already only provides arguments for the entry’s owned files, not the dependencies.
m
Cool, I think I can do it. Should it be called
isThirdParty
?
a
My initial thought was to use an enum, in case we want more options down the line, but a boolean is probably an OK way to start
If other maintainers disagree we can always make it more complicated later
m
cool, I will make the initial PR
@ancient-vegetable-10556 something like this:
Copy code
class ClasspathEntryType(Enum):
    COMPILED = auto()
    THIRD_PARTY = auto()
?
a
That would do nicely!
h
I like this direction. One tiny naming nit: "3rd party" is a term we use ~informally as a synonym for "resolved as a binary dependency", but in this case I think it makes sense to be precise and call this "RESOLVED"? Since it's not the first- vs third-party provenance that matters here, but how the jar was produced, right?
m
sounds good to me
should I add an optional filter to
args
and
digest
methods to take a filter by the type? None is the default would not filter?
a
Hrrrrrm
A better option might be to add a
filter
method to
Classpath
m
cool, that will return another Classpath, right?
a
Right.
At that point, the code in
package_deploy_jar
is 98% of the way to doing what you want
m
Yes, I actually have the code for my plugin ready, at the moment I'm stripping all jars (including third party) and not merging the first party jars.
it is mostly a copy of deploy_jar
a
OK
m
I would send it as a different PR, in case it is beneficial for others.
for the merge method, I'm taking the classpath_entry as parameter, but this assumes no-one will merge RESOLVED with COMPILED.
a
I’m not certain you’d need to change the merge methods in any way
m
I do at the moment, it doesn't compile, because the ClasspathEntry now takes parameter for the type
I will have the PR ready as draft in 10 minutes
a
ok
m
draft here: https://github.com/pantsbuild/pants/pull/16129 still without the filter method. suggestion on how to implement merge would be great.
a
OK, I should have some time to look later today
m
thanks