# general
f
Hey! Nailgun can break hermeticity for Scala builds. My case was a scala2 macro expansion resulting in nondeterministic output bytecode with nailgun, but fine with it disabled. Before diving deeper: given the discontinuation of nailgun, should we consider an alternative?
Here's also the repro for that: https://github.com/jgranstrom/pants-macro-h-repro
# With nailgun

> pants --no-local-cache --no-pantsd package jvm:jar
> mv dist/jvm/jar.jar ng1.jar
> pants --no-local-cache --no-pantsd package jvm:jar
> mv dist/jvm/jar.jar ng2.jar
> pants --no-local-cache --no-pantsd package jvm:jar
> mv dist/jvm/jar.jar ng3.jar

> sha1sum ng1.jar
20d4e1f6dbe4a1b1b1527029e53b58a82f54e81e  ng1.jar
> sha1sum ng2.jar
b66cae631870a207960195f149e47a663e321f7b  ng2.jar
> sha1sum ng3.jar
8ebdcb10f416497afa17feb473ea929ea91a792f  ng3.jar

# Without nailgun

> pants --no-process-execution-local-enable-nailgun --no-local-cache --no-pantsd package jvm:jar
> mv dist/jvm/jar.jar 0ng1.jar
> pants --no-process-execution-local-enable-nailgun --no-local-cache --no-pantsd package jvm:jar
> mv dist/jvm/jar.jar 0ng2.jar
> pants --no-process-execution-local-enable-nailgun --no-local-cache --no-pantsd package jvm:jar
> mv dist/jvm/jar.jar 0ng3.jar

> sha1sum 0ng1.jar
d482f0881e1a2d4d11614310c057175eec42e3a9  0ng1.jar
> sha1sum 0ng2.jar
d482f0881e1a2d4d11614310c057175eec42e3a9  0ng2.jar
> sha1sum 0ng3.jar
d482f0881e1a2d4d11614310c057175eec42e3a9  0ng3.jar
Does anyone know Nailgun to be iffy with scala-macros in any way? Not sure why it's doing that, but it is for sure doing it.
h
Hmm, I haven't been following the archiving of nailgun, but that is unfortunate. What are the alternatives? What is Buck 2 using?
f
Hmm, for buck2 probably nothing, and now that I'm actually looking at it, there seem to be no immediate alternatives overall
I think people would rather get remote cache/execution on point and just skip layers like nailgun
h
Yeah, that is one way I suppose
But that doesn't solve the cold JVM problem
Those remote JVMs will all be cold, presumably
f
Yeah not actually sure what they run for that now
Would be worth looking up I guess
The biggest problem for us is that the jars end up with different digests, which causes images to have different digests, which in turn redeploys things that have not changed, etc. I worked around it in CI by just disabling nailgun for packaging and publishing. But it seems like it's an area to dig into a bit more 😄
h
To answer your earlier question, I don't think we'll be replacing nailgun any time soon, so it may be worth digging deeper to see why Scala macros are impacting things
To clarify, which nailgunned process do you suspect is the issue? the scalac run, presumably?
f
Yeah I've been digging into it quite a bit. Everything is compiled in the same order, and classpaths are the same. I have looked at the bytecode, and it differs in the order of macro declarations. I'm not sure why nailgun is causing it, but it's definitely at the compile phase and it's 100% because of scala2 macros, in my case at least. The repro I sent is quite a minimal example of a larger macro ending up in that state, where the actual bytecode is inconsistent. So I'm wondering if nailgun is doing something that makes the compilation order inconsistent, maybe because of some cache or something. Just doing scalac without nailgun is fine, so it shouldn't really be an issue
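For anyone who wants to narrow down which entries inside two of those jars actually differ, here is a rough sketch. It is not part of the repro; the helper names `jar_entry_digests` and `diff_jars` are made up for illustration. It assumes the jars are plain zip archives and only compares uncompressed entry contents, so it ignores entry order and timestamps.

```python
import hashlib
import zipfile


def jar_entry_digests(jar_path):
    """Map each entry name in the jar to the SHA-1 of its uncompressed bytes."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            info.filename: hashlib.sha1(jar.read(info)).hexdigest()
            for info in jar.infolist()
        }


def diff_jars(left_path, right_path):
    """Return entries that exist on only one side or whose contents differ."""
    left = jar_entry_digests(left_path)
    right = jar_entry_digests(right_path)
    return {
        "only_left": sorted(left.keys() - right.keys()),
        "only_right": sorted(right.keys() - left.keys()),
        # Same entry name on both sides, but different bytes.
        "changed": sorted(
            name for name in left.keys() & right.keys() if left[name] != right[name]
        ),
    }
```

Something like `diff_jars("ng1.jar", "ng2.jar")` should then list the class files whose bytes changed between two nailgun builds. If `changed` comes back empty while the jar checksums still differ, the difference is probably in the archive metadata or entry order rather than in the bytecode.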
h
What happens if you reduce concurrency down to 1 so that only one nailgun can run at a time?
wondering if this also requires a race condition
f
Yeah I think that is a factor, I created the A-F.scala files specifically to exercise that, and it seemed like that is involved for sure. 1-2 files were fine when I tried it, but like 5-10 files and it's always an issue
> What happens if you reduce concurrency down to 1 so that only one nailgun can run at a time?
I tried this for a bit, but honestly I wasn't sure how to verify it's actually doing what it's supposed to do, and the parameters I found that seemed interesting had a minimum of 2 🤷‍♂️
And just following the debug logs, it actually compiles each file sequentially either way, and the final compile of the top-level file has everything equal when checking the pants sandbox, classpath order etc. The only thing seemingly breaking the digest is nailgun, or maybe there's a bug somewhere where nailgun output is consumed 🤷‍♂️