h

high-magician-46188

03/29/2023, 9:09 AM
Hi, I have a Python test that uses Spark, which depends on having a JVM. Is there a way to set a dependency on a JVM with a specific version? (for instance, `openjdk 11`)
🧵
r

refined-addition-53644

03/29/2023, 9:23 AM
h

high-magician-46188

03/29/2023, 9:43 AM
Thanks, will give it a go.
I need it to be able to install a JVM, cache it, and package it when sending it for remote-execution. Will these be possible with this feature?
That is, I want it to have a similar handling to other Python dependencies.
r

refined-addition-53644

03/29/2023, 9:47 AM
I haven’t used it myself. I mostly saw this being recommended here for similar scenarios. This is still in beta.
b

bitter-ability-32190

03/29/2023, 11:52 AM
Yeah, if everything gets structured correctly, the output of your tool will be cached. And we're cranking up the performance of that caching in 2.17.x
h

high-magician-46188

03/29/2023, 12:06 PM
👍 Thanks for confirming.
b

bitter-ability-32190

03/29/2023, 1:32 PM
Don't take our word for it though, try it out! We have Spark code in our codebase and right now I just turn a blind eye, so I'd love to hear how this works for you
h

high-magician-46188

03/29/2023, 2:19 PM
Ack. Will probably take a few days before I have a result though.
b

bitter-ability-32190

03/29/2023, 2:19 PM
This has been silently broken in our repo for months. Take your time 😉
h

happy-kitchen-89482

03/29/2023, 5:06 PM
Someone with more JVM knowledge than me should chime in (@witty-crayon-22786, @ancient-vegetable-10556) but I don't think adhoc_tool is pertinent. I'm pretty sure the JVM backends already support this?
a

ancient-vegetable-10556

03/29/2023, 5:06 PM
Let me take a loooksie
There isn’t a very good way to make JVMs present in Python tests. This is a good use case for making runnable dependencies a universal thing
h

happy-kitchen-89482

03/29/2023, 5:08 PM
Oh, NM, I missed that this was a python test, my bad
a

ancient-vegetable-10556

03/29/2023, 5:10 PM
The best thing you could do is write an `experimental_test_shell_command` that executes the `pytest` runner and the relevant JVM dependencies as a `runnable_dependency`, but it’s not a workflow I have an example of off the top of my head, and it’s likely to be exceeeeeeeeeedingly awkward.
h

high-magician-46188

03/29/2023, 5:35 PM
I see. Well, I'll think about it for a bit. I imagine maybe making a new `jvm` target that the `python_tests` and/or `python_sources` could depend on, and thus have a JVM available in the relevant contexts (testing, maybe somewhere else?). Any leads on that? (or maybe save me some time by telling me it's not such a great path to go through? 😛)
a

ancient-vegetable-10556

03/29/2023, 5:42 PM
The infrastructure is all there in the form of `runnable_dependencies`, in `adhoc_tool` and `shell_command`, but it’s limited to working with those targets at the moment. We need to work on a more systematic way to add `runnable_dependencies` to rely on
b

bitter-ability-32190

03/29/2023, 5:44 PM
One odd thing here too is that in order for this to work, `pyspark` needs to find Java. Meaning the target that represents "Java" should carry not just the files on disk, but an associated env var (`JAVA_HOME`) as well. I have other (non-Java) use-cases for having the output of something also carry env vars for similar reasons. So something we need to figure out is how to make a target not just a digest, but a digest and env vars 😅
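To make the `JAVA_HOME` point concrete, here is a rough sketch of how a Python test could check that a JVM is reachable through `JAVA_HOME` and that it is the expected major version before starting Spark. This is not a Pants feature. The helper names (`find_java`, `parse_major_version`, `java_major_version`) are made up for illustration, and the `bin/java` layout assumes a Unix-style JDK install.

```python
import re
import subprocess
from pathlib import Path
from typing import Mapping, Optional


def find_java(env: Mapping[str, str]) -> Optional[Path]:
    """Return the java executable under JAVA_HOME, or None if unset or missing.

    Assumption: a Unix-style JDK layout with the launcher at bin/java.
    """
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return None
    candidate = Path(java_home) / "bin" / "java"
    return candidate if candidate.is_file() else None


def parse_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and older report themselves as 1.x (for example "1.8.0_362").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def java_major_version(java: Path) -> Optional[int]:
    """Run `java -version` and return the major version, or None if unparseable."""
    result = subprocess.run(
        [str(java), "-version"], capture_output=True, text=True, check=False
    )
    # The version banner is normally written to stderr; check both to be safe.
    return parse_major_version(result.stderr + result.stdout)
```

A test module could call `find_java(os.environ)` in a fixture and skip or fail when the result is `None` or when `java_major_version(...)` is not 11. Under Pants, the test sandbox would presumably also need `JAVA_HOME` passed through explicitly (for example via `extra_env_vars`), which is exactly the "digest plus env vars" gap described above.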
a

ancient-vegetable-10556

03/29/2023, 5:45 PM
That’s `runnable_dependencies` 🙂
b

bitter-ability-32190

03/29/2023, 5:45 PM
I don't follow (also sorry @high-magician-46188 we might be derailing a bit)
a

ancient-vegetable-10556

03/29/2023, 5:46 PM
@bitter-ability-32190 moving your tangent to #development