In the repo, we have some tasks where antlr is use...
# general
f
In the repo, we have some tasks where antlr is used to generate Python lexers from the antlr G4 grammar. The process is simply to run the required antlr JAR with the JVM, e.g.
java -Xmx500M -cp "/usr/local/lib/antlr-4.9.3-complete.jar:.:/usr/local/lib/antlr-4.9.3-complete.jar" org.antlr.v4.Tool
At a simple level this is
experimental_shell_command
, am I correct in thinking I can get Pants to "supply" the JVM and the antlr dependency? Adding the jvm backend gives me the
jvm_artifact(...)
to pull antlr. How do I make the "jvm" a tool / dependency of the
experimental_shell_command
or is there a "java run" type target instead?
b
I don't have your answer, but your question is well timed! https://github.com/pantsbuild/pants/issues/17405 is currently on our minds
f
Yea, that’s it :-) I’m mocking up this part right now. It would be good to expand that issue to ensure it’s not Python-centric, but can handle tool dependencies for Java, Go, Python, npm, etc.
It would also be wise to drop (or alias) the experimental_shell name. A suggestion would be shell_run and shell_run_generate, the latter of which expects output.
b
"Experimental" is our delineator for "likely to change, sooooo"
And the shell part is also a misnomer while we're at it 😛
f
hmm, what does a
jvm_artifact
name look like, so that I can add it to the
shell_command(..., dependencies=["whatgoeshere"], ...)
I have my artifact as
jvm_artifact(
    group="org.antlr",
    artifact="antlr-complete",
    version="4.9.3",
)
w
it doesn’t have a guaranteed name until it’s actually been packaged into a particular format
in this case, you’d probably need to define a
deploy_jar
target which depended on the
jvm_artifact
(explicitly), and then have your
experimental_shell_command
depend on the
deploy_jar
then you’d invoke the
deploy_jar
using its name, which is well defined: https://www.pantsbuild.org/docs/reference-deploy_jar#codeoutput_pathcode
java -Xmx500M -jar $my_deploy_jars_name
and yea: as Josh said: “put a JVM in this shell command” is definitely a topic for the linked ticket
f
does pants automatically generate a jvm "default.lock" file? it does not seem to, and i cannot find what the format is.
my pants.toml so far is
[jvm]
jdk = "temurin:1.17"

[jvm.resolves]
jvm-default = "lib/3rdparty/jvm/default.lock"
and now
10:11:07.92 [ERROR] 1 Exception encountered:

  KeyError: 'entries'
which is not a great error message 🙂
w
yes, it does
f
and only came up after i added the path to an empty lock file.
w
rather than adding an empty file, you should use
./pants generate-lockfiles
to generate the file
f
ah
w
and if it is not present, pants should actually print an error suggesting that
b
(yeah we should maybe schema-check dict options)
@witty-crayon-22786 do we have a ticket for that?
w
try removing the file, and see if you get a warning / error?
f
yes it does error when it is not there, but it also did not say what to do about that 🙂
10:14:46.00 [ERROR] 1 Exception encountered:

  Exception: Unmatched glob from The resolve `jvm-default` from `[jvm].resolves`: "lib/3rdparty/jvm/default.lock"

Do the file(s) exist? If so, check if the file(s) are in your `.gitignore` or the global `pants_ignore` option, which may result in Pants not being able to see the file(s) even though they exist on disk. Refer to <https://www.pantsbuild.org/v2.14/docs/troubleshooting#pants-cannot-find-a-file-in-your-project>.
w
ah. yea, tricky.
f
generating worked -
./pants generate-lockfiles
w
i’ll file an issue about that: thanks.
f
my goal right now is to get the antlr jar, original or re-packaged, into the sub-process sandbox of
experimental_shell_command
w
which command did you run to render the “Unmatched glob” error?
it looks like we render a higher level error in some cases, but not the one you encountered
maybe you can add more details here: https://github.com/pantsbuild/pants/issues/17521
f
okay - that was unexpected... the BUILD file:
jvm_artifact(
    name="antlr",
    group="org.antlr",
    artifact="antlr4",
    version="4.11.1",
)

deploy_jar(
   name="pkg",
   main="org.antlr.v4.Tool",
   dependencies=[":antlr"],
)

#
# just a shell command to see what is going on inside
#
experimental_shell_command(
     name="asm_parser",
     command="tree",
     outputs=[ "./" ],
     dependencies=[":pkg"], # ":org.antlrantlr-complete"],
     tools=["tree"]
)
./pants run path/to/generator 
10:22:03.95 [INFO] Completed: Assemble combined JAR file
ANTLR Parser Generator  Version 4.11.1
 -o ___              specify output directory where all output is generated
 -lib ___            specify location of grammars, tokens files
 -atn                generate rule augmented transition network diagrams
 -encoding ___       specify grammar file encoding; e.g., euc-jp
 -message-format ___ specify output style for messages in antlr, gnu, vs2005
 -long-messages      show exception details when available for errors and warnings
 -listener           generate parse tree listener (default)
 -no-listener        don't generate parse tree listener
 -visitor            generate parse tree visitor
 -no-visitor         don't generate parse tree visitor (default)
 -package ___        specify a package/namespace for the generated code
 -depend             generate file dependencies
 -D<option>=value    set/override a grammar-level option
 -Werror             treat warnings as errors
 -XdbgST             launch StringTemplate visualizer on generated code
 -XdbgSTWait         wait for STViz to close before continuing
 -Xforce-atn         use the ATN simulator for all predictions
 -Xlog               dump lots of logging info to antlr-timestamp.log
 -Xexact-output-dir  all output goes into -o dir regardless of paths/package
I assume that, whilst run did run the jar, that was not running the "experimental_shell_command" - but rather the "deploy_jar"
w
yea… to run the
experimental_shell_command
, you’ll want to pass its name:
./pants run path/to/generator:asm_parser
i am… not entirely sure why
pkg
ended up selected there. @bitter-ability-32190 might know better.
b
Is deploy jar runnable? 🤔
w
yes
b
I suspect then this is in a directory named pkg 🧐
w
ah, yea, maybe… i guess i didn’t think about the command maybe being pseudocode
@fresh-continent-76371: is the directory named
pkg
?
pants will default to selecting the target with the same name as the directory, which might explain this
f
no, but (and i recall this)
experimental_shell_command
is not runnable anyway. looks like it just picked the "first" off the list of possibles.
❯ ./pants run  apps/generator:asm_parser
10:33:45.90 [ERROR] 1 Exception encountered:

  NoApplicableTargetsException: No applicable files or targets matched. The `run` goal works with these target types:

  * deploy_jar
  * docker_image
  * experimental_run_shell_command
  * pex_binary
  * python_source
  * python_test

However, you only specified target arguments with these target types:

  * experimental_shell_command

Please specify relevant file and/or target arguments. Run `./pants --filter-target-type=deploy_jar,docker_image,experimental_run_shell_command,pex_binary,python_source,python_test list ::` to find all applicable targets in your project, or run `./pants --filter-target-type=deploy_jar,docker_image,experimental_run_shell_command,pex_binary,python_source,python_test filedeps ::` to find all applicable files.
w
ahh, got it. yea, you’ll probably want to define a
shunit_test
that depends on the
experimental_shell_command
to test its outputs
f
hmm now I am confused 🙂 seems
package
is not for
experimental_shell_command
b
FWIW I use an `experimental_run_shell_command` for easy sandbox shenanigans
w
experimental_shell_command
just produces loose files, which aren’t really packaged in any particular way… if you want to emit files as an
archive
, you could add an
archive
target that depends on your
experimental_shell_command
, and then
./pants package $my-archive
would put them under
$buildroot/dist
or a
shunit_test
for
./pants test
, etc
or a
python_test
, for that matter
h
As an aside, this takes me back! ANTLR support was the very first thing I added to Pants v1, 12 years ago, and is how I got involved in all this in the first place! cc @enough-analyst-54434
e
Heh. Indeed!
I just made the Terence Parr sign at you!
f
🙂
am i correct in understanding that the
experimental_run_command
cannot be "executed" directly from the command line by a goal?
e
I think that's right. For debugging in general though, you can always add
--keep-sandboxes=always
. You should get Pants log lines noting the various sandbox directories left lying around. You can cd into the one you care about (hopefully it is easy enough to figure out which one) and look at
__run.sh
to see what Pants runs and play around with things.
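To make the preserved sandboxes easier to find, a small helper along these lines could list every directory that contains a `__run.sh`. This is only a sketch: the function name is made up for illustration, and it assumes you pass in the parent directory that the Pants log lines point at. The only detail taken from the thread is the `__run.sh` file name.

```python
from pathlib import Path


def find_run_scripts(root: Path) -> list[Path]:
    """Return every __run.sh found anywhere under root, sorted by path."""
    return sorted(root.rglob("__run.sh"))


def sandbox_dirs(root: Path) -> list[Path]:
    """Return the directories under root that hold a __run.sh script."""
    return [script.parent for script in find_run_scripts(root)]
```

Calling `sandbox_dirs(Path("/some/tmp/dir"))` with the directory from the log output gives the candidates to cd into.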
f
This would be a good addition to the documentation: for every target type, have a section that describes under which phase (or none) it is expected to be used.
(phase == goal)
Okay - part way. I was able to use experimental_shell_command to run antlr, but as you will see it is not perfect, and I can see it may be better to write a dedicated AntlrPlugin.
1. It would be good if experimental_shell_command could just dump the files in dist. They are ready to be consumed 🙂
2. Having to use an intermediate "deploy_jar" after jvm_artifact could be improved.
3. I can see that with some massaging I can take the output from experimental_shell_command and feed it to python_distribution, which then means maybe I should just write a codegen antlr plugin.
4. It would be good to be given variables pointing to the dependencies (path or file), by name.
5. Likewise a variable for the "ROOT" of the sandbox. It is brittle to depend on the path (depth) and full paths to the generated dependencies.
jvm_artifact(
    name="antlr-dependency",
    group="org.antlr",
    artifact="antlr4",
    version="4.11.1",
)

deploy_jar(
   name="antlr-complete",
   main="org.antlr.v4.Tool",
   dependencies=[":antlr-dependency"],
)


resource(
    name="my_antlr_grammar",
    source="folder/my_grammar.g4"
)

experimental_shell_command(
     name="asm_parser",
     command="""
         mkdir out && \
         java -jar $(find ../../../../ -name antlr-complete.jar ) \
                -o out/ \
                -visitor \
                -Dlanguage=Python3 folder/my_grammar.g4
         
     """,
     outputs=[ "out/" ],
     dependencies=[":antlr-complete", ":my_antlr_grammar"],
     tools=["tree", "java", "find"],
)


archive(
    name="qps_asm_lexer",
    files=[":asm_parser"],
    format="zip"
)
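As an illustration of the brittleness mentioned in point 5, the `find ../../../../` lookup could be replaced by a search that fails loudly when the jar is missing or ambiguous. The following is a minimal sketch in Python. It assumes a Python interpreter is available to the command, which the thread does not establish, and `find_jar` and `antlr_command` are hypothetical helper names, not Pants APIs.

```python
from pathlib import Path


def find_jar(sandbox_root: Path, jar_name: str) -> Path:
    """Return the single file named jar_name under sandbox_root, or raise."""
    matches = sorted(sandbox_root.rglob(jar_name))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one {jar_name} under {sandbox_root}, "
            f"found {len(matches)}"
        )
    return matches[0]


def antlr_command(jar: Path, grammar: str, out_dir: str = "out") -> list[str]:
    """Build the ANTLR invocation with the same flags as the BUILD file above."""
    return [
        "java", "-jar", str(jar),
        "-o", out_dir,
        "-visitor",
        "-Dlanguage=Python3",
        grammar,
    ]
```

The resulting list can be handed to `subprocess.run(..., check=True)`, so a failed lookup stops the step instead of running `java -jar` with an empty path.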
g
You can have them dumped to dist. All you need to do is enable a codegen backend - for some reason,
pants.backend.docker
is the one I end up using. That gives you
export-codegen
, which does that. I'm using that for my codegen adventures 🙂 I.e., in your case you'd run
pants export-codegen //some/path:asm-parser
and you can skip the archive step if you don't need it.
Unfortunately that still doesn't put them into the workspace proper (dist is ephemeral in my view) but I believe wrapping it all up in
experimental_run_shell_command
will work for that - my plan is to use that to generate goldens for CI, while using the shell command output directly in all build steps.
h
Thanks for this very helpful feedback! We (by which I mean @ancient-vegetable-10556 mostly) are looking at improvements to the shell-command functionality at the moment, so this is super useful.
For example, having a more straightforward ability to push stuff out to
dist
sounds necessary? Could be
export
, or
package
, both of which do that today in their respective areas
a
At the very least, it seems like
experimental_shell_command
and friends would benefit from having a way to export files
at the moment, the “obvious” way to do it is to create an
experimental_run_shell_command
with a command of
cp {chroot}/artifact.txt *PATH_IN_WORKSPACE*
and dependencies on the step that you want to run in the sandbox
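In a BUILD file, that workaround might look roughly like the sketch below. The target name, the source path under `{chroot}`, and the destination directory are all placeholders chosen for illustration. It also assumes that the outputs of the shell command appear in the sandbox under the BUILD file's own directory, which should be checked against a preserved sandbox.

```python
# BUILD file sketch; the name and paths are hypothetical placeholders.
experimental_run_shell_command(
    name="export_asm_parser",
    # {chroot} stands for the sandbox root, as in the message above.
    # Copy the generated files out of the sandbox into the workspace.
    command="cp -r {chroot}/apps/generator/out apps/generator/generated",
    # Depend on the step whose outputs should be exported.
    dependencies=[":asm_parser"],
)
```

Because `experimental_run_shell_command` is listed among the target types accepted by the `run` goal, this would be invoked as `./pants run apps/generator:export_asm_parser`.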
f
so to make that work the best, injecting the
PATH_IN_WORKSPACE
value to the
...shell...
is the need, as this full path varies across CI, and user workstations.
which leads to a question: are there default variables available in the build context at all, e.g. ROOT_OF_REPO?
what I am attempting is to provide the full path of the repo. injecting into the
experimental_shell_command
like
experimental_shell_command(
     ...
     extra_env_vars = [ "BUILDROOT=" + GetBuildRoot(), "BUILD_FILE_PATH=" + GetBuildFilePath() ],
     ...
)
a
Generally speaking, we don’t like to expose the actual filesystem to processes that are managed by Pants, because that makes processes less likely to be cacheable
Instead, you want to explicitly provide the files as
dependencies
, so that they get copied into the sandbox.
f
This looks good and helpful
a
Great! Let me know if you get the opportunity to try it out soon 🙂