For buildable targets like `pex_binary` , are ther...
# general
n
For buildable targets like `pex_binary`, are there any options now or plans in the future to support pre/post-packaging script injection? Example: a project contains some source code that an npm packaged tool (e.g., `gulp`) uses to generate html/css/js assets for a Python webserver that's packaged into a `pex_binary` with the built assets bundled as a `resource`. It would be nice to still be able to do `./pants package webserver` instead of A) `./build-support/prepackage_prepare_script.sh && ./pants package webserver && ./build-support/postpackage_cleanup_script.sh` or B) build assets continuously and check into source control. Presumably the cleanup step can be eliminated if Pants does all the packaging in a tmp folder.
b
I would suggest `experimental_shell_command` https://www.pantsbuild.org/v2.13/docs/reference-experimental_shell_command except it generates `file`s under-the-hood which don't get packaged 😭 It's the main reason I've worked on a proposal to redesign how Pants understands dependencies.
I'll say there's always the (albeit burdensome) escape hatch of plugins helping you do whatever fun things you want. In this case a plugin that is a copy of `experimental_shell_command` but generates resources might be the ticket
n
Would this still support the idiomatic usage of `./pants package,run,test,etc. webserver` or would I still need to run the `experimental_shell_command` first before any of these goals?
b
Pants (more-or-less) knows how to consume a target's dependencies. In this case, if your `shell_generating_resources` target was depended on by your `pex_binary` it'd package the resources (or pull from the cache if it's fresh) before building the PEX
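For readers following along, here is a rough sketch of what that dependency could look like in a BUILD file. The field names (`command`, `tools`, `outputs`, `dependencies`) are assumed from the v2.13 `experimental_shell_command` reference, and the target names, the `gulp build` command, the tool list, and the `dist/` output path are all hypothetical. The two stand-in functions at the top exist only so the snippet runs outside Pants; a real BUILD file would contain just the two target declarations. As noted earlier in the thread, the stock target's outputs are treated as `file`s, so this wiring alone does not get them into the PEX.

```python
# Stand-ins so this sketch runs outside Pants. In a real BUILD file these
# target types are provided by Pants and nothing needs to be defined.
declared = {}


def _target(kind):
    def declare(**fields):
        declared[fields["name"]] = (kind, fields)

    return declare


experimental_shell_command = _target("experimental_shell_command")
pex_binary = _target("pex_binary")

# Hypothetical asset build step: the command, tools, and output directory
# are illustrative assumptions, not taken from the thread.
experimental_shell_command(
    name="assets",
    command="gulp build",
    tools=["node", "npm"],
    outputs=["dist/"],
)

# The binary depends on the shell command target, so Pants would run it
# (or reuse cached outputs) before building the PEX.
pex_binary(
    name="webserver",
    entry_point="webserver.py",
    dependencies=[":assets"],
)
```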
n
Oh, the `experimental_shell_command` can be depended on by `pex_binary` and it would be run first for its side-effects. But this would always be run before `run, package, etc.` I assume? What about cleanup after (or just dump all the built resources into `/var/tmp/...`?)
b
> But this would always be run before `run, package, etc.` I assume?
It'd only be run if the thing you're trying to `package` or `run` depends on the outputs AND the cache isn't valid given the inputs
> What about cleanup after
Pants codegenerates files into the cache, not into your repo. No cleanup needed 😉
But again, you'd have to dip into an in-repo Pants plugin to do this currently. https://www.pantsbuild.org/v2.13/docs/plugins-overview is a start to that 🙂
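The "only run if the cache isn't valid given the inputs" idea can be illustrated with a toy sketch. This is not how the Pants engine is implemented; `ToyCache`, `fingerprint`, and `build_assets` are hypothetical names, and the "build" is just a stand-in function. The point is only that the work is keyed on a hash of its inputs, so unchanged inputs never trigger a rerun and nothing is written into the repo.

```python
import hashlib
from typing import Callable, Dict


def fingerprint(inputs: Dict[str, bytes]) -> str:
    """Hash file names and contents in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(inputs):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(inputs[name])
        digest.update(b"\0")
    return digest.hexdigest()


class ToyCache:
    """Runs a command only when its input fingerprint has not been seen."""

    def __init__(self) -> None:
        self._results: Dict[str, Dict[str, bytes]] = {}
        self.runs = 0

    def run(
        self,
        inputs: Dict[str, bytes],
        command: Callable[[Dict[str, bytes]], Dict[str, bytes]],
    ) -> Dict[str, bytes]:
        key = fingerprint(inputs)
        if key not in self._results:
            self.runs += 1
            self._results[key] = command(inputs)
        return self._results[key]


def build_assets(inputs: Dict[str, bytes]) -> Dict[str, bytes]:
    # Stand-in for a real asset build: upper-case every input file.
    return {name: data.upper() for name, data in inputs.items()}


cache = ToyCache()
sources = {"app.js": b"console.log('hi')"}
cache.run(sources, build_assets)  # first time: runs the command
cache.run(sources, build_assets)  # same inputs: served from the cache
cache.run({"app.js": b"console.log('bye')"}, build_assets)  # changed: reruns
print(cache.runs)  # 2
```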
n
So basically you're saying that currently there is no way to express that `pex_binary` depends on what files `experimental_shell_command` dumps into its `outputs` path, so would need a new plugin?
b
You totally can do that, the dirty detail is the `outputs` are treated like the `files` target, and Pants doesn't include `files` in packaged artifacts (except `archive`) 😞
This dirty detail is the start of my crusade against having both `file` and `resource` 😂
n
>.<;;
b
It's workflows like yours that really drive me to ensure we get this right going forward 🙂
n
Is the only difference between `file` and `resource` to force you to distinguish between which API you use (e.g., `open` vs. `pkgutil.get_data`?)
b
The true difference is quite nuanced and unfortunately doesn't map well to Python 🥴
n
Because to be honest, most of the time I don't bother with `pkgutil` even when using a `resource`, and rather do something like `Path(__file__).resolve().parent` instead.
b
Same 🙂 Precisely why the nuance doesn't map well to Python. If you consider a PEP 441 package, then it makes more sense in Python, but those are rare because nowadays they don't hold up well to several common Python packaging patterns
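To make the PEP 441 point concrete, here is a small self-contained sketch of the difference when a package is imported straight from a zip archive. The package name `demo_assets_pkg` and the file `data.txt` are made up for the demonstration, and this says nothing about how any particular PEX is laid out at runtime. It only shows that `pkgutil.get_data` goes through the import system's loader, while a path derived from `__file__` points inside the archive and is not a real file on disk.

```python
import importlib
import pkgutil
import sys
import tempfile
import zipfile
from pathlib import Path

# Build a throwaway zip containing a package plus a data file.
tmp = Path(tempfile.mkdtemp())
archive = tmp / "bundle.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("demo_assets_pkg/__init__.py", "")
    zf.writestr("demo_assets_pkg/data.txt", "hello from a resource")

# Import the package directly from the archive, zipapp style.
sys.path.insert(0, str(archive))
importlib.invalidate_caches()
import demo_assets_pkg  # noqa: E402

# The loader-based API can read the data from inside the archive.
via_pkgutil = pkgutil.get_data("demo_assets_pkg", "data.txt")

# The filesystem-based approach builds a path that runs through the zip
# file itself, so there is nothing to open at that location.
via_path = Path(demo_assets_pkg.__file__).resolve().parent / "data.txt"

print(via_pkgutil)        # b'hello from a resource'
print(via_path.exists())  # False
```

When the same package sits unpacked on disk, both approaches work, which is presumably why the `Path(__file__)` habit rarely bites in practice.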
n
How does a global option to override file packaging in `pex_binary` sound (basically acknowledging you understand the disclaimer about how file APIs in Python work)? For the plugin route -- without first reading your instruction link -- would we be adding a Python module to say `./admin/build-support/pants/plugins/experimental_shell_command2_rc.py` and then registering it in `pants.toml` or some place? And said Python module, could that be implemented easily w/ inheritance of the existing target, I hope, or require a wholesale copy/paste/edit?
b
If we were to unlock this workflow natively, I'd prefer if we just fixed it the right way by shifting the paradigm (in this case, `file` and `resource` get elided into `asset`, and how it is consumed is shifted to the consuming target, via `file_deps` or `resource_deps`)
> For the plugin route ...
Yeah, that's it in a nutshell. There's a handful of plumbing work to get things stood up, but once all cylinders are firing you'll be blasting away. It's worth mentioning once you get cozy with the plugin API and Pants internals, you can do some really fun things 🙂 At some point we had like 12 plugins in-repo (I've been slowly upstreaming them)
> or require a wholesale copy/paste/edit?
Unfortunately this 😞
n
Are there any examples of those plugins that implement all the pipes left?
b
I'm sure there are, but I don't know where they live 😛 My goggles are too Pants-tinted so I just grok the source code 🌹 🥽
n
Can you point me to where the rules for `ShellCommandTarget` are? I can't find them by searching (https://github.com/pantsbuild/pants/search?q=ShellCommandTarget&type=code). I was able to locate the target class definition, however.
b
If you're this far, I'm impressed 😉
n
Maybe I was just confused because I expected this line to import `ShellCommandTarget`:
`from pants.backend.shell.target_types import (ShellCommandCommandField, ShellCommandLogOutputField, ShellCommandOutputsField, ShellCommandRunWorkdirField, ShellCommandSourcesField, ShellCommandTimeoutField, ShellCommandToolsField)`
But will keep reading
b
in the middle between the rule and the target is a "field set".
So any target that has instances of these fields is considered
n
I'll need to open this in an IDE probably to really get to the root where it returns a File instead of a Resource
b
Also oops these are all links to my fork of Pants 🙈
this line is the only thing needing change in your copy to make it work (other than re-doing all the "new target and rules" plumbing) https://github.com/thejcannon/pants/blob/f7df0ba30a44a308c200de8f6e241cf3db4af209/src/python/pants/backend/shell/shell_command.py#L67
n
Oh, didn't see that configuration. Makes it easy then.
Mb a dumb question, but does Pants need to be installed in any virtual environment or otherwise importable for a first-party code plugin to work? Or will the installed version of Pants deal with that?
b
🪄
The version of Pants running will run with your in-repo plugins as part of its PYTHONPATH so long as you configured it
Which you have to do to install them
n
Ok, so I am creating a new backend package `shellrc` with `shellrc.register` and `shellrc.shell_command_rc` modules. In the latter, I'm defining a new target `ShellCommandRCTarget` and a rule `run_shell_rc_command` that accepts a new `GeneratedResourcesFromShellRCCommandRequest` subclass of `GenerateSourcesRequest`. Besides those changes, can I just import (reuse) the dependencies from `pants.backend.shell` or do I need to duplicate those too?
b
My mental stack overflowed. Try It And Find Out
n
I guess my main question is: Can I reuse classes defined in the existing shell backend in my new backend so I only have to copy over the absolute minimum to the new backend? E.g., can my new `myplugins.shellrc.__init__.ShellCommandRCTarget` re-use `pants.backend.shell.target_types.ShellCommandSourcesField` in its `core_fields` attribute?
b
In general, yeah go hog wild. In this case I think so, but don't hold me to that 😛
n
So I was able to get it to work. I verified `experimental_shell_command` outputs do not get copied into the pex, but that enabling my new `plugins.shellrc` backend and using the new `shell_command_rc` target does! (Although somewhat confusingly when the new backend is enabled, `experimental_shell_command` is inheriting the new behavior -- I am sure this has to do w/ me reusing components of the shipped backend and masking behavior). Two questions:
1. I am getting bizarre `npm` errors trying to find `node_modules` in a place it does not exist - it looks like it has something to do w/ the environment sandboxing -- can you point me to where I can just eliminate any environment scrubbing/tool symlinking? I just want to inherit the user's shell when running the script (for now) -- I think this is what `experimental_run_shell_command` does?
2. The outputs directory is the full path relative to the `pants.toml` root, which does not work well with the stripped source root of the Python code that depends on the generated resources. So the generated assets might get put in `./proj/app/myproject/scripts/build-support/assets`, but the code that depends on them is in `./myproject/webserver.py` because `./proj/app/myproject/src` is a source root containing package `myproject`. Is there any relocation mechanism available so something like `f"{Path(__file__).resolve().parents[n]}/proj/app/myproject/scripts/build-support/assets"` isn't needed? I would want the files copied to `./proj/app/myproject/src/myproject/assets` so we'd use `f"{Path(__file__).resolve().parent}/assets"`
Was able to deal w/ #1. I think #2 is going to be hard though.
b
Lol for #2, just make a `relocated_resources` target plugin? Lolol so sorry for the mess (look up `relocated_files`)
n
Can you point me to where the engine handles a dependency on a relocated file? I can only find the target definitions in `pants.core.target_types`, but can't find anything about operations on them by searching the repo for the target or any of its fields/aliases. At this point I have added a RelocatedResource version of RelocatedFile target/fields to the new backend, but not clear how that would automatically result in the engine relocating resources.
b
so `relocated_files` is just a file codegenerator which generates the files at a different location. Don't forget the `rule`s operate on fields, so grepping for `RelocatedFilesSourcesField` leads to `RelocateFilesViaCodegenRequest` leads to `relocate_files`
n
Oh, so that private field is actually needed then.
b
private field?
n
`RelocatedFilesSourcesField` (`alias="_sources"`)
b
Oh yeah, another dirty detail is a lot of things operate on "sources" but here there isn't one. A leading underscore means the field won't be documented (because it shouldn't be)
n
And then for whatever reason I just ignored the codegen rule
Ok, done, works. Curious about one thing though. The relocation of resources seems to completely bypass the source roots stripping at the `src` stage (so I have to strip a prefix relative to the SCM root) and not at the `dest` stage (i.e., I have to replace the stripped prefix with something that takes into account source root stripping). I expect that is what we want and expect, but not sure if I am overriding any fancy relocation logic that applies only to resources.
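The prefix swap described here can be sketched in plain Python, outside of Pants. This is one reading of the behavior reported above, not the actual `relocate_files` rule: `relocate` and `strip_source_root` are hypothetical helpers, `app.js` is a made-up file name, and the directories are the ones mentioned earlier in the thread. The `src` prefix is taken relative to the repo root, and `dest` is chosen so that later source root stripping lands the file next to the `myproject` package.

```python
from pathlib import PurePosixPath


def relocate(path: str, src: str, dest: str) -> str:
    """Swap the `src` prefix of `path` for `dest` (both repo-root relative)."""
    relative = PurePosixPath(path).relative_to(src)  # ValueError if not under src
    return str(PurePosixPath(dest) / relative)


def strip_source_root(path: str, source_root: str) -> str:
    """Drop the source root prefix, as happens when files are packaged."""
    return str(PurePosixPath(path).relative_to(source_root))


generated = "proj/app/myproject/scripts/build-support/assets/app.js"

relocated = relocate(
    generated,
    src="proj/app/myproject/scripts/build-support/assets",
    dest="proj/app/myproject/src/myproject/assets",
)
packaged = strip_source_root(relocated, "proj/app/myproject/src")

print(relocated)  # proj/app/myproject/src/myproject/assets/app.js
print(packaged)   # myproject/assets/app.js
```

With that layout, `Path(__file__).resolve().parent / "assets"` from a module inside `myproject` would point at the relocated files.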
So in conclusion, created a new backend `shellrc` that exposes `relocated_resources` and `shell_command_rc`, allowing you to generate assets at test/package/run/etc. time from a build script. Main benefits are 1) same idiomatic usage of pants goals locally and in CI, 2) in asset-heavy builds, no need to continually update and check into source control or sandwich pants commands between build/cleanup scripts, and 3) it helps provide a glue for builds that are predominately not Python, but for which you still want the illusion of the benefits of using Pants' Python backend (e.g., fatdist a nodejs app as a pex_binary w/ an npm install/setup script) 🙂 That was a fun intro to plugins. Since this is just a very natural extension of the existing `pants.backend.shell` and `pants.core.target_types` modules, might be better to just submit a PR extending those?