I'm looking for a way to run a python script befor...
# general
a
I'm looking for a way to run a Python script before packaging a `pex_binary`. I'd like the script to create a file that would then be packaged as a resource into the PEX. Is writing it as an `experimental_shell_command` that executes `python my_script.py` the way to go? What do I do afterwards to access the file?
We're using HTTPSource for one of our dependencies, but here it's unfortunately not available through http
b
I would absolutely love to say yes. But you've hit the brick wall of Pants assumptions 😞
`experimental_shell_command` does generate files that can be consumed. BUT it literally generates a `files` target type, which cannot be treated like a `resource` and therefore cannot be packaged in a `pex_binary`.
Your only solution today is to write your own in-repo codegen plugin that is almost 1:1 the same as `experimental_shell_command`, but where the `output` classvar in the `GenerateSourcesRequest` is a `resource`. ...in fact I think you can have it be 1 statement long:
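# Import paths assume the Pants 2.x module layout of the time.
from pants.backend.shell.shell_command import GenerateFilesFromShellCommandRequest
from pants.backend.shell.target_types import ShellCommandSourcesField, ShellCommandTarget
from pants.core.target_types import ResourceSourceField
from pants.engine.rules import Get, collect_rules, rule
from pants.engine.target import GeneratedSources, GenerateSourcesRequest
from pants.engine.unions import UnionRule

class GenResourceTarget(ShellCommandTarget):
    alias = "generate_resource"

class GenResourceRequest(GenerateSourcesRequest):
    input = ShellCommandSourcesField
    output = ResourceSourceField  # emit resources instead of files

@rule
async def do_the_thing(request: GenResourceRequest) -> GeneratedSources:
    # Delegate to the existing shell-command codegen rule.
    return await Get(GeneratedSources, GenerateFilesFromShellCommandRequest, request)

def rules():
    return [*collect_rules(), UnionRule(GenerateSourcesRequest, GenResourceRequest)]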
(didn't test the above)
I'm hoping to fix this in the future. It's my next "big" feature I plan to investigate/tackle
❤️ 1
a
Oh I see. Well, thank you for your suggestion. I'll see if it works for our use case. One last question. If I understand correctly, if we just built a docker_image or an archive and wrote our code around it, we should be able to pull this off with what I wrote before right? It's not perfect, as while developing we often run without building images, but it could work for now
b
Ah well if you want the file in the `pex_binary` you're SoL. If you want the file in an `archive` or in a `docker_image` (I think) that's actually doable, since those are `file`-aware
a
Ok, thanks 🙂
b
This line tells me the output of an `experimental_shell_command` WILL be in the Docker build context
Oh I forgot to say. One unfortunate side-effect of Pants here as well: Any `dependencies` for your "generator" target will also be dependencies of your thing-that-depends-on-it 😐
OK once we have https://github.com/pantsbuild/pants/pull/17277 and https://github.com/pantsbuild/pants/discussions/17336 I'll have 2 blog posts to write (one about those + @curved-television-6568’s `pipx`-like tip. Then one about Python GPU support in Pants)
🎉 2
h
I’m looking into `experimental_shell_command`, and @bitter-ability-32190 has long had thoughts about the mess of files vs resources
✅ 1
Sigh, why did we decide that `files` can’t be embedded in a pex? I get that it’s troublesome in theory, but in practice…
b
@witty-crayon-22786 would say that `files` are different from resources. `files` are expected to be loaded from disk and not embedded in packages. `pex_binary` being like a compiled JAR or a native EXE (with embedded bytes)
w
if PEX had support for automatically extracting `files` in the CWD it is running in so that you could use `open` with a relative path, then we could support using `files` with PEX… but it doesn’t right now (and it would be a bit awkward: you’d have to be careful where you ran the PEX not to overwrite stuff in the directory)
meanwhile, `resources` are always in the PYTHONPATH, and are never (assumed to be) relative to the cwd.
to be clear though: how you declare that you want to consume something as `resources` or `files` could/should change over time (i.e. to be consuming-target-side rather than producing-target-side). i’m just fairly sure we do actually need the split.
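To make the split concrete, here is a minimal sketch in plain Python, with no Pants or PEX involved. The package name `demo_pkg` and both file names are made up for illustration. It only shows the general behaviour described above: something on the import path can be loaded from any working directory, while a relative `open` depends on the CWD.

```python
import importlib
import os
import pkgutil
import sys
import tempfile
from pathlib import Path


def build_demo_tree(root: Path) -> None:
    """Create a hypothetical package with an embedded resource, plus a loose file."""
    pkg = root / "src" / "demo_pkg"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "resource.txt").write_text("loaded via import path")
    (root / "loose_file.txt").write_text("loaded via cwd")


def read_resource() -> str:
    # Resolved through the package loader, so the CWD does not matter.
    data = pkgutil.get_data("demo_pkg", "resource.txt")
    assert data is not None
    return data.decode()


def read_file() -> str:
    # Resolved relative to the CWD, so it only works from the right directory.
    return Path("loose_file.txt").read_text()


def demo() -> dict:
    results = {}
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp, tempfile.TemporaryDirectory() as elsewhere:
        root = Path(tmp)
        build_demo_tree(root)
        sys.path.insert(0, str(root / "src"))
        importlib.invalidate_caches()
        try:
            os.chdir(root)
            results["resource_at_root"] = read_resource()
            results["file_at_root"] = read_file()
            os.chdir(elsewhere)  # simulate running from some other directory
            results["resource_elsewhere"] = read_resource()
            try:
                results["file_elsewhere"] = read_file()
            except FileNotFoundError:
                results["file_elsewhere"] = None
        finally:
            # Restore global state before the temp directories are removed.
            os.chdir(original_cwd)
            sys.path.remove(str(root / "src"))
            sys.modules.pop("demo_pkg", None)
    return results


if __name__ == "__main__":
    print(demo())
```

The resource is readable from both directories, while the relative file lookup fails once the process runs somewhere else.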
h
Sure, but users have expectations about this working, and it doesn’t. In practice, if your source root is the repo root, there is no difference, and it looks like Pants is being obstreperous…
b
Yeah I don't disagree with Stu (I don't personally use the delineation, but I can see a universe where it matters 😛 ) but how we have it today is not ideal
w
In practice, if your source root is the repo root, there is no difference, and it looks like Pants is being obstreperous…
that’s not true once you’ve actually deployed a PEX somewhere. say you run it in your home dir… should it extract the `file` into your home dir automatically? should it overwrite whatever is already there?
h
But if the distinction is on the consumer side, that starts to make more sense (e.g., “consume these files as resources relative to this source root”)
If the pex has unzipped itself then the file is potentially available via relative path from a caller’s `__file__`, why would it need to be extracted to your homedir?
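A rough sketch of the `__file__`-relative lookup being described, again in plain Python rather than inside a real PEX. The package name `sibling_pkg` and the file `data.txt` are hypothetical. It assumes the code lives as real files on disk (the "unzipped" case); code imported straight from a zip would not support this pattern.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical module that reads a file sitting next to it.
MODULE_SOURCE = '''
from pathlib import Path

def load_sibling() -> str:
    # Look next to this module, not in the current working directory.
    return (Path(__file__).parent / "data.txt").read_text()
'''


def demo() -> str:
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp, tempfile.TemporaryDirectory() as elsewhere:
        pkg = Path(tmp) / "sibling_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(MODULE_SOURCE)
        (pkg / "data.txt").write_text("found next to the module")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            os.chdir(elsewhere)  # run from an unrelated directory
            module = importlib.import_module("sibling_pkg")
            return module.load_sibling()
        finally:
            # Restore global state before the temp directories are removed.
            os.chdir(original_cwd)
            sys.path.remove(tmp)
            sys.modules.pop("sibling_pkg", None)


if __name__ == "__main__":
    print(demo())
```

The lookup succeeds regardless of the CWD, which is the behaviour the next reply contrasts with CWD-relative `files`.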
w
that is different from “it’s in your source directory / repo root”, and would be different from how `files` work in test running, with archives, etc.: in those cases, the file is relative to the CWD
☝️ 1