Hey all - I have a python binary that shells out ...
# general
l
Hey all - I have a python binary that shells out to another first-party python binary. Say these are in the same directory in files "bin1.py" and "bin2.py". (Sadly, this has to shell out, because the second binary uses a framework - metaflow - that expects a python file defining a "flow" to be run as a process, rather than included as a module.) Is there a way to do this conveniently with some combination of python_sources and pex_binary targets? A couple things I have found that do work: 1. Within bin1.py, call
subprocess.run("pants run bin2.py".split())
. 2. The other thing that (seems to?) work is defining a
pex_binary
that depends on both files, then within bin1.py, call
subprocess.run(f"{os.getenv('PEX')} -m bin2".split())
. (Honestly I think it's already pretty awesome that this seems to work.) The problem with (1) is that it doesn't actually work without setting
PANTS_CONCURRENT=True
in the subprocess' env, which is fine, but slows things down a lot. (It also "feels wrong", since the two binaries should be built together.) I think (2) is pretty nice, except for one caveat: I can't set an entry_point on the pex_binary, which means that if it is not directly in a source root, it (I think?) has to be run like this:
pants run :bin-pex -m project.module.path.bin1
or
./bin-pex.pex -m project.module.path.bin1
. This isn't bad at all, but it is inconvenient enough to make me curious whether there are better options that I haven't considered yet.
c
I couldn't find a way with pants to package a
pex_binary
and include it as a dependency. I thought that
experimental_wrap_as_resources
would help, but that won't package a target. (It would wrap source files or the output of a
shell_command
). You can, however, do option 2️⃣ better: if you use
sys.executable
instead of
os.getenv("PEX")
, you'll get the path to the unzipped and running python interpreter instead of the PEX itself. This means you can specify an entrypoint to the
pex_binary
. I think that makes it a pretty good option (also consider using
shlex.split
instead of
split
, or building an explicit argument list; it'll save debugging at some point).
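To illustrate the shlex.split tip, here is a minimal sketch comparing the two ways of splitting a command string. The command line and its `--tag` argument are made up for the example; the point is only how quoted arguments are handled.

```python
import shlex

# Hypothetical command line with a quoted argument containing a space.
command = 'python -m bin2 run --tag "nightly build"'

# str.split() breaks on every run of whitespace and keeps the quote characters.
naive = command.split()

# shlex.split() follows shell-style quoting, so the quoted value stays one argument.
parsed = shlex.split(command)

print(naive)
print(parsed)
```

Building the list by hand (for example `["python", "-m", "bin2", "run", "--tag", "nightly build"]`) avoids the parsing question entirely, which is usually the safest choice when some arguments come from variables.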
l
I actually tried a variant of that
sys.executable
approach (using
sys.argv[0]
), but it just resolves to
/opt/homebrew/opt/python@3.10/bin/python3.10
(i.e. my system Python), and nothing from the PEX virtual environment can be found.
(thanks for the tip on shlex.split by the way!)
c
oh! I forgot to mention setting the
execution_mode
pex_binary(
    ...
    execution_mode="venv",
)
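A minimal sketch of how the calling side could look once the PEX runs in venv mode, assuming both files are packaged into the same pex_binary. The helper names `sibling_command` and `run_sibling` are made up for illustration, and the module path passed in depends on your source roots.

```python
import subprocess
import sys


def sibling_command(module: str, *args: str) -> list[str]:
    """Build an argv list that runs `module` with the current interpreter."""
    # Assumption: with execution_mode="venv", sys.executable points at the
    # venv's Python, so modules packaged in the same PEX are importable via -m.
    return [sys.executable, "-m", module, *args]


def run_sibling(module: str, *args: str) -> subprocess.CompletedProcess:
    """Run the sibling module as a child process and wait for it."""
    # check=True raises CalledProcessError if the child exits non-zero.
    return subprocess.run(sibling_command(module, *args), check=True)
```

From bin1.py this might be called as `run_sibling("project.module.path.bin2", "run")`, where the trailing `"run"` stands in for whatever arguments the second binary expects.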
l
🙌
That's the ticket! Amazing, thank you so much @careful-address-89803!
c
glad I could help!
m
For anyone who finds this thread, I wrote a macro for approach 1:
def pex_resource(
  name: str = None,
  pex: str = None,
  package: str = None,
  tags: list[str] = None,
  description: str = None,
):
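  if package is None:
    # Package paths must match the target path.
    # Note: full_target_for_address and leaf_name_for_address are helpers
    # that are not shown in this snippet.
    directory = full_target_for_address(pex).split(':')[0]
    package = directory.removeprefix('//').replace('/', '.')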

  output = leaf_name_for_address(pex) + '.pex'

  shell_command(
    name=f'{name}_shell',
    command=f"""
      if [ -e {{chroot}}/{package}/{output} ]; then 
        mv {{chroot}}/{package}/{output} {output}
      else 
        echo "File not found at {{chroot}}/{package}/{output}"
        echo "Generated contents:"
        find {{chroot}}
        exit 1
      fi
    """,
    workdir='.',
    tools=['echo', 'find', 'mv'],
    output_directories=[output],
    execution_dependencies=[pex],
    outputs_match_mode='all',
    tags=tags,
  )
  experimental_wrap_as_resources(
    name=name,
    inputs=[f':{name}_shell'],
    tags=tags,
    description=description,
  )
Here's a python helper for getting the executable:
import pathlib
from importlib import resources


def entrypoint(pants_address: str) -> pathlib.Path:
  """Returns the entrypoint for the pex_resource target."""
  prefix, pex_name = pants_address.split(':')
  package = prefix.removeprefix('//').replace('/', '.')
  pex_path = pathlib.Path(
    str(resources.files(package).joinpath(pex_name + '.pex'))
  ).resolve()

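  # A zipapp-style PEX is a single file that can be executed directly.
  if pex_path.is_file():
    return pex_path
  # Otherwise assume the PEX is a directory and look for its __main__.py.
  pex_path = pex_path / '__main__.py'
  if pex_path.is_file():
    return pex_path
  raise FileNotFoundError(f'PEX file not found for {pants_address}')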
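One way the returned path might be used is sketched below. This is an assumption about the calling side, not part of the helper above: `launch_command` is a hypothetical name, and it relies on a Python interpreter being able to run either a `.pex` file or a PEX directory's `__main__.py` when given the path.

```python
import pathlib
import subprocess
import sys


def launch_command(pex_path: pathlib.Path, *args: str) -> list[str]:
    """Build an argv list that runs the packaged PEX with extra arguments."""
    # Assumption: passing the .pex file (or its __main__.py) to an interpreter
    # boots the PEX, which then sets up its own environment.
    return [sys.executable, str(pex_path), *args]


def launch(pex_path: pathlib.Path, *args: str) -> subprocess.CompletedProcess:
    """Run the PEX as a child process, raising if it exits non-zero."""
    return subprocess.run(launch_command(pex_path, *args), check=True)
```

Here `pex_path` would be whatever the helper returns for the pex_resource address, and the extra arguments are passed through unchanged.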