# general
s
Hey Pants people, what is the best way to run module-level scripts using the correct PEX environment? For example, I’m trying to use Mosaic’s composer tool: to use it you install the mosaicml package and then run
composer training_script.py
In Pants, is there a way to inject a Python environment into a command like this? For example, perhaps I create a shell script that runs this command, and then Pants creates the PEX environment somehow?
b
Are you trying to run this manually (e.g. pants run something) or automatically as part of the build process?
s
Manually, but eventually I’d like to run it as part of a remote job.
b
I think pex_binary supports a few different ways to specify what gets run, and that’ll construct the Python environment (I’m not in a position to dig up the relevant docs page); does that help? In future (pants 2.16), the new adhoc_tool might be just the thing to run it automatically, even without a pex_binary. Either way, you’ll need to add it as a dependency (maybe in its own resolve, if it’s just to use as a tool).
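For context, a rough sketch of what the pex_binary route could look like in a BUILD file, assuming the mosaicml requirement sits in its own ml-tools resolve registered in pants.toml — the target names, resolve name, and paths here are illustrative, not from this thread:
# BUILD — illustrative layout; adjust names/paths to your repo
python_requirement(
    name="mosaicml",
    requirements=["mosaicml"],
    resolve="ml-tools",           # optional: keep the tool's deps separate from your main resolve
)

pex_binary(
    name="composer",
    script="composer",            # the console script that the mosaicml distribution installs
    dependencies=[":mosaicml"],
    resolve="ml-tools",
)
With something like that in place, pants run path/to:composer -- training_script.py would run the tool inside the environment Pants builds (modulo the subprocess issue discussed below).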
s
Hmmm, I do see entry_point, which seems to allow me to call the script entry for an arbitrary package.
It’s not clear to me how to pass in the arguments that I need there.
b
Ah, hm. So it sounds like composer is effectively acting as a wrapper for python, in the sense that it needs to pull in the environment for training_script.py in addition to whatever composer needs? Some possibilities:
• I'd guess one can probably package the whole thing (composer and the script) into a single PEX binary, and then somehow convince it to run the appropriate file...
◦ maybe: https://www.pantsbuild.org/v2.16/docs/reference-pex_binary#codeargscode (in 2.16+), but I don't know how that would refer to something within the PEX? Can it be specified as an importable module rather than a file path?
• Alternatively, can composer run a PEX directly? As in, package training_script.py into a PEX with its deps, and then let composer execute that (again, this might require composer to support executing importable things).
• Building explicit PEXes just to execute them locally isn't great, though, so maybe there's an alternative way.
I'm at the limit of my pants knowledge, though, so someone else might have to jump in.
e
@broad-processor-92400's 1st bullet point above is correct. You can package arbitrary things in a PEX, both 1st and 3rd party, then pick a script or entry_point from either, and seal in any args or env you want to be active / injected when the PEX file is run.
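A hedged sketch of that, assuming Pants 2.16+ for the args and env fields (the target name, file paths, the env var, and the requirement address are all illustrative, not confirmed from the thread):
# BUILD — sketch only; args/env on pex_binary need Pants 2.16+
pex_binary(
    name="run_train",
    script="composer",                 # 3rd-party console script pulled in via the mosaicml requirement
    args=["src/train.py"],             # sealed-in args passed to composer when the PEX runs
    env={"PYTHONUNBUFFERED": "1"},     # sealed-in env vars; value here is just an example
    dependencies=[
        "src/train.py",                # 1st-party training script (and, transitively, its deps)
        "//3rdparty/python:mosaicml",  # however the mosaicml requirement is addressed in your repo
    ],
)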
s
Yeah, the composer script is mainly to help with multiprocessing: it spins up multiple processes for your ML training script to allow multi-GPU training. I was able to create a PEX binary that targets the composer script and run it successfully. However, when I actually pass the training script, it doesn’t seem to propagate the Python environment:
./pants run src:run_train -- src/train.py
for example tells me that it’s missing torchvision, even though I explicitly include that dep in the binary definition. Is the issue that composer creates another process, which is then outside the PEX environment when it forks?
e
Adding the following to your pex_binary target will get you the runtime environment most compatible with a vanilla venv:
execution_mode="venv", venv_site_packages_copies=True
You might try that.
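On a target like the one sketched above, that would look roughly like this (illustrative only; names and paths are assumptions):
pex_binary(
    name="run_train",
    script="composer",
    dependencies=["src/train.py"],
    execution_mode="venv",             # execute from a venv instead of a zipapp
    venv_site_packages_copies=True,    # copy site-packages contents rather than symlinking them
)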
s
Yes! I think that worked, thank you!