Couple questions for pex_binary. (1) If your exec...
# general
n
Couple questions for pex_binary. (1) If your exec requires env setup (due to the nature of the hosts it is expected to run on), is a script entry point the preferred way of achieving this? If so, how do you direct it to run your module afterward (can it acquire the python exec/shebang and assume it’s in a relative root)? It seems none of the various PATH-like options in the PEX docs are for this purpose. What about an equivalent shell “binary” built side-by-side that just invokes the PEX after env setup (could be a better option, as the shell binary can be used only when necessary)? This is analogous to specifying additional env vars for test before testing. (2) When specifying the output_path, is there any templating supported so you can get similarly generic output filenames but just specified to your liking (e.g., without the src. prefix, or perhaps in a flat file structure where the dir name becomes part of the filenames)? (3) When specifying platforms, is there support for manylinux%YYYY%? We are targeting older RHEL6 and RHEL7 platforms (incidentally Pants only runs on RHEL7, so the default 3rd party dependencies it installs with pip are all manylinux2014+ and can’t run on RHEL6. I know we can get fancier with the Python requirements/constraints/dependencies, but that seems a lot harder than just saying the platform is X and hoping the build job pulls the right wheels). Thanks
f
For (1), I generally have my main modules define a `main` method and then put:
if __name__ == "__main__":
    main()
at the bottom of the file. If you use this pattern, then your PEX entry script can simply import that `main` method and call it directly.
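A minimal sketch of that pattern, assuming the setup needed is something Python itself can do before the real work starts. The names `entry_point` and `MYAPP_RESOURCE_ROOT` are made up for illustration; how the wrapper gets wired into `pex_binary` depends on your Pants version.

```python
import os
import sys


def main(argv=None):
    """Application logic; returns a process exit code."""
    args = sys.argv[1:] if argv is None else list(argv)
    root = os.environ.get("MYAPP_RESOURCE_ROOT", "<unset>")
    print("resource root:", root, "args:", args)
    return 0


def entry_point(argv=None):
    """Hypothetical wrapper to use as the PEX entry point instead of main()."""
    # Pure-Python env setup goes here; setdefault keeps any value the host
    # already exported. The variable name and path are placeholders.
    os.environ.setdefault("MYAPP_RESOURCE_ROOT", "/path/to/resources")
    return main(argv)
```

This only helps for settings that are read after the interpreter is already running; anything the interpreter or the dynamic loader needs at startup has to be in place before Python launches.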
For (2), there is no such templating currently, although it would be straightforward enough to add to Pants. (Specifically, would be added in this function https://github.com/pantsbuild/pants/blob/5086a781a7ca5ca81a0661dd9a5840104a316a28/src/python/pants/core/goals/package.py#L60) Could you open an issue at https://github.com/pantsbuild/pants/issues/new/choose and suggest what kind of templating you would like to see?
n
@fast-nail-55400 By env setup I meant certain environment variables need to be defined (like LD_LIBRARY_PATH) and some system modules run before the PEX is executed.
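For variables like LD_LIBRARY_PATH, setting `os.environ` inside an already-running process is typically too late, because the dynamic loader reads the variable at process start. One workaround is to re-exec the interpreter once with the adjusted environment. The sketch below assumes that `sys.executable` plus `sys.argv` reproduces the original invocation, which should be verified for a PEX; the sentinel variable name and library directories are hypothetical.

```python
import os
import sys

_SENTINEL = "MYAPP_ENV_READY"  # hypothetical marker so we re-exec only once


def build_env(base_env, extra_lib_dirs):
    """Return a copy of base_env with extra_lib_dirs prepended to LD_LIBRARY_PATH."""
    env = dict(base_env)
    existing = env.get("LD_LIBRARY_PATH", "")
    parts = list(extra_lib_dirs) + ([existing] if existing else [])
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    env[_SENTINEL] = "1"
    return env


def ensure_env(extra_lib_dirs):
    """Re-exec the current interpreter once with the adjusted environment."""
    if os.environ.get(_SENTINEL) == "1":
        return  # already re-executed; continue with normal startup
    env = build_env(os.environ, extra_lib_dirs)
    # Assumption: this argv reproduces the original command line.
    os.execve(sys.executable, [sys.executable] + sys.argv, env)
```

Calling `ensure_env([...])` first thing in the entry point would cover the loader case, but it still requires that a Python interpreter can be found in the first place.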
f
how/where do you deploy the PEX?
(useful to know in case your deployment environment can point toward a solution)
n
For example, our hosts don’t have common system binaries (such as libfortran) or even Python installed locally; they have to be “imported” from AFS. Also, some internal libs make assumptions we have no control over about which import modules have been run in order to locate resources. Some of these issues can be resolved by defining resource dependencies, overriding the Python shebang, etc., but having an env setup entry point before executing the PEX seems cleaner and more comprehensive in the long term and avoids having to customize each pex target.
e
For the hosts that don't have Python installed locally, PEX is not the answer since it, at base, requires a python interpreter. It sounds like you're looking for a different executable package type altogether?
Perhaps you want to prepend a bash script to a PEX zip?
You can build one of those on the command line with `cat` and it will work. Pants doesn't currently give you a way to build one though.
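A rough sketch of that post-processing step, written in Python rather than `cat` so it can be looped over build outputs. It relies on zip readers locating the archive from the end of the file, so bytes placed in front are tolerated. The AFS paths, the `python3` name, and the function name `prepend_header` are placeholders, not anything Pants or PEX provides.

```python
import os
import stat

# Hypothetical header; replace the paths with whatever the hosts need.
SHELL_HEADER = b"""#!/bin/sh
export PATH="/afs/example/python/bin:$PATH"
export LD_LIBRARY_PATH="/afs/example/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# exec replaces the shell, so it never tries to read the zip bytes below.
exec python3 "$0" "$@"
"""


def prepend_header(pex_path, out_path, header=SHELL_HEADER):
    """Write header followed by the PEX bytes to out_path and mark it executable."""
    with open(pex_path, "rb") as src:
        payload = src.read()
    with open(out_path, "wb") as dst:
        dst.write(header)
        # The PEX's own shebang line stays in the payload; it sits after the
        # exec line, so the shell never reaches it.
        dst.write(payload)
    mode = os.stat(out_path).st_mode
    os.chmod(out_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return out_path
```

The `exec` line matters: without it the shell would keep reading past the header and try to interpret the archive as script text.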
n
@enough-analyst-54434 I think it is still a good solution. Python is “installed” by putting the installation path in PATH. I guess the way to think of it, when a host restarts it should run some setup script to make things like Python accessible. But we can’t make that assumption, so want to run a shell script first.
@enough-analyst-54434 Hmm that is an interesting idea. I suppose after running all dist jobs we can write a custom script that prepends a shell script to each zip identified as a PEX?
e
Yeah, for now that's how you'd have to do it. Pants supports custom rules 1st class though, so you could also make this part of the `package` goal in your repo.
I'm not sure how proof of concepty your current effort is. If it's preliminary, then a custom post processing script is probably the way to go. If you're further along, I'm happy to help with custom rule guidance.
Unfortunately the manylinux configuration is currently global but could easily be made local on `pex_binary` if variance is needed.
f
“I guess the way to think of it, when a host restarts it should run some setup script to make things like Python accessible. But we can’t make that assumption, so want to run a shell script first.”
Are your hosts using a config management solution like chef or puppet?