# general
a
PEX can ship packages and optionally can include a single console script as the entry point. Is there a way to have PEX ship multiple console scripts which can be later extracted using
PEX_TOOLS=1
? We ship a set of dependencies and sometimes they invoke commands with
subprocess
instead of a Python API. So I'm looking for an easy way to ship the console scripts for each distribution as well.
e
Well, if you re-invoke a PEX (the path to which is always available via the
PEX
env var), you can set
PEX_SCRIPT=foo
in the environment and that alternate console script will be used. There is also the more generic busybox support provided by the conscript library, which works with PEX but can also be used on its own: https://pypi.org/project/conscript/
Does either of those solve your problem?
If you were not aware of
PEX_SCRIPT
try
pex --help-variables
.
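A minimal sketch of the re-invocation described above, assuming the calling process was itself started from a PEX (so the PEX env var is set) and that the PEX file is directly executable. The helper names `script_invocation` and `run_script` are hypothetical, not part of Pex.

```python
import os
import subprocess


def script_invocation(script, args=(), environ=None):
    """Return (argv, env) that re-invoke the current PEX with another console script.

    Assumes the PEX env var holds the path of the running PEX file.
    """
    env = dict(os.environ if environ is None else environ)
    pex_path = env["PEX"]  # KeyError here means we are not running from a PEX
    env["PEX_SCRIPT"] = script  # select the alternate console script
    return [pex_path, *args], env


def run_script(script, args=()):
    """Run the named console script out of the current PEX."""
    argv, env = script_invocation(script, args)
    return subprocess.run(argv, env=env, check=True)
```

This only helps when you control the call site, since the caller has to set PEX_SCRIPT itself.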
Well, and, also, you can build
--venv prepend
pexes. On the 1st run the PEX is turned into a normal venv in the PEX_ROOT. That venv will have all console scripts installed in its
bin/
dir and pre-pended to PATH.
That is probably the most natural way. It's also the fastest PEX warm mode. The cold start suffers the cost of building the venv once, of course.
If you want to squeeze ~50ms off of warm
--venv
overhead and your deploy environments include
/bin/sh
, throw in
--sh-boot
. That drops ~50ms wasted re-execing into the pre-built venv via Python by doing it via
sh
which takes ~1ms.
So, lots of tools here. Perhaps too many. Hopefully one of these works for your situation.
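For reference, a rough sketch of how a build script might assemble the flags mentioned above. The function `pex_build_command` and its parameters are hypothetical; only the `pex` flags themselves (`-o`, `--venv prepend`, `--venv append`, `--sh-boot`) come from the Pex CLI.

```python
def pex_build_command(requirements, output, venv_mode=None, sh_boot=False):
    """Build an argv list for `pex`; nothing is executed here."""
    if venv_mode not in (None, "prepend", "append"):
        raise ValueError(f"unsupported venv mode: {venv_mode!r}")
    cmd = ["pex", *requirements, "-o", output]
    if venv_mode is not None:
        # prepend/append controls where the venv bin/ dir lands on PATH
        cmd += ["--venv", venv_mode]
    if sh_boot:
        # boot via /bin/sh instead of Python to trim warm start overhead
        cmd.append("--sh-boot")
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where `pex` is installed.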
a
Ah interesting stuff! We don't control when and what command is invoked so
PEX_SCRIPT
may not work but the other two options look useful, especially the
--venv prepend
- we could just run the pex once after deploying and the PATH will provide the scripts. Will try it out later.
e
If you want control over the deploy / cold start you can always build the PEX with
--pex-tools
and then run
PEX_TOOLS=1 the.pex venv [many options] right/here
That will create a venv from the PEX
right/here
and there will be a `right/here/pex` script you can use to run the venv just like the originating PEX would have run - but, of course, faster.
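The same step could be driven from a deploy script roughly like this. It is a sketch under assumptions: the PEX was built with its tools included, the PEX path is executable and contains a directory component (or is absolute), and the helper names are made up for illustration. The many optional `venv` flags are left out.

```python
import os
import subprocess
from pathlib import Path


def venv_tool_invocation(pex_file, dest_dir, environ=None):
    """Return (argv, env) for `PEX_TOOLS=1 <pex> venv <dest_dir>`."""
    env = dict(os.environ if environ is None else environ)
    env["PEX_TOOLS"] = "1"  # switch the PEX into tools mode
    return [str(pex_file), "venv", str(dest_dir)], env


def create_venv(pex_file, dest_dir):
    """Create the venv and return the path of its `pex` launcher script."""
    argv, env = venv_tool_invocation(pex_file, dest_dir)
    subprocess.run(argv, env=env, check=True)
    return Path(dest_dir) / "pex"
```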
a
I did a quick experiment and looks like
--venv=prepend
will "just work". Our code is split across two pex files. We run `source.pex` and specify the
deps.pex
using
PEX_PATH
. The console scripts are provided by
deps.pex
. If I build
source.pex
with
--venv=prepend
, it looks like all scripts from both pexes are put in the venv's bin directory, which is added to the PATH.
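A small sketch of that two-PEX setup from the caller's side, assuming `source.pex` and `deps.pex` are the file names used above. The helper `env_with_pex_path` is hypothetical.

```python
import os


def env_with_pex_path(extra_pexes, environ=None):
    """Return a copy of the environment with PEX_PATH pointing at extra PEX files."""
    env = dict(os.environ if environ is None else environ)
    # PEX_PATH is a path-separator-delimited list of additional PEX files
    env["PEX_PATH"] = os.pathsep.join(str(p) for p in extra_pexes)
    return env


# Usage (not executed here):
#   subprocess.run(["./source.pex"], env=env_with_pex_path(["deps.pex"]), check=True)
```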
e
Aha, excellent.
a
We have not enabled
--venv=prepend
and I just want to make sure I understand all the implications. At runtime we invoke the pex with
subprocess
. It appears to unpack the dependencies into
~/.pex/unzipped_pexes/<hash>
. If we switch to doing
--venv=prepend
builds, I believe the pex can still be run the same way but will unpack the deps into
~/.pex/venvs/<hash>/.../site-packages
and also produce surrounding folders such as
bin
etc. We never directly look at
~/.pex
ever so I'm assuming this will require no change at the run site and only a build flag change. Is this correct?
e
So,
--venv
,
--venv prepend
and
--venv append
all create
~/.pex/venvs/...
and run from those. As part of the process of creating a venv on the 1st ever run, a
~/.pex/unzipped_pexes/...
entry happens to get made, but is never used again. The latter part is correct,
--venv *
nets you a
bin/
that includes console scripts, they just won't be on the
PATH
unless you use
{append,prepend}
. In that not on the
PATH
case, you'd need something like:
subprocess.run(args=[PurePath(sys.executable).parent / "console_script_name", ...], ...)
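Spelled out a little more, that one-liner might look like the following. It assumes the code runs inside the venv, so `sys.executable` sits in the venv's `bin/` dir next to the console scripts. `venv_script` and `run_venv_script` are illustrative names only.

```python
import subprocess
import sys
from pathlib import Path


def venv_script(name):
    """Path of a console script that lives next to the running interpreter."""
    return Path(sys.executable).parent / name


def run_venv_script(name, *args):
    """Run a sibling console script without relying on PATH."""
    return subprocess.run([str(venv_script(name)), *args], check=True)
```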
If the help docs for
--venv
don't make that clear (I just re-read them and of course they seem clear to the author!), then I'd welcome a PR fixing those up. At some point here in the new year my stack will clear and I'll be circling back to documenting the `pex3 lock` tools and everything else in prep to cut Pex 3.x. The readthedocs is very old and ~not updated since the Pex 1.x days before I took on maintainership.
@abundant-autumn-67998 I can't recall if I've pointed out scies before. Also relatively soon PEXes will gain the option of including the Python interpreter, or including it lazily. You may have locked down target machines, in which case you don't need this and plain old PEXes are fine, but in case not.
a
The
--venv
docs are clear and it will give us the feature we want (console scripts). I was just trying to see if turning it on may break us in some unexpected way because of some assumption we have that only exists during runtime in the non-venv case. I think it's a safe change though and will try it out.
Ah
scie-jump
looks very interesting. (Though I don't think we need it in our current setup.)
e
Gotcha. No, to the contrary,
--venv
(+
--venv-site-packages-copies
) gives you the closest thing to a fully normal hand made venv and should be the most compatible with the most code. The normal
--venv
symlinks 3rdparty dists from the
~/.pex/installed_wheels/...
dirs into the venv `site-packages/` dir and this can confuse a few distributions out there in the wild; so you may need to turn on copies, which of course just wastes space that might otherwise be saved by re-using via symlink.
The copies setting tries to hardlink; so it's not bad, but falls back to copies if you somehow have a mount point in the middle of the
~/.pex
tree.
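The hardlink-then-copy fallback is a common general pattern. The sketch below illustrates the idea only; it is not Pex's actual implementation, and `link_or_copy` is a made-up name.

```python
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink src to dst, falling back to a real copy if linking fails.

    Hardlinks cannot cross filesystems, so a mount point between src and
    dst makes os.link raise OSError and triggers the copy.
    """
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        shutil.copy2(src, dst)
        return "copy"
```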
a
most of our testing so far has been with pex files, but without '--venv' - so that's our tested+working state.
e
The hardlink trees are still bigger than symlink trees.
Ok - gotcha. Yeah, you're more likely to have already hit issues than you will be using --venv.
The only reasons --venv is not the default are:
+ historical: it was only born in the last few years and I don't want to break existing users by changing defaults.
+ slower cold start.
Much faster warm though!
And I think cold starts average 2x slower than vanilla PEX.
a
i think links will be fine, we don't peek into or manage the pex unpacked dirs at all. i wonder if there will be an issue if the entry point or number of processes changes - i don't think we depend on that either but would be good to know.
e
if the entry point or number of processes changes
I don't understand what that means.
a
Yeah, you're more likely to have already hit issues than you will be using --venv
indeed - the console scripts was one issue
like if the first script executed is different, or if there's an extra level of indirection in first process -> subprocess -> subprocess
e
Ah, yeah, so the breakdown:
+ PEX zip: 1 re-exec into `~/.pex/unzipped_pexes` on every run.
+ PEX --venv: 1 re-exec into `~/.pex/unzipped_pexes` on the 1st run to build the venv, then 1 re-exec into `~/.pex/venvs/...`. On 2+ runs, just 1 re-exec straight into `~/.pex/venvs/...` unless it has been nuked.
a
ah excellent - yeah that should be fine.
btw one thing i'm still trying to figure out is better stack traces without the long directory names. i think the right solution there is to manage the venv directory ourselves. if we unpack it with
PEX_TOOLS
into a short venv directory, the stack traces will be nicer.
e
Sure, yes.
And by building yourself you save 50ms re-exec overhead into the venv.
You can gain ~49ms back with
--sh-boot
but still ugly paths.
a
ah ok - because we can call the
venv/.../pex
directly?
e
Correct. If you can do that; that's what you want.
The only reason you want to run the PEX file directly is for scp deploys with no other action needed.
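One way a call site might prefer the pre-built venv and still fall back to the PEX file is sketched below. It assumes the venv was created with the PEX tools so that a `pex` launcher script sits at the top of the venv directory. `resolve_launcher` is a hypothetical helper.

```python
from pathlib import Path


def resolve_launcher(venv_dir, pex_file):
    """Prefer the venv's `pex` script; fall back to the PEX file itself."""
    script = Path(venv_dir) / "pex"
    return script if script.is_file() else Path(pex_file)
```

The returned path can be used as the first element of the argv passed to `subprocess.run`.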
a
it also made it easier by not having to manage the directories and their atomic, consistent creation.
(not super hard but one less thing to worry about)
e
And on deploys. IIUC you use Pex over Docker for faster iteration / deployment. Were you aware of
--layout packed
? Do you use that layout + rsync?
a
we use pex zip because we need a repeatable snapshot and having a file is simple - we put it in s3. then any time we can run the file from s3. we put the pex hash in the file name.
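A sketch of one way to put a hash in the file name. Which hash is used is not stated above, so this assumes a truncated SHA-256 of the file contents. The function name and the 12-character length are arbitrary choices for illustration.

```python
import hashlib
from pathlib import Path


def hashed_name(pex_file, digest_chars=12):
    """Return e.g. 'app-<hash>.pex' for a file named 'app.pex'."""
    path = Path(pex_file)
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # read in 1 MiB chunks so large PEX files are not loaded at once
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return f"{path.stem}-{digest.hexdigest()[:digest_chars]}{path.suffix}"
```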
e
Ok.
a
basically instead of just the docker image tag, we now have docker image + pex tag.
e
We had to introduce packed for Pants since shipping a whole PEX wasted space and time for remote (and local) caching in really observable ways.
Pants is probably an odd case. It can generate 100s to 1000s of PEXes very quickly.
And the overheads become clear quickly since it's user-interactive.
a
yeah i looked at different layouts - very useful but not for us (yet)
e
Ok, great. As long as you're actively exploring the CLI help, you're in a good spot. It sounds like you've delved. If you do have questions just continue to speak up. Those good high level docs are coming, just not yet.
a
yeah i looked at pex once first and moved on because i didn't see too much in the docs. then i came back to it later and explored more. there's quite a bit of useful stuff in there.