PEX can ship packages and optionally can include a single co Pants #general

abundant-autumn-67998

12/03/2022, 5:46 PM

PEX can ship packages and can optionally include a single console script as the entry point. Is there a way to have PEX ship multiple console scripts which can later be extracted using `PEX_TOOLS=1`? We ship a set of dependencies and sometimes they invoke commands with `subprocess` instead of a Python API. So I'm looking for an easy way to ship the console scripts for each distribution as well.

enough-analyst-54434

12/03/2022, 5:57 PM

Well, if you re-invoke a PEX (the path to which is always available via the `PEX` env var), you can set `PEX_SCRIPT=foo` in the environment and that alternate console script will be used. There is also the more generic busybox support provided by the conscript library; it works with PEX but you can use it generically: https://pypi.org/project/conscript/
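A minimal sketch of the `PEX_SCRIPT` approach, assuming the calling code runs in a process that was started from a PEX file, so the `PEX` env var is set as described above. The helper name `pex_script_invocation` and the script name `foo` are hypothetical; the sketch only builds the command line and environment.

```python
import os
import sys


def pex_script_invocation(script, args=(), environ=None):
    """Return (argv, env) that re-invokes the current PEX with another console script.

    Assumption: the PEX env var holds the path of the PEX file this process
    was started from.
    """
    base_env = dict(os.environ if environ is None else environ)
    pex_file = base_env.get("PEX")
    if not pex_file:
        raise RuntimeError("PEX is not set; was this process started from a PEX file?")
    # PEX_SCRIPT selects an alternate console script for the re-invoked PEX.
    env = dict(base_env, PEX_SCRIPT=script)
    # Run the PEX file through the current interpreter so it need not be executable.
    return [sys.executable, pex_file, *args], env
```

The returned pair can be handed to `subprocess.run(argv, env=env, check=True)`.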

enough-analyst-54434

12/03/2022, 5:58 PM

Do either of those two solve your problem?

enough-analyst-54434

12/03/2022, 5:58 PM

If you were not aware of `PEX_SCRIPT`, try `pex --help-variables`.

enough-analyst-54434

12/03/2022, 6:00 PM

Well, and, also, you can build `--venv prepend` pexes. On 1st run the PEX is turned into a normal venv in the PEX_ROOT. That venv will have all console scripts installed in its `bin/` dir, and that dir is pre-pended to PATH.

enough-analyst-54434

12/03/2022, 6:00 PM

That is probably the most natural way. It's also the fastest PEX warm mode. The cold start suffers the cost of building the venv once, of course.
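As a rough illustration of what `--venv prepend` buys you: code running inside the PEX process can find a console script by bare name, because the venv `bin/` dir is on `PATH`. The helper name `find_console_script` is hypothetical, and the sketch is plain standard library, not a Pex API.

```python
import shutil


def find_console_script(name, path=None):
    """Resolve a console script by bare name using PATH (or an explicit search path)."""
    exe = shutil.which(name, path=path)
    if exe is None:
        raise FileNotFoundError(f"console script {name!r} was not found on PATH")
    return exe
```

A dependency that shells out with `subprocess.run(["some-script", ...])` goes through the same `PATH` lookup, so it should need no change.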

enough-analyst-54434

12/03/2022, 6:02 PM

If you want to squeeze ~50ms off of warm `--venv` overhead and your deploy environments include `/bin/sh`, throw in `--sh-boot`. That drops ~50ms wasted re-execing into the pre-built venv via Python by doing it via `sh`, which takes ~1ms.

enough-analyst-54434

12/03/2022, 6:03 PM

So, lots of tools here. Perhaps too many. Hopefully one of these works for your situation.

abundant-autumn-67998

12/03/2022, 6:48 PM

Ah, interesting stuff! We don't control when and what command is invoked, so `PEX_SCRIPT` may not work, but the other two options look useful, especially `--venv prepend`: we could just run the pex once after deploying and the PATH will provide the scripts. Will try it out later.

enough-analyst-54434

12/03/2022, 7:08 PM

If you want control over the deploy / cold start you can always build the PEX with `--include-tools` and then run `PEX_TOOLS=1 the.pex venv [many options] right/here`. That will create a venv from the PEX in `right/here` and there will be a `right/here/pex` script you can use to run the venv just like the originating PEX would have run, but, of course, faster.
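A small sketch of scripting that deploy step from Python, assuming the PEX was built with tools included. The file name `the.pex` and target dir `right/here` come from the message above; the helper names are hypothetical and the optional `venv` flags are left out.

```python
import os


def pex_tools_venv_invocation(pex_file, venv_dir, environ=None):
    """Return (argv, env) equivalent to `PEX_TOOLS=1 <pex_file> venv <venv_dir>`."""
    env = dict(os.environ if environ is None else environ)
    env["PEX_TOOLS"] = "1"  # switches the PEX into tools mode
    return [pex_file, "venv", venv_dir], env


def venv_pex_script(venv_dir):
    """Path of the `pex` script created at the root of the venv."""
    return os.path.join(venv_dir, "pex")
```

After `subprocess.run(argv, env=env, check=True)` succeeds once, later runs can call `venv_pex_script("right/here")` directly instead of the PEX file.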

abundant-autumn-67998

12/04/2022, 8:00 AM

I did a quick experiment and it looks like `--venv=prepend` will "just work". Our code is split across two pex files. We run `source.pex` and specify `deps.pex` using `PEX_PATH`. The console scripts are provided by `deps.pex`. If I build `source.pex` with `--venv=prepend`, it looks like all scripts from both pexes are put in the venv's bin directory, which is added to the PATH.

enough-analyst-54434

12/05/2022, 2:45 AM

Aha, excellent.

abundant-autumn-67998

12/07/2022, 10:06 PM

We have not enabled `--venv=prepend` and I just want to make sure I understand all the implications. At runtime we invoke the pex with `subprocess`. It appears to unpack the dependencies into `~/.pex/unzipped_pexes/<hash>`. If we switch to doing `--venv=prepend` builds, I believe the pex can still be run the same way but will unpack the deps into `~/.pex/venvs/<hash>/.../site-packages` and also produce surrounding folders such as `bin` etc. We never directly look at `~/.pex`, so I'm assuming this will require no change at the run site and only a build flag change. Is this correct?

enough-analyst-54434

12/07/2022, 10:13 PM

So, `--venv`, `--venv prepend` and `--venv append` all create `~/.pex/venvs/...` and run from those. As part of the process of creating a venv on the 1st ever run, a `~/.pex/unzipped_pexes/...` entry happens to get made, but is never used again. The latter part is correct: `--venv *` nets you a `bin/` that includes console scripts; they just won't be on the `PATH` unless you use `{append,prepend}`. In that not-on-the-`PATH` case, you'd need something like:

subprocess.run(args=[PurePath(sys.executable).parent / "console_script_name", ...], ...)
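Spelled out a little further, as a sketch only: it assumes the code runs inside the `--venv` PEX, so `sys.executable` is the venv's interpreter and console scripts sit next to it. The helper names are hypothetical and `console_script_name` is a placeholder.

```python
import subprocess
import sys
from pathlib import Path


def venv_console_script(name):
    """Path of a console script in the same directory as the running interpreter."""
    return Path(sys.executable).parent / name


def run_venv_console_script(name, *args):
    """Run a sibling console script by absolute path, without relying on PATH."""
    return subprocess.run([str(venv_console_script(name)), *args], check=True)
```

For example, `run_venv_console_script("console_script_name", "--help")` would work whether or not `{append,prepend}` was used at build time.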

enough-analyst-54434

12/07/2022, 10:16 PM

If the help docs for `--venv` don't make that clear (I just re-read them and of course they seem clear to the author!), then I'd welcome a PR fixing those up. At some point here in the new year my stack will clear and I'll be circling back to documenting the `pex3 lock` tools and everything else in prep to cut Pex 3.x. The readthedocs is very old and ~not updated since the Pex 1.x days before I took on maintainership.

enough-analyst-54434

12/07/2022, 10:19 PM

@abundant-autumn-67998 I can't recall if I've pointed out scies before. Also, relatively soon PEXes will gain the option of including the Python interpreter, or including it lazily. You may have locked-down target machines, in which case you don't need this and plain old PEXes are fine, but in case not.

abundant-autumn-67998

12/07/2022, 11:48 PM

The `--venv` docs are clear and it will give us the feature we want (console scripts). I was just trying to see if turning it on may break us in some unexpected way because of some assumption we have that only exists during runtime in the non-venv case. I think it's a safe change though and will try it out.

abundant-autumn-67998

12/07/2022, 11:51 PM

`scie-jump` looks very interesting. (Though I don't think we need it in our current setup.)

enough-analyst-54434

12/08/2022, 12:18 AM

Gotcha. No, to the contrary, `--venv` (with `--venv-site-packages-copies`) gives you the closest thing to a fully normal hand-made venv and should be the most compatible with the most code. The normal `--venv` symlinks 3rdparty dists from the `~/.pex/installed_wheels/...` dirs into the venv `site-packages/` dir and this can confuse a few distributions out there in the wild; so you may need to turn on copies, which of course just wastes space that might otherwise be saved by re-using via symlink.

enough-analyst-54434

12/08/2022, 12:20 AM

The copies setting tries to hardlink, so it's not bad, but falls back to copies if you somehow have a mount point in the middle of the `~/.pex` tree.
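The hardlink-then-copy fallback described here is a common pattern. The following is a generic illustration of that pattern using the standard library, not Pex's actual implementation; the function name `link_or_copy` is hypothetical.

```python
import os
import shutil


def link_or_copy(src, dst):
    """Hardlink src to dst, falling back to a real copy if linking fails.

    Hardlinks cannot cross filesystems, so a mount point between src and dst
    makes os.link raise OSError and the copy path is taken instead.
    """
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        shutil.copy2(src, dst)
        return "copy"
```

Either way the destination ends up with the same contents; only the disk usage differs.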

abundant-autumn-67998

12/08/2022, 12:20 AM

most of our testing so far has been with pex files, but without '--venv' - so that's our tested+working state.

enough-analyst-54434

12/08/2022, 12:21 AM

The hardlink trees are still bigger than symlink trees.

enough-analyst-54434

12/08/2022, 12:21 AM

Ok - gotcha. Yeah, you're more likely to have already hit issues than you will be using --venv.

enough-analyst-54434

12/08/2022, 12:22 AM

The only reasons --venv is not the default are:
+ historical: this only was born in the last few years and I don't want to break existing users by changing defaults.
+ slower cold start.

enough-analyst-54434

12/08/2022, 12:22 AM

Much faster warm though!

enough-analyst-54434

12/08/2022, 12:23 AM

And I think cold starts average 2x slower than vanilla PEX.

abundant-autumn-67998

12/08/2022, 12:23 AM

i think links will be fine, we don't peek into or manage the pex unpacked dirs at all. i wonder if there will be an issue if the entry point or number of processes changes - i don't think we depend on that either but would be good to know.

enough-analyst-54434

12/08/2022, 12:24 AM

if the entry point or number of processes changes

I don't understand what that means.

abundant-autumn-67998

12/08/2022, 12:24 AM

Yeah, you're more likely to have already hit issues than you will be using --venv

indeed - the console scripts was one issue

abundant-autumn-67998

12/08/2022, 12:25 AM

like if the first script executed is different, or if there's an extra level of indirection in first process -> subprocess -> subprocess

enough-analyst-54434

12/08/2022, 12:27 AM

Ah, yeah, so the breakdown:
+ PEX zip: 1 re-exec into `~/.pex/unzipped_pexes` on every run.
+ PEX --venv: 1 re-exec into `~/.pex/unzipped_pexes` on 1st run to build the venv, then 1 re-exec into `~/.pex/venvs/...`. On 2+ run, just 1 re-exec straight into `~/.pex/venvs/...` unless it has been nuked.

abundant-autumn-67998

12/08/2022, 12:27 AM

ah excellent - yeah that should be fine.

abundant-autumn-67998

12/08/2022, 12:27 AM

btw one thing i'm still trying to figure out is better stack traces without the long directory names. i think the right solution there is to manage the venv directory ourselves. if we unpack it with `PEX_TOOLS` into a short venv directory, the stack traces will be nicer.

enough-analyst-54434

12/08/2022, 12:28 AM

Sure, yes.

enough-analyst-54434

12/08/2022, 12:28 AM

And by building yourself you save 50ms re-exec overhead into the venv.

enough-analyst-54434

12/08/2022, 12:29 AM

You can gain ~49ms back with `--sh-boot` but still ugly paths.

abundant-autumn-67998

12/08/2022, 12:29 AM

ah ok - because we can call the `venv/.../pex` directly?

enough-analyst-54434

12/08/2022, 12:29 AM

Correct. If you can do that; that's what you want.

👍 1

enough-analyst-54434

12/08/2022, 12:31 AM

The only reason you want to run the PEX file directly is for scp deploys with no other action needed.

abundant-autumn-67998

12/08/2022, 12:32 AM

it also made it easier by not having to manage the directories and their atomic, consistent creation.

abundant-autumn-67998

12/08/2022, 12:32 AM

(not super hard but one less thing to worry about)

enough-analyst-54434

12/08/2022, 12:32 AM

And on deploys. IIUC you use Pex over Docker for faster iteration / deployment. Were you aware of `--layout packed`? Do you use that layout + rsync?

abundant-autumn-67998

12/08/2022, 12:33 AM

we use pex zip because we need a repeatable snapshot and having a file is simple - we put it in s3. then any time we can run the file from s3. we put the pex hash in the file name.
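One hypothetical way to put a hash in the file name, sketched with the standard library. The thread does not say which hash is used, so SHA-256 over the file contents and the `<stem>-<hash>.pex` naming scheme are both assumptions for illustration.

```python
import hashlib
from pathlib import Path


def hashed_pex_name(pex_file, chunk_size=1 << 20):
    """Return a file name like `<stem>-<sha256>.pex` derived from the file contents."""
    digest = hashlib.sha256()
    with open(pex_file, "rb") as fp:
        # Read in chunks so large PEX files are not loaded into memory at once.
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    path = Path(pex_file)
    return f"{path.stem}-{digest.hexdigest()}{path.suffix}"
```

The same bytes always produce the same name, which is what makes the uploaded file a repeatable snapshot.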

enough-analyst-54434

12/08/2022, 12:33 AM

Ok.

abundant-autumn-67998

12/08/2022, 12:33 AM

basically instead of just the docker image tag, we now have docker image + pex tag.

enough-analyst-54434

12/08/2022, 12:33 AM

We had to introduce packed for Pants since shipping a whole PEX wasted space and time for remote (and local) caching in really observable ways.

enough-analyst-54434

12/08/2022, 12:36 AM

Pants is probably an odd case. It can generate 100s to 1000s of PEXes very quickly.

enough-analyst-54434

12/08/2022, 12:37 AM

And the overheads become clear quickly since it's user-interactive.

abundant-autumn-67998

12/08/2022, 12:38 AM

yeah i looked at different layouts - very useful but not for us (yet)

enough-analyst-54434

12/08/2022, 12:40 AM

Ok, great. As long as you're actively exploring the CLI help, you're in a good spot. It sounds like you've delved. If you do have questions, just continue to speak up. Those good high-level docs are coming, just not yet.

abundant-autumn-67998

12/08/2022, 12:42 AM

yeah i looked at pex once first and moved on because i didn't see too much in the docs. then i came back to it later and explored more. there's quite a bit of useful stuff in there.
