# general
r
Hi, what would be the way to debug this failing stage in a multi-stage Docker build with PEX?
ERROR [deps 4/4] RUN PEX_TOOLS=1 /usr/local/bin/python3.9 ./infra_plan_batch_job.pex venv     --scope=deps --compile ./infra_plan_batch_job
------
 > [deps 4/4] RUN PEX_TOOLS=1 /usr/local/bin/python3.9 ./infra_plan_batch_job.pex venv     --scope=deps --compile ./infra_plan_batch_job:
#19 33.66 received exit code 1 during execution of `['/usr/local/bin/python3.9', '-s', '-E', '-m', 'compileall', './infra_plan_batch_job']` while trying to execute `['/usr/local/bin/python3.9', '-s', '-E', '-m', 'compileall', './infra_plan_batch_job']`
------
executor failed running [/bin/sh -c PEX_TOOLS=1 /usr/local/bin/python3.9 ./infra_plan_batch_job.pex venv     --scope=deps --compile ./infra_plan_batch_job]: exit code: 1
It’s inspired by this recipe: https://pex.readthedocs.io/en/latest/recipes.html#pex-app-in-a-container
Is there something like verbose option?
I think some new dependency is breaking it but it works locally when I build and run the pex_binary outside of docker.
Actually running this step outside of docker also throws the same exact error which isn’t really helpful
@enough-analyst-54434 any pointers?
I found the package that triggers this error when I add it: it’s awswrangler. It can be reproduced using https://github.com/ShantanuKumar/pants-multi-poetry
./pants package src/package-a/package_a:pex_package_a
mkdir pex-deps-compiled
PEX_TOOLS=1 /usr/local/bin/python3.9 dist/src.package-a.package_a/pex_package_a.pex venv --scope=deps --compile ./pex-deps-compiled
e
Sorry for the delay @refined-addition-53644, I missed this. You can debug like this:
$ pex-deps-compiled/bin/python -mcompileall pex-deps-compiled 2>&1 | grep -C5 SyntaxError
Compiling 'pex-deps-compiled/lib/python3.9/site-packages/aenum/__init__.py'...
Compiling 'pex-deps-compiled/lib/python3.9/site-packages/aenum/_py2.py'...
***   File "pex-deps-compiled/lib/python3.9/site-packages/aenum/_py2.py", line 5
    raise exc, None, tb
             ^
SyntaxError: invalid syntax

Compiling 'pex-deps-compiled/lib/python3.9/site-packages/aenum/_py3.py'...
Listing 'pex-deps-compiled/lib/python3.9/site-packages/aenum/doc'...
Compiling 'pex-deps-compiled/lib/python3.9/site-packages/aenum/test.py'...
Compiling 'pex-deps-compiled/lib/python3.9/site-packages/aenum/test_v3.py'...
So, yeah. That suggests the `--compile` option probably ought to emit stderr on failure, but also probably ought not to fail. That `aenum/_py2.py` file is presumably only loaded by `aenum` when running under Python 2 anyway.
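If grepping the `compileall` output gets noisy, a small script can list just the files that fail to compile. This is only a sketch: `find_compile_failures` is a made-up helper name, not part of PEX, and it uses the built-in `compile()` rather than `compileall`, so it never writes `.pyc` files. Run it with the same interpreter the venv uses, since the result depends on the Python version.

```python
from pathlib import Path


def find_compile_failures(root):
    """Return (path, message) pairs for each .py file under root that does not compile.

    Hypothetical helper for illustration; it only parses and compiles in memory.
    """
    failures = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            # Passing bytes lets compile() honor any source encoding declaration.
            compile(path.read_bytes(), str(path), "exec")
        except (SyntaxError, ValueError) as exc:
            failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures
```

Calling `find_compile_failures("pex-deps-compiled")` and printing each pair should point straight at files like `aenum/_py2.py` that only parse under Python 2.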
@refined-addition-53644 is there any chance you want to take a swing at fixing this?: https://github.com/pantsbuild/pex/issues/2001
r
yeah I can give it a try
One thing I have realized is that the image from this kind of compiled multi-stage docker build is actually quite a bit bigger than the one that just uses a single PEX. Is this expected?
e
I think your accounting must be off. If you use a PEX (which is a compressed zip), it always extracts itself before running the 1st time, and so the total size is PEX file + PEX_ROOT/... items and that should definitely be larger.
This is a classic space vs. time tradeoff, pre-extraction and compilation trades for speed later.
It may be that you don't care about the extracted size, just the PEX size (really image size).
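To check the accounting, one option is to measure both footprints on disk after the PEX has run at least once. A rough sketch follows; `tree_size` and `compare_footprints` are hypothetical helper names, and the three paths are whatever your PEX file, your `PEX_ROOT` directory, and your pre-built venv directory happen to be.

```python
import os


def tree_size(path):
    """Total size in bytes of a file, or of all regular files under a directory.

    A path that does not exist counts as 0 bytes.
    """
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):  # skip symlinks so targets are not counted twice
                total += os.path.getsize(full)
    return total


def compare_footprints(pex_file, pex_root, venv_dir):
    """Compare 'PEX file + PEX_ROOT contents' against a pre-extracted venv."""
    return {
        "pex_plus_root": tree_size(pex_file) + tree_size(pex_root),
        "venv": tree_size(venv_dir),
    }
```

If only the PEX file size is counted, the single-PEX image will look smaller; the comparison above includes what gets extracted on first run.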
Docker images are both super cool and an indictment of the state of software
It's a huge hammer.
r
yeah that was my guess too. Although I was using `pex_binary` with these options
    layout = "packed",
    execution_mode = "venv",
I thought this leads to an unzipped PEX, right?
e
Not really. If you poke at the directory of the packed PEX, you'll see something like:
packed.pex/
  .bootstrap  # <- normally a zip directory with lots of files in it, now a zip file
  .deps/
    dependency1.whl # <- normally a zip directory with lots of files in it, now a zip file
    ...
  PEX-INFO
  __main__.py
  <your loose code>
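A quick way to see this for yourself is to classify the entries of the packed PEX directory. This is a minimal sketch based only on the layout listed above; `classify_packed_pex` is a made-up helper, and it looks only at the top level and at `.deps/`.

```python
import zipfile
from pathlib import Path


def classify_packed_pex(pex_dir):
    """Map each top-level and .deps/ entry to 'dir', 'zip', or 'file'."""
    root = Path(pex_dir)
    entries = list(root.iterdir())
    deps = root / ".deps"
    if deps.is_dir():
        entries.extend(deps.iterdir())
    kinds = {}
    for entry in sorted(entries):
        rel = entry.relative_to(root).as_posix()
        if entry.is_dir():
            kinds[rel] = "dir"
        elif zipfile.is_zipfile(str(entry)):
            kinds[rel] = "zip"  # a zipped-up bootstrap or wheel
        else:
            kinds[rel] = "file"
    return kinds
```

In a packed PEX you would expect `.bootstrap` and each `.deps/*.whl` entry to come back as `zip`, with `PEX-INFO`, `__main__.py`, and your own code as plain files or directories.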
r
yeah I did poke at the directory but most probably didn’t understand everything that’s in there. Thanks for the clarification
e
Basically packed is a hack for Pants. Others could make use of it, for example to rsync PEXes more efficiently, but the problem back over in Pants is that the LMDB store it uses for caching files just really can't handle either caching a whole PEX or a fully loose PEX with O(10k) files in a performant way. The former wastes a ton of space - each PEX file is totally different as a whole file, but may share tons of wheels with other PEX files - and the latter is just too many files to materialize from the LMDB in Pants sandboxes quickly. The overhead is huge. The packed format strikes a balance in the middle, where installed wheels are zipped up and stored as single files in LMDB, and thus shareable across PEXes that use the same wheel.
Ideally we'd be storing loose PEXes (another `--layout`) or else raw venvs in Pants' LMDB if it had no performance impact. That would be much simpler.
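The space side of the packed-layout tradeoff can be illustrated with a toy content-addressed store. To be clear, this is not how Pants' LMDB store is implemented; it is only a sketch of the idea that identical blobs are kept once, with made-up byte strings standing in for wheels and application code.

```python
import hashlib


def stored_bytes(pexes):
    """Bytes needed in a content-addressed store where identical blobs are kept once.

    `pexes` maps a PEX name to the list of blobs (bytes) stored for it.
    """
    unique = {}
    for blobs in pexes.values():
        for blob in blobs:
            unique[hashlib.sha256(blob).hexdigest()] = len(blob)
    return sum(unique.values())


# Stand-in data: one wheel shared by two PEXes, plus a little code unique to each.
shared_wheel = b"w" * 1000
app_a = b"a" * 10
app_b = b"b" * 10

# Whole-file PEXes: each PEX is one blob, so the shared wheel is stored twice.
whole = {"a.pex": [shared_wheel + app_a], "b.pex": [shared_wheel + app_b]}
# Packed PEXes: each wheel is its own blob, so the shared wheel is stored once.
packed = {"a.pex": [shared_wheel, app_a], "b.pex": [shared_wheel, app_b]}
```

Comparing `stored_bytes(whole)` with `stored_bytes(packed)` shows the saving from sharing the wheel, while the packed form still stores far fewer entries than a fully loose PEX would.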