Wondering how this should be handled. I packaging ...
# general
e
Wondering how this should be handled. I am packaging a Python `pex_binary` in a `docker_image`. My Python application includes a Python package (`pybedtools`) which has some native components. My dev system is Ubuntu 21.04. The docker image base is python:3.8.12-slim. When I package the pex, it builds components of `pybedtools` against the GLIBC 2.33 on my dev system, which then fails inside the docker image where 2.31 is installed:
root@e107dc414f9c:/color# python3 $PEX_BINARY 
...
  File "/root/.pex/installed_wheels/bf89cafd4094cc8d47719f572e0edb81d1c39ef9/pybedtools-0.8.2-cp38-cp38-linux_x86_64.whl/pybedtools/__init__.py", line 9, in <module>
    from .cbedtools import (
ImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.33' not found (required by /root/.pex/installed_wheels/bf89cafd4094cc8d47719f572e0edb81d1c39ef9/pybedtools-0.8.2-cp38-cp38-linux_x86_64.whl/pybedtools/cbedtools.cpython-38-x86_64-linux-gnu.so)
root@e107dc414f9c:/color# ls -lah /lib/x86_64-linux-gnu/libc.so.6 
lrwxrwxrwx 1 root root 12 Oct  2 12:47 /lib/x86_64-linux-gnu/libc.so.6 -> libc-2.31.so
Possible solutions (none perfect):
1. Match my docker image base to my dev machine? (brittle)
2. Run `./pants package path/to/Dockerfile` in a docker image? (gets convoluted fast)
3. Rebuild wheels somehow at docker image build time or runtime? (unclear how)
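A minimal sketch of how the mismatch above could be detected before a PEX is shipped into an image. The function names are hypothetical; `os.confstr("CS_GNU_LIBC_VERSION")` is only available on glibc-based systems, so the lookup falls back to `None` elsewhere.

```python
import os
import re
from typing import Optional, Tuple


def parse_glibc_version(raw: str) -> Optional[Tuple[int, int]]:
    """Parse strings such as "glibc 2.31" or "GLIBC_2.33" into (major, minor)."""
    match = re.search(r"(\d+)\.(\d+)", raw)
    return (int(match.group(1)), int(match.group(2))) if match else None


def running_glibc_version() -> Optional[Tuple[int, int]]:
    """Return the glibc version of the current system, or None if it is not glibc."""
    try:
        raw = os.confstr("CS_GNU_LIBC_VERSION")  # e.g. "glibc 2.31"
    except (AttributeError, OSError, ValueError):
        return None
    return parse_glibc_version(raw) if raw else None


def glibc_satisfies(required: str, available: str) -> bool:
    """True if the available glibc is at least the version a native library requires."""
    needed = parse_glibc_version(required)
    have = parse_glibc_version(available)
    if needed is None or have is None:
        raise ValueError("could not parse a glibc version")
    # Compare as integer tuples so that 2.4 sorts below 2.31.
    return have >= needed
```

With the values from the error above, `glibc_satisfies("GLIBC_2.33", "glibc 2.31")` is false, which is exactly the failure the `ImportError` reports.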
e
Option 2 we want to eventually natively support. The other option is to pre-build wheels in the same docker base image (using `pip wheel ...`, say) and serve them up in a directory you let Pants see via:
[python-repos]
repos = [
  "file:///network/mount/wheel/house",
  "https://central.server/wheel/house"
]
And then add `platforms` to your `pex_binary` target: https://www.pantsbuild.org/docs/reference-pex_binary#codeplatformscode
I can explain more about this 4th option if that seems like the best one. I'll hold back otherwise.
e
Hmm, until #2 is built into pants it seems like #4 is the only way to go. We will want devs and CI to all be able to build/run this image reproducibly.
What should the platform for something like this be? glibc version is not one of PLATFORM-IMPL-PYVER-ABI
e
The easiest thing to do is install pex in the image and run:
$ pex --platform help
Traceback (most recent call last):
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/platforms.py", line 48, in create
    platform, impl, version, abi = platform.rsplit(cls.SEP, 3)
ValueError: not enough values to unpack (expected 4, got 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/resolve/target_options.py", line 212, in configure
    convert_platforms(options.platforms)
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/resolve/target_configuration.py", line 31, in convert_platforms
    return tuple(_parsed_platform(platform) for platform in platforms or ()) if platforms else ()
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/resolve/target_configuration.py", line 31, in <genexpr>
    return tuple(_parsed_platform(platform) for platform in platforms or ()) if platforms else ()
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/resolve/target_configuration.py", line 26, in _parsed_platform
    return Platform.create(platform) if platform and platform != "current" else None
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/platforms.py", line 51, in create
    raise cls.InvalidPlatformError(
pex.platforms.InvalidPlatformError: Not a valid platform specifier: help

Platform strings must be in one of two forms:
1. Canonical: <platform>-<python impl abbr>-<python version>-<abi>
2. Abbreviated: <platform>-<python impl abbr>-<python version>-<abbr abi>

Given a canonical platform string for CPython 3.7.5 running on 64 bit linux of:
  linux-x86_64-cp-37-cp37m

Where the fields above are:
+ <platform>: linux-x86_64 
+ <python impl abbr>: cp
+ <python version>: 37
+ <abi>: cp37m

The abbreviated platform string is:
  linux-x86_64-cp-37-m

Some other canonical platform string examples:
+ OSX CPython: macosx-10.13-x86_64-cp-36-cp36m
+ Linux PyPy: linux-x86_64-pp-273-pypy_73.

These fields stem from wheel name conventions as outlined in
https://www.python.org/dev/peps/pep-0427#file-name-convention and influenced by
https://www.python.org/dev/peps/pep-0425.


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jsirois/.venv/pex/bin/pex", line 8, in <module>
    sys.exit(main())
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/bin/pex.py", line 680, in main
    target_configuration = target_options.configure(options)
  File "/home/jsirois/.venv/pex/lib/python3.10/site-packages/pex/resolve/target_options.py", line 215, in configure
    raise ArgumentTypeError(str(e))
argparse.ArgumentTypeError: Not a valid platform specifier: help

Platform strings must be in one of two forms:
1. Canonical: <platform>-<python impl abbr>-<python version>-<abi>
2. Abbreviated: <platform>-<python impl abbr>-<python version>-<abbr abi>

Given a canonical platform string for CPython 3.7.5 running on 64 bit linux of:
  linux-x86_64-cp-37-cp37m

Where the fields above are:
+ <platform>: linux-x86_64 
+ <python impl abbr>: cp
+ <python version>: 37
+ <abi>: cp37m

The abbreviated platform string is:
  linux-x86_64-cp-37-m

Some other canonical platform string examples:
+ OSX CPython: macosx-10.13-x86_64-cp-36-cp36m
+ Linux PyPy: linux-x86_64-pp-273-pypy_73.

These fields stem from wheel name conventions as outlined in
https://www.python.org/dev/peps/pep-0427#file-name-convention and influenced by
https://www.python.org/dev/peps/pep-0425.
Instead of "installing" you can just curl the Pex PEX and run that: https://github.com/pantsbuild/pex/releases/download/v2.1.62/pex
e
I have pex installed in my image, but I don't understand what `pex --platform help` is doing for me.
e
Ah, remembered that wrong. That's just general help. Instead, run `pex --help` and read the `--platform` help string. It will tell you what your current platform is.
FYI, fwict the bugs to track this issue / #2 approach are: https://github.com/pantsbuild/pants/issues/13682 https://github.com/pantsbuild/pants/issues/13185 Neither hits this head on, but we definitely have informally talked about generally being able to run pants in a sidecar container.
e
Ahh I see. The full platform string is "manylinux_2_31_x86_64-cp-38-cp38", which includes the glibc version
For the next person who finds this, an even easier way to see it is
root@a28014a7154c:/color# PEX_TOOLS=1 pex interpreter --verbose
{"path": "/usr/local/bin/python3.8", "requirement": "CPython==3.8.12", "platform": "manylinux_2_31_x86_64-cp-38-cp38"}
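A small sketch of how such a platform string breaks down, assuming the `<platform>-<python impl abbr>-<python version>-<abi>` form from the Pex help text above. The class and function names are hypothetical, and the glibc extraction assumes the `manylinux_<major>_<minor>_<arch>` tag naming.

```python
import re
from typing import NamedTuple, Optional, Tuple


class PexPlatform(NamedTuple):
    platform: str
    impl: str
    version: str
    abi: str


def parse_platform(spec: str) -> PexPlatform:
    """Split a platform string into its four fields."""
    # Split from the right: <platform> itself may contain hyphens (linux-x86_64).
    parts = spec.rsplit("-", 3)
    if len(parts) != 4:
        raise ValueError(f"Not a valid platform specifier: {spec}")
    return PexPlatform(*parts)


def manylinux_glibc(platform: str) -> Optional[Tuple[int, int]]:
    """Return the glibc (major, minor) encoded in a manylinux_X_Y tag, else None."""
    match = re.match(r"manylinux_(\d+)_(\d+)_", platform)
    return (int(match.group(1)), int(match.group(2))) if match else None


parsed = parse_platform("manylinux_2_31_x86_64-cp-38-cp38")
print(parsed.platform, manylinux_glibc(parsed.platform))
```

Here the `platform` field is `manylinux_2_31_x86_64`, and the glibc version 2.31 is carried inside it rather than as a separate field.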
@loud-laptop-17949’s issue regarding the env tests run in is definitely related and probably solved by the same underlying support-to-be in pants. Being able to specify a docker image that the pants packaging process should be run in would be cool. Esp if that is inferable by the base image of the docker_image we are building the pex for.
e
Agreed.
e
Assuming that support is several months away at least do you have any suggestions for building these wheels in the short term? I assume also in docker
e
If you have pex installed you probably have pex-tools installed and can just `pex-tools interpreter --verbose` (I wrote this stuff and I forgot about it!).
Ah, that may be a lie.
> Assuming that support is several months away at least do you have any suggestions for building these wheels in the short term? I assume also in docker
Yeah, just `pip wheel --wheel-dir /here` where `/here/` is a volume you can get at from outside the container. Basically, docker aside, you want to use a production image machine for each production arch you have in the fleet (hopefully not too many!) to build all sdists you need into wheels. You then want to serve up all those wheels in either a flat network mounted directory or via an http(s) server with an index page.
Large orgs I know about generally had a service that looked at a requirements.txt or several, and did all this in an automated way so that you could change deps in requirements files and push that commit and get wheels available for all prod platforms in O(10 minutes).
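A minimal sketch of the wheel-building step, meant to be run inside the production base image. The helper names and file paths are hypothetical; `--wheel-dir`, `-r` and `-c` are standard `pip wheel` options.

```python
import subprocess
import sys
from typing import List, Optional


def pip_wheel_command(
    requirements: str, wheel_dir: str, constraints: Optional[str] = None
) -> List[str]:
    """Build the `pip wheel` invocation for one requirements file."""
    cmd = [
        sys.executable, "-m", "pip", "wheel",
        "--wheel-dir", wheel_dir,  # e.g. a volume mounted from outside the container
        "-r", requirements,
    ]
    if constraints:
        cmd += ["-c", constraints]  # pins, if they are kept in a constraints file
    return cmd


def build_wheels(
    requirements: str, wheel_dir: str, constraints: Optional[str] = None
) -> None:
    """Run pip and raise CalledProcessError if any wheel fails to build."""
    subprocess.run(pip_wheel_command(requirements, wheel_dir, constraints), check=True)
```

Calling `build_wheels("requirements.txt", "/here")` in a container that has `/here` mounted as a volume would leave the built wheels visible on the host.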
e
Is there a way to get the rest of the args for `pip wheel` (package names and pegged versions from constraints) from the pex or from pants?
e
If you want to stick to Pex tooling instead of Pip, you could run these in the container:
pex [requirements] --include-tools -orepo.pex
PEX_TOOLS=1 ./repo.pex repository extract --repo /here/
e
All right, well I will dig more into this tomorrow. Thanks for all the help
e
> Is there a way to get the rest of the args for `pip wheel` (package names and pegged versions from constraints) from the pex or from pants?
The original requirements are in `PEX_TOOLS=1 ./my.pex info | jq .requirements`. The pins are a bit harder to get, but you can look at the `jq .distributions` keys, which are the actual wheel names.
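A rough sketch of recovering pins from those wheel names, assuming the `info` output is JSON whose `distributions` entry is keyed by wheel file name, as described above. The function names are hypothetical, and the `six` entry in the sample is invented for illustration.

```python
import json
from typing import List


def pin_from_wheel_name(wheel_name: str) -> str:
    """Turn 'pybedtools-0.8.2-cp38-cp38-linux_x86_64.whl' into 'pybedtools==0.8.2'."""
    if not wheel_name.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {wheel_name}")
    stem = wheel_name[: -len(".whl")]
    # PEP 427 names are {distribution}-{version}(-{build})?-{python}-{abi}-{platform};
    # hyphens inside the distribution name are escaped to underscores, so the
    # first two hyphen-separated fields are always name and version.
    name, version = stem.split("-")[:2]
    return f"{name}=={version}"


def pins_from_pex_info(info_json: str) -> List[str]:
    """Extract sorted name==version pins from `PEX_TOOLS=1 ./my.pex info` output."""
    info = json.loads(info_json)
    return sorted(pin_from_wheel_name(name) for name in info.get("distributions", {}))


sample = json.dumps({
    "requirements": ["pybedtools"],
    "distributions": {
        "pybedtools-0.8.2-cp38-cp38-linux_x86_64.whl": "bf89cafd4094cc8d47719f572e0edb81d1c39ef9",
        "six-1.16.0-py2.py3-none-any.whl": "0" * 40,  # illustrative entry only
    },
})
print("\n".join(pins_from_pex_info(sample)))
```

The resulting lines could be written to a constraints file and passed to `pip wheel` with `-c`.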
e
Just trying to close the loops somehow.
1. pex built locally
2. Dockerfile copies in pex from pants build context
3. First stage of Dockerfile runs some magic set of commands to rebuild the pex with wheels that are compatible with the env
4. Final stage of Dockerfile copies in the new pex and runs it
It would be nice to not have to parse the versions out of the wheel paths, but I'll figure something out
Hmmm, I guess my above scheme would end up rebuilding all the wheels every time anything in the pex changed 😕
e
Yeah. Your steps, though, are not how I've seen this done. I've seen:
1. Clone the repo on a target arch machine.
2. Build all the requirements in the repo on the target arch machine and serve them up.
3. Back on the dev machine, with pants configured to see the --find-links repo populated by step 2, build a PEX leveraging platforms.
4. Copy the pex built locally, but for the target platform, into the target platform.
You should definitely be able to get your suggested flow working though.
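For the "serve up" part of step 2, a flat directory of wheels plus an index page of links is usually enough for a find-links style repo. The sketch below is one way to generate that page; the function name is hypothetical and the HTML layout is an assumption, not something Pants or Pex prescribes.

```python
import html
from pathlib import Path
from urllib.parse import quote


def write_find_links_index(wheel_dir: str) -> Path:
    """Write an index.html that links to every wheel in wheel_dir."""
    directory = Path(wheel_dir)
    wheels = sorted(path.name for path in directory.glob("*.whl"))
    links = "\n".join(
        f'<a href="{quote(name)}">{html.escape(name)}</a><br>' for name in wheels
    )
    index = directory / "index.html"
    index.write_text(
        f"<!DOCTYPE html>\n<html><body>\n{links}\n</body></html>\n",
        encoding="utf-8",
    )
    return index
```

For a quick local trial, running `python -m http.server` inside the wheel directory would serve both the page and the wheels over HTTP.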
w
i’ve just opened https://github.com/pantsbuild/pants/issues/14145 for native support for cross-building dists to be used in PEXes/docker-images if that’s something that would be helpful for you, make sure to vote for it in the surveying we’re doing over the next week (i already went ahead and submitted it as an idea): https://groups.google.com/g/pants-devel/c/UFt3Os--6ps/m/FCjGTnlRBQAJ
e
I will! thank you
For anyone finding this I ended up going with running pants in a docker image. So far it is working. `pants.toml`:
[docker]
build_args = [
    "BASE_PYTHON_IMAGE=python:3.8.12-slim"
]
`build/pants/docker_packager/Dockerfile`:
ARG BASE_PYTHON_IMAGE
FROM ${BASE_PYTHON_IMAGE}

RUN sed -i -E 'p; s/^deb /deb-src /' /etc/apt/sources.list && \
    apt-get update && \
    apt-get install -y curl git-core docker.io unzip build-essential python3-distutils && \
    apt-get -y build-dep bedtools

WORKDIR /src
ENV COLOR_ROOT /src
ENV PANTS_PYTHON_BOOTSTRAP_SEARCH_PATH="['<PATH>']"

ENTRYPOINT ["./pants", "package"]
`build/pants/docker_packager/packager.sh`:
#!/bin/bash -eux

# Runs the pants package goal on a target from inside a docker container based on BASE_PYTHON_IMAGE.
# Usage example:
# $ build/pants/docker_packager/packager.sh <target>
#
# This is done so that wheels will be built on a platform that matches the final environment. Eventually pants will support this natively.
# https://github.com/pantsbuild/pants/issues/13682
# https://pantsbuild.slack.com/archives/C046T6T9U/p1641875149141400

cd "$SRC_ROOT"

./pants \
    run \
    --docker-run-args="-v $SRC_ROOT:/src -v pants_builder_cache:/root/.cache -v /var/run/docker.sock:/var/run/docker.sock -e COMMIT_SHA" \
    build/pants/docker_packager/Dockerfile \
    -- \
    "$@"