rich-london-74860
03/03/2023, 9:50 PM
When I run ./pants test :: in a docker container:
02:23:24.27 [ERROR] 1 Exception encountered:
ProcessExecutionFailure: Process 'Building 17 requirements for requirements.pex from the build-support/databricks_lock.txt resolve: boto3==1.16.7, enigma-namedframes~=1.0.2, matplotlib==3.4.2, mlflow==1.20.2, numpy<1.24,>=1.20, pandas==1.2.4, plotly==5.1.0, probablepeople, protobuf==3.17.2, pyarrow==4.0.0, pyspark-test, pyspark==3.1.2, pytest, scikit-learn==0.24.1, scipy~=1.6.0, tldextract, types-setuptools' failed with exit code 1.
stdout:
stderr:
Build of BuildRequest(target=LocalInterpreter(id='usr.bin.python3.8', platform=Platform(platform='manylinux_2_27_x86_64', impl='cp', version='3.8.0', version_info=(3, 8, 0), abi='cp38'), marker_environment=MarkerEnvironment(implementation_name='cpython', implementation_version='3.8.0', os_name='posix', platform_machine='x86_64', platform_python_implementation='CPython', platform_release='5.15.49-linuxkit', platform_system='Linux', platform_version='#1 SMP Tue Sep 13 07:51:46 UTC 2022', python_full_version='3.8.0', python_version='3.8', sys_platform='linux'), interpreter=PythonInterpreter('/usr/bin/python3.8', PythonIdentity('/usr/bin/python3.8', 'cp38', 'cp38', 'manylinux_2_27_x86_64', (3, 8, 0)))), source_path='/root/.cache/pants/named_caches/pex_root/downloads/5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65/pyspark-3.1.2.tar.gz', fingerprint='5e25ebb18756e9715f4d26848cc7e558035025da74b4fc325a0ebc05ff538e65') produced 2 artifacts; expected 1:
0. cp38-cp38-manylinux_2_27_x86_64.3ec80d51ce26438a86c97b71d562e96e
1. pyspark-3.1.2-py2.py3-none-any.whl
It looks like there is some function that is expected to create a single artifact (likely 1 above), but winds up creating 2 artifacts in this environment.
If I include the parameter --keep-sandboxes=on_failure, then it preserves a directory with the following files:
./__run.sh
./source_files
./.tmp
./pex
./.cache
./.cache/pex_root
./build-support
./build-support/databricks_lock.txt
__run.sh includes this command:
/usr/bin/python3.8 ./pex --tmpdir .tmp --jobs 6 --python-path $'/databricks/python3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' --output-file requirements.pex --no-emit-warnings --python /usr/bin/python3.8 $'--sources-directory=source_files' $'boto3==1.16.7' $'enigma-namedframes~=1.0.2' $'matplotlib==3.4.2' $'mlflow==1.20.2' $'numpy<1.24,>=1.20' $'pandas==1.2.4' $'plotly==5.1.0' probablepeople $'protobuf==3.17.2' $'pyarrow==4.0.0' pyspark-test $'pyspark==3.1.2' pytest $'scikit-learn==0.24.1' $'scipy~=1.6.0' tldextract types-setuptools --lock build-support/databricks_lock.txt --no-pypi $'--index=<https://pypi.org/simple/>' $'--index=https://*****:*****@**********/pypi/pypi-local/simple' --manylinux manylinux2014 --layout packed
which outputs the stderr message from above.
Although this fails in a docker container locally, this same command in the same docker image works in CI/CD.
No one else that I work with has reported the same error and I have tried clearing all of my caches.
Installing all of the dependencies listed in that command in __run.sh with pip install works.
Lastly, this all used to work for me until very recently (I think earlier this week).
Any thoughts?
enough-analyst-54434
03/03/2023, 10:23 PM
Can you `rm -rf` the directory PEX_ROOT is set to in __run.sh and then run PEX_VERBOSE=9 ./__run.sh and provide the full output?
$ pyenv install 3.8.0
Downloading Python-3.8.0.tar.xz...
-> <https://www.python.org/ftp/python/3.8.0/Python-3.8.0.tar.xz>
Installing Python-3.8.0...
Installed Python-3.8.0 to /home/jsirois/.pyenv/versions/3.8.0
$ pex --python ~/.pyenv/versions/3.8.0/bin/python "pyspark==3.1.2" --no-binary --intransitive --ignore-errors -o pyspark-no-deps.pex
# The error is expected - I built the PEX with no deps, but the backtrace proves that pyspark comes from a "wheel" within the PEX.
$ ~/.pyenv/versions/3.8.0/bin/python pyspark-no-deps.pex -c 'import pyspark'
Traceback (most recent call last):
  File "/home/jsirois/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/jsirois/.pyenv/versions/3.8.0/lib/python3.8/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/__main__.py", line 106, in <module>
    bootstrap_pex(__entry_point__, execute=__execute__, venv_dir=__venv_dir__)
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex_bootstrapper.py", line 615, in bootstrap_pex
    pex.PEX(entry_point).execute()
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 560, in execute
    sys.exit(self._wrap_coverage(self._wrap_profiling, self._execute))
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 467, in _wrap_coverage
    return runner(*args)
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 498, in _wrap_profiling
    return runner(*args)
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 581, in _execute
    return self.execute_interpreter()
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 663, in execute_interpreter
    return self.execute_content("-c <cmd>", content, argv0="-c")
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 774, in execute_content
    return cls.execute_ast(name, program, argv0=argv0)
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/pex.py", line 792, in execute_ast
    exec_function(program, globals_map)
  File "/home/jsirois/.pex/unzipped_pexes/64a6095bab8c21d882462ac5fe40fb4c06230499/.bootstrap/pex/compatibility.py", line 109, in exec_function
    exec (ast, globals_map, locals_map)
  File "-c <cmd>", line 1, in <module>
  File "/home/jsirois/.pex/installed_wheels/9490046e8f900d1d1cadb2ac3da6151c739a01c031079b8a4760254dba0ed3bd/pyspark-3.1.2-py2.py3-none-any.whl/pyspark/__init__.py", line 53, in <module>
    from pyspark.rdd import RDD, RDDBarrier
  File "/home/jsirois/.pex/installed_wheels/9490046e8f900d1d1cadb2ac3da6151c739a01c031079b8a4760254dba0ed3bd/pyspark-3.1.2-py2.py3-none-any.whl/pyspark/rdd.py", line 34, in <module>
    from pyspark.java_gateway import local_connect_and_auth
  File "/home/jsirois/.pex/installed_wheels/9490046e8f900d1d1cadb2ac3da6151c739a01c031079b8a4760254dba0ed3bd/pyspark-3.1.2-py2.py3-none-any.whl/pyspark/java_gateway.py", line 29, in <module>
    from py4j.java_gateway import java_import, JavaGateway, JavaObject, GatewayParameters
ModuleNotFoundError: No module named 'py4j'
rich-london-74860
03/04/2023, 1:26 AM
PEX_VERBOSE=9 ./__run.sh
FWIW 3.8.0 is an odd / scary version of Python to be using. Both very old and very new; i.e. 3.8 is all the way up to 3.8.16 - there have been many bugs fixed IOW.
This is in fact not the first time that you’ve brought up the problems with 3.8 to me 😆 https://pantsbuild.slack.com/archives/C046T6T9U/p1676431474327309?thread_ts=1676429638.205149&cid=C046T6T9U Unfortunately, moving off of 3.8 would be a large endeavor.
enough-analyst-54434
03/04/2023, 1:55 AM
rich-london-74860
03/04/2023, 2:03 AM
enough-analyst-54434
03/04/2023, 2:04 AM
rich-london-74860
03/04/2023, 2:09 AM
When I run ./pants test :: I get the error.
If I cd to the sandbox temp directory and run __run.sh, then the error also happens.
If I do this: "Can you `rm -rf` the directory PEX_ROOT is set to in __run.sh and then run __run.sh", it works.
enough-analyst-54434
03/04/2023, 2:11 AM
rm -rf the PEX_ROOT) then, but include an export of PEX_VERBOSE=9? Basically I'd like to see the error you're seeing, but with more detail. That's all I'm aiming for here.
rich-london-74860
03/04/2023, 2:16 AM
PEX_ROOT
enough-analyst-54434
03/04/2023, 2:19 AM
pants.toml and re-running though?:
[pex-cli]
version = "v2.1.125"
known_versions = [
"v2.1.125|macos_arm64|1da1ef933429f15b218c98c6b960f30adfd0221fc5284c1d8facac09923692f8|4080732",
"v2.1.125|macos_x86_64|1da1ef933429f15b218c98c6b960f30adfd0221fc5284c1d8facac09923692f8|4080732",
"v2.1.125|linux_x86_64|1da1ef933429f15b218c98c6b960f30adfd0221fc5284c1d8facac09923692f8|4080732",
"v2.1.125|linux_arm64|1da1ef933429f15b218c98c6b960f30adfd0221fc5284c1d8facac09923692f8|4080732"
]
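Each known_versions entry above appears to follow a version|platform|sha256|size layout, which is how I understand Pants pins tool downloads. Here is a minimal Python sketch of reading one entry and checking a downloaded file against it. KnownVersion, parse_known_version and matches are hypothetical names for illustration, not Pants APIs, and this is not the code Pants itself runs.

```python
import hashlib
from typing import NamedTuple


class KnownVersion(NamedTuple):
    """One known_versions entry, split into its four fields (hypothetical helper type)."""
    version: str
    platform: str
    sha256: str
    size: int


def parse_known_version(entry: str) -> KnownVersion:
    # Assumed layout: "version|platform|sha256|size-in-bytes".
    version, platform, sha256, size = entry.split("|")
    return KnownVersion(version, platform, sha256, int(size))


def matches(known: KnownVersion, content: bytes) -> bool:
    # A download is accepted only if both its length and its digest agree.
    return (
        len(content) == known.size
        and hashlib.sha256(content).hexdigest() == known.sha256
    )


entry = parse_known_version(
    "v2.1.125|linux_x86_64|"
    "1da1ef933429f15b218c98c6b960f30adfd0221fc5284c1d8facac09923692f8|4080732"
)
print(entry.version, entry.platform, entry.size)
```

The same digest and size appear for every platform in the config, which is consistent with the Pex release being a single platform-independent file.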
This will just rule out some Pex fix between 2.1.108 and now. I don't remember anything related, but it seems worth a quick shot if you're game to try that.
Yes, and as I mentioned in that thread, we do not really have a choice in the matter, this is the python version set by the platform we are using.
Sorry about that. I forgot that context and skimmed quick. I didn't realize Databricks was not only stuck on 3.8, but stuck on 3.8.0!
rich-london-74860
03/04/2023, 2:37 AM
pex-cli configuration, but it doesn’t seem to make a difference
enough-analyst-54434
03/04/2023, 2:39 AM
cp38-cp38-manylinux_2_27_x86_64.3ec80d51ce26438a86c97b71d562e96e) of the racing process is visible when the failing process goes to collect its wheel. That code naively assumes the directory will be empty save for the built wheel - not taking into account racing process sibling workdirs. I'll file an issue here shortly and get out a fix.
Thanks for finding this one. It's very old - goes back to ~2018 fall; so I'm surprised no one has hit this yet!
pants.toml with:
[pex-cli]
version = "v2.1.126"
known_versions = [
"v2.1.126|macos_arm64|3bfd60f037b2edd4149067266536e37b4c67263d0db681e492e6071cb1a9adda|4080751",
"v2.1.126|macos_x86_64|3bfd60f037b2edd4149067266536e37b4c67263d0db681e492e6071cb1a9adda|4080751",
"v2.1.126|linux_x86_64|3bfd60f037b2edd4149067266536e37b4c67263d0db681e492e6071cb1a9adda|4080751",
"v2.1.126|linux_arm64|3bfd60f037b2edd4149067266536e37b4c67263d0db681e492e6071cb1a9adda|4080751"
]
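For anyone reading later, the race described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not Pex's actual code and not necessarily how the 2.1.126 fix works. The function names are hypothetical; the two directory entries are the ones from the error message.

```python
import os
import tempfile


def collect_wheel_naive(dist_dir):
    """Fragile: assumes the built wheel is the only entry in dist_dir."""
    entries = sorted(os.listdir(dist_dir))
    if len(entries) != 1:
        raise RuntimeError(
            "produced {} artifacts; expected 1: {}".format(len(entries), entries)
        )
    return entries[0]


def collect_wheel_filtered(dist_dir):
    """More robust: only counts *.whl files, ignoring sibling work directories."""
    wheels = sorted(
        name
        for name in os.listdir(dist_dir)
        if name.endswith(".whl") and os.path.isfile(os.path.join(dist_dir, name))
    )
    if len(wheels) != 1:
        raise RuntimeError(
            "produced {} wheels; expected 1: {}".format(len(wheels), wheels)
        )
    return wheels[0]


def simulate_race():
    """Recreate the directory state a losing process would observe."""
    with tempfile.TemporaryDirectory() as dist_dir:
        # The wheel this process just built.
        wheel = os.path.join(dist_dir, "pyspark-3.1.2-py2.py3-none-any.whl")
        open(wheel, "w").close()
        # A work directory belonging to a concurrent (racing) process.
        os.mkdir(
            os.path.join(
                dist_dir,
                "cp38-cp38-manylinux_2_27_x86_64.3ec80d51ce26438a86c97b71d562e96e",
            )
        )
        try:
            collect_wheel_naive(dist_dir)
            naive_failed = False
        except RuntimeError:
            naive_failed = True
        return naive_failed, collect_wheel_filtered(dist_dir)


naive_failed, wheel_name = simulate_race()
print(naive_failed, wheel_name)
```

The naive collector sees two entries and raises, mirroring the "produced 2 artifacts; expected 1" failure, while the filtered one still finds the single wheel.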
rich-london-74860
03/04/2023, 9:47 PM