nice-florist-55958
02/17/2022, 11:58 PM
18:30:01 SHA256 fingerprint of file:///var/tmp/pjenkslv/jenkins/workspace/casper-codetree-release-lib-ANY/casper/codetree/codetree/src/pantsbuild/pex verified.
18:30:01 Preparinng bootstrap with initial requirements
18:30:01 /ms/dist/python/PROJ/core/3.7.5/exec/bin/python: can't find '__main__' module in '/var/tmp/emmdev/.pex/unzipped_pexes/478cc1fa371ca40aa3e7dafee735ca438d4a243f'
The specific line is:
"${python}" "${pex_path}" --cert="${PIP_CERT}" -r requirements.txt -c virtualenv -o virtualenv.pex --retries=1 --timeout=1 --index-url="${PIP_INDEX_URL}"
The extra arguments are our own to make downloading packages work, but this is in the bootstrap_virtualenv function, and it had worked fine until just recently.
It's very hard to diagnose problems on the CI host, so at the moment I can't tell if this is Pants/Pex related or some problem with the host...
enough-analyst-54434
02/18/2022, 12:37 AM
If I run python . and I have __main__.py in the current directory, it runs. If I rename it to __not_main__.py I get your error:
$ python .
Hello!
$ mv __main__.py __not_main__.py
$ python .
/home/jsirois/.pyenv/versions/3.10.2/bin/python: can't find '__main__' module in '/tmp/test/.'
$
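The shell session above can also be replayed from a script, which may be easier to drop into a CI job than an interactive shell. This is a minimal sketch under assumptions: the helper names `run_directory` and `demo` are made up for illustration, and it uses a throwaway temporary directory rather than the real PEX root.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_directory(directory):
    """Run `python <directory>`, which executes <directory>/__main__.py."""
    return subprocess.run(
        [sys.executable, str(directory)],
        capture_output=True,
        text=True,
    )


def demo():
    """Run a directory with __main__.py present, then again after renaming it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        main = root / "__main__.py"
        main.write_text('print("Hello!")\n')
        ok = run_directory(root)

        # Same directory, but the entry point is no longer named __main__.py.
        main.rename(root / "__not_main__.py")
        failed = run_directory(root)
        return ok, failed


if __name__ == "__main__":
    ok, failed = demo()
    print(ok.returncode, ok.stdout.strip())
    print(failed.returncode, failed.stderr.strip())
```

The second run exits non-zero and its stderr carries the same "can't find '__main__' module" message as the CI log, which suggests the unzipped PEX directory existed but was missing its __main__.py when python was pointed at it.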
Is it true that /var/tmp/emmdev is the CI user's home dir and that it is in fact temporary and re-populated on each CI run?
nice-florist-55958
02/18/2022, 4:06 AM
22:56:37 SHA256 fingerprint of file:///var/tmp/pjenkslv/jenkins/workspace/casper-codetree-release-lib-ANY/casper/codetree/codetree/src/pantsbuild/pex verified.
22:56:37 Preparinng bootstrap with initial requirements
22:56:37 PEX PATH: /var/tmp/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/pex-2.1.42/pex
22:56:37 /var/tmp/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/pex-2.1.42/pex
22:57:48 /var/tmp/emmdev/.pex/unzipped_pexes/478cc1fa371ca40aa3e7dafee735ca438d4a243f/.deps/pex-2.1.42-py2.py3-none-any.whl/pex/tools/commands/venv.py:141: PEXWarning: Encountered collision building venv at /var/tmp/emmdev/.pex/venvs/short/c9600bda from /var/tmp/emmdev/.pex/pip.pex/46820cb5af0dcf9295a4e7f30184cc0e9fa063dc:
22:57:48 1. /var/tmp/emmdev/.pex/venvs/720739c5b08326cc23c9ac0b68c11307ad60aca3/1fd650467e13c9fc5e0f7b7915a685aa6aec963f.02df45dac6084a0a97a4934629166fe7/lib/python3.7/site-packages/constraints.txt was provided by:
22:57:48 /var/tmp/emmdev/.pex/pip.pex/46820cb5af0dcf9295a4e7f30184cc0e9fa063dc/.deps/setuptools/constraints.txt
22:57:48 /var/tmp/emmdev/.pex/pip.pex/46820cb5af0dcf9295a4e7f30184cc0e9fa063dc/.deps/wheel/constraints.txt
22:57:48 pex_warnings.warn(message)
22:57:48 Installing pantsbuild.pants==2.9.0 into a virtual environment at /var/tmp/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37
22:57:50 created virtual environment CPython3.7.5.final.0-64 in 189ms
22:57:50 creator CPython3Posix(dest=/var/tmp/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/pants.BVOx5n/install, clear=False, no_vcs_ignore=False, global=False)
22:57:50 seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/var/tmp/emmdev/.local/share/virtualenv)
22:57:50 added seed packages: pip==21.1.2, setuptools==57.0.0, wheel==0.36.2
22:57:50 activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
22:57:50 ./pants: line 364: /var/tmp/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/pants.BVOx5n/install/bin/pip: No such file or directory
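The log above reports seed packages being added and then bin/pip missing a moment later. A small check like the following could be run right after the virtualenv step to record which expected executables are actually present. It is only a sketch: `missing_executables` is a hypothetical helper, not part of the pants script, and the default names are an assumption about what the bootstrap expects.

```python
import os
from pathlib import Path


def missing_executables(venv_dir, names=("python", "pip")):
    """Return the names under <venv_dir>/bin that are absent or not executable."""
    bin_dir = Path(venv_dir) / "bin"
    return [
        name
        for name in names
        # os.access returns False both for a missing file and for one
        # without execute permission for the current user.
        if not os.access(bin_dir / name, os.X_OK)
    ]
```

Calling it on the install directory from the log (the pants.BVOx5n/install path) immediately after creation, and again just before line 364 runs, would show whether pip was never written or disappeared in between.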
I don't know if the PEX warning is a useful clue or not? But _bootstrap_pants_ gets called again a few times after each failure and then the script finally gives up.
enough-analyst-54434
02/18/2022, 4:42 AM
__main__.py not being there bin/pip is not there. Both these files should be there / we've never seen this to my knowledge. That's why I pushed on the tmp cleaner angle. If a tmp cleaner CRON job happens by in the middle of a CI run and your HOME is in /tmp, you're in for this sort of trouble. It would be good to hard rule a tmp cleaner in or out here before proceeding further.
__main__.py was not found that was due to a PEX zip >2GB in size. Does that pertain to your PEXes? I don't suspect this is it though since you say things work locally just not in CI. The 2GB issue should be universal.
/var/tmp/emmdev/.pex/unzipped_pexes/478cc1fa371ca40aa3e7dafee735ca438d4a243f could never be created from the original too-big PEX file.
nice-florist-55958
02/24/2022, 1:02 AM
./pants version returned the expected result too.
BUT! When it came time for ./pants package :: bad things happened. I added some echo and ls statements in the pants script just before the pex execution line to debug. It seems Pants can in fact see the file it's complaining about before the exec line, but not after (maybe in some subprocess?). You can see below that ls -la shows emmdev created that file and that it has execute permissions. But the job ultimately fails for not seeing this file, as reported in the ProcessExecutionFailure error. I don't know what the significance of a fully resolved NFS file location (with the server and all) versus the alias is, if any, but the former is the one it attempts to look up.
19:40:07 /v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37
19:40:07 /v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37/bin/python
19:40:07 /v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37/bin/pants
19:40:07 /v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37/bin/python /v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37/bin/pants --pants-bin-name=./pants --pants-version=2.9.0 package ./proj/libs/clients/**
19:40:07 lrwxrwxrwx 1 emmdev ir_share 11 Feb 23 19:01 /a/stor118ncs2.new-york.ms.com/sc25317/s122015/taymarti/emmdev/.cache/pants/named_caches/pex_root/venvs/1d932cd3e82057d61e57c365b177aad9b535724c/9a128dacefb3843fa45de2c0dc225c7ee1cb4d0e/pex -> __main__.py
19:40:07 lrwxrwxrwx 1 emmdev ir_share 11 Feb 23 19:01 /v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/named_caches/pex_root/venvs/1d932cd3e82057d61e57c365b177aad9b535724c/9a128dacefb3843fa45de2c0dc225c7ee1cb4d0e/pex -> __main__.py
19:40:07 19:00:15.11 [INFO] Starting: Building build_backend.pex from setuptools_default_lockfile.txt
19:41:31 19:01:31.87 [INFO] Long running tasks:
19:41:31 76.76s Building build_backend.pex from setuptools_default_lockfile.txt
19:41:36 19:01:36.06 [INFO] Completed: Building build_backend.pex from setuptools_default_lockfile.txt
19:41:36 19:01:36.14 [ERROR] 1 Exception encountered:
19:41:36
19:41:36 ProcessExecutionFailure: Process 'Building build_backend.pex from setuptools_default_lockfile.txt' failed with exit code 1.
19:41:36 stdout:
19:41:36
19:41:36 stderr:
19:41:36 Failed to spawn a job for DistributionTarget(interpreter=PythonInterpreter('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', PythonIdentity('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', 'cp37', 'cp37m', 'manylinux_2_17_x86_64', (3, 7, 5)))): [Errno 2] No such file or directory: '/a/stor118ncs2.new-york.ms.com/sc25317/s122015/taymarti/emmdev/.cache/pants/named_caches/pex_root/venvs/1d932cd3e82057d61e57c365b177aad9b535724c/9a128dacefb3843fa45de2c0dc225c7ee1cb4d0e/pex': '/a/stor118ncs2.new-york.ms.com/sc25317/s122015/taymarti/emmdev/.cache/pants/named_caches/pex_root/venvs/1d932cd3e82057d61e57c365b177aad9b535724c/9a128dacefb3843fa45de2c0dc225c7ee1cb4d0e/pex'
19:41:36
19:41:36
19:41:36
19:41:36 Use `--no-process-cleanup` to preserve process chroots for inspection.
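One thing the ls -la lines above cannot show: for a symlink, ls -la describes the link itself (the lrwxrwxrwx mode and the 11-byte size belong to the link `pex -> __main__.py`), not whether the target exists. Executing a dangling symlink also fails with "No such file or directory" against the link's own path. The sketch below is one way to tell the two cases apart and to compare the alias path with its fully resolved form. `describe_path` is a hypothetical helper name, and whether a missing target is what happened here is unconfirmed.

```python
import os


def describe_path(path):
    """Report what the filesystem currently says about `path`."""
    is_link = os.path.islink(path)
    return {
        "path": path,
        # Fully resolved location, with symlinks and aliases followed.
        "realpath": os.path.realpath(path),
        "is_symlink": is_link,
        "link_target": os.readlink(path) if is_link else None,
        # True even when the link points at a file that is missing.
        "link_exists": os.path.lexists(path),
        # Follows the link, so False when the target is missing.
        "target_exists": os.path.exists(path),
    }
```

Logging this for both the /v/campus/... and /a/stor118ncs2... forms of the pex path just before the exec line, and again from the failing subprocess if possible, would show whether the link or its __main__.py target is what goes missing.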
enough-analyst-54434
02/24/2022, 1:26 AM
nice-florist-55958
02/24/2022, 1:41 AM
enough-analyst-54434
02/24/2022, 2:17 AM
nice-florist-55958
02/24/2022, 2:33 AM
enough-analyst-54434
02/24/2022, 2:47 AM
lockd must be running (apparently you can run NFS without one).
nice-florist-55958
02/24/2022, 4:23 AM
lockd daemon is always running based on experience with poorly written programs dying when someone has the CSV file open they want to update! 🙂
As for the protocol, nfsstat -c is showing 8 million getattr requests on NFS3 and merely 50k on NFS2 for one of the hosts I am using. Testing on newer hosts that support NFS4, virtually all the file system operations are still NFS3. And rpcinfo -p stor118ncs2.new-york.ms.com is saying this storage server only supports NFS3 and 4, so I think it's safe to say NFS3 is being used, but I will try to confirm that is really the case tomorrow, and also run netstat -c on the CI server as it might be using an older NFS protocol.
enough-analyst-54434
02/24/2022, 4:47 AM
nice-florist-55958
02/24/2022, 4:56 PM
09:58:23 09:58:22.22 [ERROR] 1 Exception encountered:
09:58:23
09:58:23 Exception: Snapshot failed: Error storing Digest { hash: Fingerprint<827b11f107d720685bb9b013fafc76f126779197f63f9a831300554e714949ec>, size_bytes: 77 }: MDB_CURSOR_FULL: Internal error - cursor stack limit reached
enough-analyst-54434
02/24/2022, 5:09 PM
> Let me see if an NFS4 configuration is possible. It seems that would be a resolution as no extra protocol is needed?
Perhaps, but not a solution if the HA guess is right. For that class of problems, no matter the protocol used by clients, the multi-homed server / Byzantine Generals problem becomes the issue.
> I had also realized I’ve been using NFS “locally” all this time w/ Pants and never had these issues. ... and the rest ...
Excellent! That's great information to have.
> Unfortunately, we now have a new error. On any subsequent run I’m getting:
Now that's a cool one. This too may or may not be related to NFS, but it's new to any of us as far as I know and worth an issue to track!
nice-florist-55958
02/24/2022, 5:38 PM
enough-analyst-54434
02/24/2022, 5:42 PM
MDB_CURSOR_FULL is repeatable and if it goes away after killing the pantsd daemon and re-running.
nice-florist-55958
02/24/2022, 6:05 PM
enough-analyst-54434
02/24/2022, 7:23 PM
nice-florist-55958
02/24/2022, 7:53 PM
14:37:15 14:37:15.04 [DEBUG] Completed: `package` goal
14:37:15 14:37:15.04 [DEBUG] computed 1 nodes in 33.880423 seconds. there are 3957 total nodes.
14:37:15 14:37:15.04 [ERROR] 1 Exception encountered:
14:37:15
14:37:15 Engine traceback:
14:37:15 in select
14:37:15 in pants.core.goals.package.package_asset
14:37:15 in pants.backend.python.goals.setup_py.package_python_dist (proj/libs/clients:clients)
14:37:15 in pants.backend.python.util_rules.dists.run_pep517_build
14:37:15 in pants.backend.python.util_rules.pex.create_venv_pex (build_backend.pex)
14:37:15 in pants.backend.python.util_rules.pex.build_pex (build_backend.pex)
14:37:15 in pants.engine.process.fallible_to_exec_result_or_raise
14:37:15 Traceback (most recent call last):
14:37:15 File "/v/campus/ny/cs/casper/taymarti/emmdev/.cache/pants/setup/bootstrap-Linux-x86_64/2.9.0_py37/lib/python3.7/site-packages/pants/engine/process.py", line 287, in fallible_to_exec_result_or_raise
14:37:15 process_cleanup=process_cleanup.val,
14:37:15 pants.engine.process.ProcessExecutionFailure: Process 'Building build_backend.pex from setuptools_default_lockfile.txt' failed with exit code 1.
14:37:15 stdout:
14:37:15
14:37:15 stderr:
14:37:15 pex: Resolving interpreters
14:37:15 pex: Resolving interpreters: 0.5ms
14:37:15 pex: Building pex
14:37:15 pex: Building pex :: Resolving distributions (setuptools_default_lockfile.txt)
14:37:15 pex: Building pex :: Resolving distributions (setuptools_default_lockfile.txt) :: Resolving requirements.
14:37:15 pex: Building pex :: Resolving distributions (setuptools_default_lockfile.txt) :: Resolving requirements. :: Resolving for:
14:37:15 DistributionTarget(interpreter=PythonInterpreter('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', PythonIdentity('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', 'cp37', 'cp37m', 'manylinux_2_17_x86_64', (3, 7, 5))))
14:37:15 pex: Spawning a maximum of 4 parallel jobs to process:
14:37:15 DistributionTarget(interpreter=PythonInterpreter('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', PythonIdentity('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', 'cp37', 'cp37m', 'manylinux_2_17_x86_64', (3, 7, 5))))
14:37:15 pex: Hashing pex
14:37:15 pex: Hashing pex: 14657.6ms
14:37:15 pex: Isolating pex
14:37:15 pex: Isolating pex: 21.4ms
14:37:15 Failed to spawn a job for DistributionTarget(interpreter=PythonInterpreter('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', PythonIdentity('/ms/dist/python/PROJ/core/3.7.5-0/.exec/@sys/bin/python3.7', 'cp37', 'cp37m', 'manylinux_2_17_x86_64', (3, 7, 5)))): [Errno 2] No such file or directory: '/a/stor118ncs2.new-york.ms.com/sc25317/s122015/taymarti/emmdev/.cache/pants/named_caches/pex_root/venvs/1d932cd3e82057d61e57c365b177aad9b535724c/9a128dacefb3843fa45de2c0dc225c7ee1cb4d0e/pex': '/a/stor118ncs2.new-york.ms.com/sc25317/s122015/taymarti/emmdev/.cache/pants/named_caches/pex_root/venvs/1d932cd3e82057d61e57c365b177aad9b535724c/9a128dacefb3843fa45de2c0dc225c7ee1cb4d0e/pex'
14:37:15
14:37:15
14:37:15
14:37:15 Use `--no-process-cleanup` to preserve process chroots for inspection.
enough-analyst-54434
02/24/2022, 8:10 PM