helpful-lunch-92084
06/10/2021, 7:43 PM
.cache/pex_root/installed_wheels/
However, with venv stuff, it appears that it pulls that cached wheel to a venv dir, but the resulting symlinks are converted to empty directories in the spacy/data dir, and then the subsequent spacy.cli.link fails because it can’t overwrite a directory.
1. Is there a way to prevent a non-venv execution from writing those symlinks to the installed_wheels dir? E.g. the spacy package is pulled into a temp location from installed_wheels?
2. Is there a way to selectively disable venv test execution for specific targets? (It’s a less ideal situation, but I was thinking about excluding those specific targets from venv execution mode to work around this immediate problem.)
3. Any other ideas?
enough-analyst-54434
06/10/2021, 8:00 PM
helpful-lunch-92084
06/10/2021, 8:02 PM
enough-analyst-54434
06/10/2021, 8:34 PM
$ python -mpex "spacy[transformers,lookups]" pip setuptools wheel -o spacy.pex --venv -cspacy
Then:
$ ./spacy.pex download en_core_web_sm
...
Installing collected packages: en-core-web-sm
Successfully installed en-core-web-sm-3.0.0
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
Then:
$ PEX_INTERPRETER=1 ./spacy.pex
Python 3.9.5 (default, May 24 2021, 12:50:35)
[GCC 11.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import spacy
>>> spacy.load('en_core_web_sm')
<spacy.lang.en.English object at 0x7f80a971ad00>
>>>
Clearly I'm not exercising the spacy code that creates the symlinks.
enough-analyst-54434
06/10/2021, 8:38 PM
enough-analyst-54434
06/10/2021, 8:43 PM
helpful-lunch-92084
06/10/2021, 8:51 PM
helpful-lunch-92084
06/10/2021, 8:52 PM
enough-analyst-54434
06/10/2021, 9:01 PM
helpful-lunch-92084
06/10/2021, 9:09 PM
helpful-lunch-92084
06/10/2021, 9:10 PM
$ ls -l /Users/nate/.cache/pex_root/installed_wheels/8931db6716ebfa82fc6520460521223fd2366c25/spacy-2.1.8-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl/spacy/data/
total 0
-rw-r--r-- 1 nate staff 0 Jun 10 16:54 __init__.py
lrwxr-xr-x 1 nate staff 154 Jun 10 17:04 de -> /Users/nate/.cache/pex_root/installed_wheels/7268368e25395cf6c42013c335a73814ed98aa6d/de_core_news_sm-2.1.0-py3-none-any.whl/de_core_news_sm
lrwxr-xr-x 1 nate staff 168 Jun 10 17:04 en -> /Users/nate/.cache/pex_root/installed_wheels/83896cf122478301e33e164bba3da275221928b9/en_core_web_sm_textcat-1.0.0-py3-none-any.whl/en_core_web_sm_textcat
lrwxr-xr-x 1 nate staff 154 Jun 10 17:04 es -> /Users/nate/.cache/pex_root/installed_wheels/334e0324356daf2d8a2723972d443dba75438fa5/es_core_news_sm-2.1.0-py3-none-any.whl/es_core_news_sm
lrwxr-xr-x 1 nate staff 154 Jun 10 17:04 fr -> /Users/nate/.cache/pex_root/installed_wheels/6be69bb4dd9dc9f15f408a93eeec71305c30992e/fr_core_news_sm-2.1.0-py3-none-any.whl/fr_core_news_sm
lrwxr-xr-x 1 nate staff 154 Jun 10 17:04 it -> /Users/nate/.cache/pex_root/installed_wheels/22a7ffc29f77c01df0e180b684656ff053cc348a/it_core_news_sm-2.1.0-py3-none-any.whl/it_core_news_sm
lrwxr-xr-x 1 nate staff 154 Jun 10 17:04 nl -> /Users/nate/.cache/pex_root/installed_wheels/ae15833bb235f37d29cfa58b4def63a25aa41e38/nl_core_news_sm-2.1.0-py3-none-any.whl/nl_core_news_sm
lrwxr-xr-x 1 nate staff 154 Jun 10 17:04 pt -> /Users/nate/.cache/pex_root/installed_wheels/87b055365ff31407a833da909478cfe8c3d584cd/pt_core_news_sm-2.1.0-py3-none-any.whl/pt_core_news_sm
lrwxr-xr-x 1 nate staff 152 Jun 10 17:04 xx -> /Users/nate/.cache/pex_root/installed_wheels/0c6b6d94cf903727069b8dfcbc23f76d1a89a772/xx_ent_wiki_sm-2.1.0-py3-none-any.whl/xx_ent_wiki_sm
helpful-lunch-92084
06/10/2021, 9:11 PM
$ ls -l /Users/nate/.cache/pex_root/venvs/d8e45253502f755be2885b7881abc87f86046363/93d314ee448ec5f1466875cd37c0ab55bda4655a/lib/python3.6/site-packages/spacy/data/
total 0
-rw-r--r-- 2 nate staff 0 Jun 10 16:54 __init__.py
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 de
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 en
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 es
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 fr
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 it
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 nl
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 pt
drwxr-xr-x 2 nate staff 64 Jun 10 17:06 xx
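The two listings above differ only in entry type: symlinks under installed_wheels, empty directories under the venv. A minimal Python sketch for checking a spacy/data directory for this condition follows. The helper name classify_entries and the temporary layout built in demo are hypothetical and only imitate the two listings; nothing here is Pex or spaCy code.

```python
import os
import tempfile
from pathlib import Path


def classify_entries(data_dir):
    """Map each entry name in data_dir to 'symlink', 'empty-dir', 'dir' or 'file'."""
    kinds = {}
    for entry in sorted(Path(data_dir).iterdir()):
        # Check is_symlink() first: is_dir() follows symlinks and would hide them.
        if entry.is_symlink():
            kinds[entry.name] = "symlink"
        elif entry.is_dir():
            kinds[entry.name] = "dir" if any(entry.iterdir()) else "empty-dir"
        else:
            kinds[entry.name] = "file"
    return kinds


def demo():
    """Build a throwaway imitation of the two listings and classify both."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Stand-in for an installed model package (hypothetical contents).
        model = root / "it_core_news_sm"
        model.mkdir()
        (model / "meta.json").write_text("{}")

        # Imitates the installed_wheels listing: 'it' is a symlink to the model.
        wheel_data = root / "installed_wheels_data"
        wheel_data.mkdir()
        (wheel_data / "__init__.py").touch()
        os.symlink(model, wheel_data / "it", target_is_directory=True)

        # Imitates the venv listing: 'it' is an empty directory.
        venv_data = root / "venv_data"
        venv_data.mkdir()
        (venv_data / "__init__.py").touch()
        (venv_data / "it").mkdir()

        return classify_entries(wheel_data), classify_entries(venv_data)


if __name__ == "__main__":
    wheel_kinds, venv_kinds = demo()
    print("installed_wheels:", wheel_kinds)
    print("venv:", venv_kinds)
```

Pointing classify_entries at a real spacy/data directory should make any 'empty-dir' entries stand out as the ones a later link call cannot replace.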
enough-analyst-54434
06/10/2021, 9:27 PM
... after running spacy link:
Ok, the spacy 3.0.6 CLI has no spacy link. I'm guessing spacy 2.1.8 does? About to find out.
helpful-lunch-92084
06/10/2021, 9:29 PM
enough-analyst-54434
06/10/2021, 9:29 PM
helpful-lunch-92084
06/10/2021, 9:30 PM
helpful-lunch-92084
06/10/2021, 9:31 PM
spacy.cli.link("es_core_news_sm", "es", force=True)
enough-analyst-54434
06/10/2021, 9:31 PM
helpful-lunch-92084
06/10/2021, 9:32 PM
es_core_news_sm is a package on our sys.path; not sure if spacy download does the same thing (we happened to bundle that into a wheel a while back when pex was zipapp only)
enough-analyst-54434
06/10/2021, 9:33 PM
download just uses pip to download a dist and place it in the venv site-packages.
helpful-lunch-92084
06/10/2021, 9:33 PM
enough-analyst-54434
06/10/2021, 9:40 PM
enough-analyst-54434
06/10/2021, 9:57 PM
$ curl -sSL https://github.com/explosion/spacy-models/releases/download/it_core_news_sm-2.1.0/it_core_news_sm-2.1.0.tar.gz -O
$ pip wheel --no-deps it_core_news_sm-2.1.0.tar.gz
Processing ./it_core_news_sm-2.1.0.tar.gz
File was already downloaded /home/jsirois/Downloads/it_core_news_sm-2.1.0.tar.gz
Building wheels for collected packages: it-core-news-sm
Building wheel for it-core-news-sm (setup.py) ... done
Created wheel for it-core-news-sm: filename=it_core_news_sm-2.1.0-py3-none-any.whl size=11123295 sha256=4b6007baa80a7ba020c1a09a169192e05bc97383d0922e080d13140bf9f4029f
Stored in directory: /home/jsirois/.cache/pip/wheels/1b/b5/49/5970302bebb331f699e409a6eb0c3fb8aa8ceebbc30be5cb4e
Successfully built it-core-news-sm
2. Build the --venv PEX:
jsirois@gill ~/dev/pantsbuild/pex ((v2.1.35)) $ python3.6 -mpex "spacy==2.1.8" setuptools ~/Downloads/it_core_news_sm-2.1.0-py3-none-any.whl -o spacy-2.1.8.pex-2.1.35.venv --venv
3. Run it:
$ PEX_ROOT=/tmp/spacy-2.1.8.pex-2.1.35.venv ./spacy-2.1.8.pex-2.1.35.venv
Python 3.6.13 (default, Feb 16 2021, 20:57:41)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import spacy.cli.link
>>> spacy.cli.link("it_core_news_sm", "it", force=True)
✔ Linking successful
/tmp/spacy-2.1.8.pex-2.1.35.venv/venvs/short/86d8d0fa/lib/python3.6/site-packages/it_core_news_sm
-->
/tmp/spacy-2.1.8.pex-2.1.35.venv/venvs/short/86d8d0fa/lib/python3.6/site-packages/spacy/data/it
You can now load the model via spacy.load('it')
>>> spacy.load('it')
<spacy.lang.it.Italian object at 0x7f0bb92beac8>
>>>
now exiting InteractiveConsole...
4. Check it:
jsirois@gill ~/dev/pantsbuild/pex ((v2.1.35)) $ ls -l /tmp/spacy-2.1.8.pex-2.1.35.venv/venvs/short/86d8d0fa/lib/python3.6/site-packages/spacy/data/
total 0
-rw-r--r-- 2 jsirois jsirois 0 Jun 10 14:51 __init__.py
lrwxrwxrwx 1 jsirois jsirois 97 Jun 10 14:51 it -> /tmp/spacy-2.1.8.pex-2.1.35.venv/venvs/short/86d8d0fa/lib/python3.6/site-packages/it_core_news_sm
enough-analyst-54434
06/10/2021, 9:58 PM
helpful-lunch-92084
06/10/2021, 10:00 PM
enough-analyst-54434
06/10/2021, 10:02 PM
helpful-lunch-92084
06/10/2021, 10:02 PM
enough-analyst-54434
06/10/2021, 10:06 PM
$ PEX_ROOT=/tmp/spacy-2.1.8.pex-2.1.35 ./spacy-2.1.8.pex-2.1.35
Python 3.6.13 (default, Feb 16 2021, 20:57:41)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import spacy.cli.link
>>> spacy.cli.link("it_core_news_sm", "it", force=True)
✔ Linking successful
/tmp/spacy-2.1.8.pex-2.1.35/installed_wheels/106e0fcb67b8a740d3fa416bbc6cdc09b36e5c09/it_core_news_sm-2.1.0-py3-none-any.whl/it_core_news_sm
-->
/tmp/spacy-2.1.8.pex-2.1.35/installed_wheels/2aeeab03e5348116d101c8d9a67a30e2671dfe59/spacy-2.1.8-cp36-cp36m-manylinux1_x86_64.whl/spacy/data/it
You can now load the model via spacy.load('it')
>>> spacy.load('it')
<spacy.lang.it.Italian object at 0x7f86efa65518>
>>>
now exiting InteractiveConsole...
Then VENV:
$ PEX_ROOT=/tmp/spacy-2.1.8.pex-2.1.35 ./spacy-2.1.8.pex-2.1.35.venv
Python 3.6.13 (default, Feb 16 2021, 20:57:41)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import spacy.cli.link
>>> spacy.cli.link("it_core_news_sm", "it", force=True)
✘ Can't overwrite symlink 'it'
This can happen if your data directory contains a directory or file of the same
name.
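The failure above is the case where the link location already exists as a real (empty) directory rather than a symlink. As a rough sketch of one possible workaround, assuming it is acceptable to modify the directory in question, the hypothetical helper below replaces an existing symlink or an empty directory with a fresh symlink. It is not spaCy's link implementation, and the name force_link is made up for illustration.

```python
import os
from pathlib import Path


def force_link(target, link_path):
    """Create a symlink at link_path pointing to target.

    Replaces an existing symlink or an EMPTY directory at link_path.
    Hypothetical helper for illustration; not part of spaCy or Pex.
    """
    link_path = Path(link_path)
    if link_path.is_symlink():
        # Covers both valid and dangling symlinks.
        link_path.unlink()
    elif link_path.is_dir():
        # rmdir() raises OSError on a non-empty directory, so real data is never deleted.
        link_path.rmdir()
    elif link_path.exists():
        raise FileExistsError(f"{link_path} exists and is not a symlink or directory")
    os.symlink(os.fspath(target), link_path, target_is_directory=True)
    return link_path
```

Because only rmdir is used, a directory that actually holds model data makes the call fail loudly instead of being removed.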
helpful-lunch-92084
06/10/2021, 10:07 PM
enough-analyst-54434
06/10/2021, 10:11 PM
enough-analyst-54434
06/10/2021, 10:14 PM
force=True seem to imply that spacy should not complain like it does. I wonder if I'm reading that wrong or it's a bug fixed in later versions of spacy?
helpful-lunch-92084
06/10/2021, 10:14 PM
helpful-lunch-92084
06/10/2021, 10:15 PM
enough-analyst-54434
06/10/2021, 10:15 PM
enough-analyst-54434
06/10/2021, 10:15 PM
helpful-lunch-92084
06/10/2021, 10:16 PM
helpful-lunch-92084
06/10/2021, 10:17 PM
>>> spacy.cli.link("en_core_web_sm", "en")
⚠ As of spaCy v3.0, model symlinks are not supported anymore. You can
load trained pipeline packages using their full names or from a directory
path.
enough-analyst-54434
06/10/2021, 10:19 PM
helpful-lunch-92084
06/10/2021, 10:19 PM
enough-analyst-54434
06/10/2021, 10:20 PM
helpful-lunch-92084
06/10/2021, 10:20 PM
enough-analyst-54434
06/10/2021, 10:24 PM
pex_binary.
enough-analyst-54434
06/10/2021, 10:25 PM
python_tests too.
helpful-lunch-92084
06/10/2021, 10:35 PM
enough-analyst-54434
06/10/2021, 10:45 PM
pex now you get pex-tools in addition to pex. You can also build a PEX with tools via --include-tools. Then run the resulting PEX file with PEX_TOOLS=1 ./my.pex
One of those tools is venv, which creates a venv from your PEX file at the directory you specify. In that case, you really never want symlinks, since someone could then nuke the Pex cache and kill your venv out over here elsewhere. It's only for --venv execution mode, which runs the pex venv tool implicitly, placing the venv inside the Pex cache (~/pex/venvs/...), that it would actually be OK to preserve symlinks.
enough-analyst-54434
06/10/2021, 10:46 PM
enough-analyst-54434
06/10/2021, 10:48 PM
enough-analyst-54434
06/10/2021, 10:49 PM
--symlink mode for venv creation could solve this case ... but only if we made it the default / Pants always used it or we did the same ugly plumbing mentioned above out to pex_binary and python_tests targets,
helpful-lunch-92084
06/10/2021, 10:59 PM