ambitious-petabyte-59095
12/21/2021, 3:10 PM
# src/python/projectA/BUILD
python_sources(
    dependencies=[
        "//:nltk",
    ],
)

python_distribution(
    name="wheel",
    dependencies=[":projectA"],
    provides=setup_py(
        name="projectA",
    ),
    wheel=True,
)
Can someone help point me in the right direction? Thanks so much.
hundreds-father-404
12/21/2021, 3:15 PM
ambitious-petabyte-59095
12/21/2021, 3:31 PM
import nltk
nltk.download("stopwords")
but I've noticed that with this approach the stopwords are sometimes not found, maybe due to an incomplete download or some other issue.
Example of the error:
raise LookupError(resource_not_found)
E LookupError:
E **********************************************************************
E Resource stopwords not found.
E Please use the NLTK Downloader to obtain the resource:
E
E >>> import nltk
E >>> nltk.download('stopwords')
E
E For more information see: <https://www.nltk.org/data.html>
E
E Attempted to load corpora/stopwords
E
E Searched in:
E - '/root/nltk_data'
E - '/root/.cache/pants/named_caches/pex_root/venvs/s/7eec7f58/venv/nltk_data'
E - '/root/.cache/pants/named_caches/pex_root/venvs/s/7eec7f58/venv/share/nltk_data'
E - '/root/.cache/pants/named_caches/pex_root/venvs/s/7eec7f58/venv/lib/nltk_data'
E - '/usr/share/nltk_data'
E - '/usr/local/share/nltk_data'
E - '/usr/lib/nltk_data'
E - '/usr/local/lib/nltk_data'
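One way to make the failure above less intermittent is to check for the resource before using it and to retry the download a bounded number of times. The sketch below is only an illustration: `ensure_resource` and `ensure_stopwords` are hypothetical helper names, and the lookup and download functions are passed in so the retry logic can be exercised without network access. It assumes, as the traceback shows, that a missing resource surfaces as a `LookupError`.

```python
def ensure_resource(resource_path, package_id, find, download, attempts=3):
    """Return True once find(resource_path) succeeds, downloading if needed.

    find     -- callable that raises LookupError when the resource is missing
    download -- callable that tries to fetch package_id
    attempts -- maximum number of download attempts (hypothetical default)
    """
    for _ in range(attempts):
        try:
            find(resource_path)
            return True
        except LookupError:
            # Resource missing (or only partially present): try to fetch it.
            download(package_id)
    # Check once more after the final download attempt.
    try:
        find(resource_path)
        return True
    except LookupError:
        return False


def ensure_stopwords():
    """Wire the helper to NLTK; imported lazily so the helper stays testable."""
    import nltk

    return ensure_resource(
        "corpora/stopwords", "stopwords", nltk.data.find, nltk.download
    )
```

Calling `ensure_stopwords()` once at startup, and failing loudly when it returns `False`, at least makes the problem show up in one place. It does not remove the network dependency at runtime, which is what the rest of this thread is about.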
I updated my original post with the type of solution I was wondering about, and whether it would be possible through Pants (or if you have any other suggestions; I could very well be overcomplicating this, haha).
happy-kitchen-89482
12/21/2021, 3:37 PM
import nltk
nltk.download("stopwords")
in your code?
enough-analyst-54434
12/21/2021, 4:09 PM
`pip install`s it later, they don't need to download that data themselves, nor does the code in the Python distribution need to download it for them just-in-time before using it?
`files` or `resources` target to depend on it?
ambitious-petabyte-59095
12/21/2021, 4:48 PM
enough-analyst-54434
12/21/2021, 4:58 PM
`python_distribution` target uses an auto-generated one (by Pants). You'd need to use `generate_setup=False` (https://www.pantsbuild.org/docs/reference-python_distribution#codegenerate_setupcode) to start with and write your own `setup.py`. If you go that route, there will be a fair bit of tinkering needed. I'd suggest eliminating Pants as a complicating factor by first writing a minimal `setup.py` that does little more than create a distribution that contains the data files, plus one console script entry point that proves the data included in the distribution is loadable at runtime. Only after getting that worked out would I return to integrating it with Pants.
happy-kitchen-89482
12/21/2021, 9:00 PM
`pyproject.toml` and `setup.py`
`pyproject.toml` would have
[build-system]
requires = ["setuptools==X", "nltk==Y"]  # Fill in versions
`nltk` requirement in at setup.py runtime
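As a rough illustration of the `pyproject.toml` plus `setup.py` idea above, here is a sketch of helpers a hand-written `setup.py` might contain. Everything named here is an assumption rather than something from the thread: the `projectA` package directory sitting next to `setup.py`, the `nltk_data` subdirectory, the placeholder version, and the helper names `list_package_data` and `build_distribution`. It relies on `nltk` being importable at build time because it is listed under `[build-system] requires`.

```python
# setup.py helpers (sketch only; all names and the layout are hypothetical)
import os

PACKAGE = "projectA"       # package directory next to setup.py (assumed layout)
DATA_SUBDIR = "nltk_data"  # where the corpus is stored inside the package (assumed)


def list_package_data(package_dir, data_subdir=DATA_SUBDIR):
    """Return every file under package_dir/data_subdir, relative to package_dir."""
    root = os.path.join(package_dir, data_subdir)
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, package_dir))
    return sorted(found)


def build_distribution():
    """Download the corpus into the package, then hand off to setuptools."""
    # Imported lazily: both packages are expected to be present in the build
    # environment created from [build-system] requires.
    import nltk
    from setuptools import setup

    target = os.path.join(PACKAGE, DATA_SUBDIR)
    if not nltk.download("stopwords", download_dir=target):
        raise RuntimeError("could not download the stopwords corpus")

    setup(
        name=PACKAGE,
        version="0.0.1",  # placeholder version
        packages=[PACKAGE],
        package_data={PACKAGE: list_package_data(PACKAGE)},
    )


# A real setup.py would end with a module-level call: build_distribution()
```

With the data shipped inside the wheel, the installed code would still need to tell NLTK where to look, for example by appending the packaged `nltk_data` directory to `nltk.data.path` before loading the stopwords. That is the part worth proving with a small console script before bringing Pants back into the picture.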