curved-wall-59116
11/25/2021, 2:21 PM
pants 2.7.2
python 3.8.10
poetry
1. Pathing on a new instance. mlm_training.py can't find dsutils under libs (see folder and our setup script below)
2. Dependency resolving with poetry
- We have two different projects that pin different versions of scikit-learn (0.23.1 and 1.0.1).
- Each project folder has a pyproject.toml specifying its version.
- Pants either (1) ignores one of the versions, or (2) if scikit-learn is specified as a dependency in BUILD for both projects, returns the message "could not resolve constraints".
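The "could not resolve constraints" outcome is what one would expect if both pins end up in one shared set of constraints (an assumption about this repo, based on the single constraints.txt at the root). A minimal sketch in plain Python of why two exact pins for the same package cannot be satisfied together. The helper name, the project names, and the pandas pin are hypothetical and only there for illustration:

```python
from collections import defaultdict


def find_conflicts(pins):
    """Return packages that are pinned to more than one exact version.

    `pins` is a list of (project, package, version) tuples. The values
    used below are illustrative and not read from a real pyproject.toml.
    """
    versions = defaultdict(set)
    for _project, package, version in pins:
        versions[package.lower()].add(version)
    # A package with more than one distinct exact pin cannot be resolved
    # into a single environment.
    return {pkg: sorted(vs) for pkg, vs in versions.items() if len(vs) > 1}


pins = [
    ("project_a", "scikit-learn", "0.23.1"),
    ("project_b", "scikit-learn", "1.0.1"),
    ("project_a", "pandas", "1.3.4"),  # hypothetical pin, same in both projects
    ("project_b", "pandas", "1.3.4"),
]
conflicts = find_conflicts(pins)
print(conflicts)
```

Only scikit-learn is reported, since it is the only package with two different exact pins.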
data-science
├── libs
│   └── dsutils
│       ├── BUILD
│       ├── clean.py
│       ├── config.py
│       ├── data.py
│       ├── __init__.py
│       ├── path.py
│       ├── pyproject.toml
│       ├── README.md
│       ├── tests
│       │   ├── test_clean.py
│       │   ├── test_path.py
│       │   └── test_tokenizer.py
│       ├── tokenizer.py
│       └── utils.py
├── projects
│   └── language_model
│       ├── BUILD
│       ├── config.py
│       ├── data
│       ├── data.dvc
│       ├── __init__.py
│       ├── mlm_training.py
│       ├── models
│       ├── models.dvc
│       ├── params.yaml
│       ├── pyproject.toml
│       ├── README.md
│       └── utils.py
├── constraints.txt
├── mypy.ini
├── pants
├── pants.ci.toml
├── pants.toml
├── pyproject.toml
└── README.md
Relevant contents of pants.toml
[source]
root_patterns = [
'/libs',
'/projects',
]
Relevant contents of the setup.sh script (some steps come before this):
echo "Installing pip"
"${PIP}" install pip --upgrade
echo "Installing poetry"
"${PIP}" install poetry
# Install all our requirements.txt, and also any 3rdparty
# dependencies specified outside requirements.txt, e.g. via a
# handwritten python_requirement_library target.
echo "Poetry update"
poetry update
echo "Installing dependencies"
"${PIP}" install -r <(poetry export --dev --without-hashes) -r <(./pants dependencies --type=3rdparty ::)
echo "Remove lockfile"
rm poetry.lock
echo "Generating constraints.txt"
rm constraints.txt
pip uninstall pkg-resources==0.0.0 -y # Have tried commenting this out.
pip list --format freeze >> constraints.txt
happy-kitchen-89482
11/25/2021, 2:24 PM
curved-wall-59116
11/25/2021, 2:43 PM
happy-kitchen-89482
11/25/2021, 2:59 PM
mlm_training.py importing from dsutils? And what Pants command are you running, and what error message are you seeing?
happy-kitchen-89482
11/25/2021, 3:00 PM
happy-kitchen-89482
11/25/2021, 3:00 PM
curved-wall-59116
11/25/2021, 3:07 PM
ubuntu-20.04-cuda base image.
However, I suspect that this would also happen if a new developer joined our team and tried the steps shown below.
These are the steps to reproduce the error:
$ git clone <our repository>
$ cd <our root repository>
$ bash scripts/setup.sh
$ source .venv/bin/activate
$ cd projects/language_model
$ python mlm_training.py
with the error message
ModuleNotFoundError: No module named 'dsutils'
mlm_training.py imports like this:
from dsutils.data import read_json
json_data = read_json(file_name)
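For context on the error: when a file is run as python mlm_training.py, Python puts that script's own directory at the front of sys.path, so libs is only searched if something adds it. A small diagnostic sketch (the helper name is_importable is hypothetical, and the layout is rebuilt in a temporary directory rather than read from the real repo) that checks whether a top-level module can be found under a given list of directories:

```python
import importlib.machinery
import os
import tempfile


def is_importable(module_name, search_dirs):
    """Return True if a top-level module or package is found in search_dirs."""
    spec = importlib.machinery.PathFinder.find_spec(module_name, list(search_dirs))
    return spec is not None


# Throwaway layout mirroring libs/dsutils and projects/language_model.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "libs", "dsutils"))
os.makedirs(os.path.join(root, "projects", "language_model"))
open(os.path.join(root, "libs", "dsutils", "__init__.py"), "w").close()

script_dir = os.path.join(root, "projects", "language_model")
libs_dir = os.path.join(root, "libs")

# Only the script's directory is searched: dsutils is not found.
print(is_importable("dsutils", [script_dir]))  # False
# With libs added to the search path, dsutils is found.
print(is_importable("dsutils", [script_dir, libs_dir]))  # True
```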
language_model/BUILD
poetry_requirements()
pyproject(name="pyproject")
python_library()
dsutils/BUILD
poetry_requirements()
pyproject(name="pyproject")
python_library(
    sources=["**/*.py", "!tests/test_*.py"],
    dependencies=[
        'libs/dsutils:torch',
        'libs/dsutils:spacy',
        'libs/dsutils:pandas',
    ]
)
python_tests(
    name="tests",
    sources=["tests/test_*.py"],
    timeout=120,
)
scripts/setup.sh
#!/usr/bin/env bash
set -euo pipefail
# You can change these constants.
PYTHON_BIN=python3.8
VIRTUALENV=.venv
PIP="${VIRTUALENV}/bin/pip"
CONSTRAINTS_FILE=constraints.txt
"${PYTHON_BIN}" -m venv "${VIRTUALENV}"
source .venv/bin/activate
echo "Installing pip"
"${PIP}" install pip --upgrade
echo "Installing poetry"
"${PIP}" install poetry
# Install all our requirements.txt, and also any 3rdparty
# dependencies specified outside requirements.txt, e.g. via a
# handwritten python_requirement_library target.
echo "Poetry update"
poetry update
echo "Installing dependencies"
"${PIP}" install -r <(poetry export --dev --without-hashes) -r <(./pants dependencies --type=3rdparty ::)
echo "Remove lockfile"
rm poetry.lock
echo "Generating constraints.txt"
rm constraints.txt
pip uninstall pkg-resources==0.0.0 -y
pip list --format freeze >> constraints.txt
happy-kitchen-89482
11/25/2021, 8:03 PM
happy-kitchen-89482
11/25/2021, 8:04 PM
python mlm_training.py directly?
happy-kitchen-89482
11/25/2021, 8:04 PM
happy-kitchen-89482
11/25/2021, 8:05 PM
PYTHONPATH=data-science/libs:data-science:projects python projects/language_model/mlm_training.py
happy-kitchen-89482
11/25/2021, 8:05 PM
./pants run projects/language_model/mlm_training.py
happy-kitchen-89482
11/25/2021, 8:06 PM
pex_binary target in projects/language_model/BUILD with mlm_training.py as an entry point
happy-kitchen-89482
11/25/2021, 8:07 PM
happy-kitchen-89482
11/25/2021, 8:07 PM
./pants tailor should generate these for you, assuming that mlm_training.py has an if __name__ == "__main__": block, so tailor can identify it as an entry point
curved-wall-59116
11/26/2021, 9:50 AM
mlm_training.py has the if __name__ == "__main__": block
I also expected Pants to take care of this. Yes, we are running the file directly, but I also tried
./pants tailor
and I could see that it created a pex_binary for mlm_training.py. Then I ran
./pants run projects/language_model/mlm_training.py
But this resulted in Pants not being able to see config.py, which is a file in the language_model project:
ModuleNotFoundError: No module named 'config'
mlm_training.py
from config import DATA_DIR, LM_DIR, MODELS_DIR
config.py
from dsutils.path import get_file_path
I think you are right that the PYTHONPATH is the issue. Should we add it manually in our setup script, or how would you say we should proceed?
curved-wall-59116
11/26/2021, 10:46 AM
DS_PATH=$(pwd)
export PYTHONPATH="${DS_PATH}/libs":"${DS_PATH}/projects"
This was a quick fix for our issue. However, I would like to know if there is a way to do it with Pants 🙂
To anyone else having trouble with this: remember to use source script.sh instead of bash script.sh, since the export only affects the current shell when the script is sourced.
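The two exported directories mirror the root_patterns listed in pants.toml. A minimal sketch of deriving the same PYTHONPATH value from those patterns, so the two do not drift apart. The helper name and the example repo path are hypothetical, and only literal patterns such as '/libs' are handled (no globbing):

```python
import os


def pythonpath_from_roots(repo_root, root_patterns):
    """Join repo_root with each literal source-root pattern.

    Returns a string suitable for PYTHONPATH, using the platform's
    path separator (':' on Linux and macOS).
    """
    paths = [os.path.join(repo_root, pattern.lstrip("/")) for pattern in root_patterns]
    return os.pathsep.join(paths)


# Hypothetical checkout location; the patterns are the ones from pants.toml.
value = pythonpath_from_roots("/home/user/data-science", ["/libs", "/projects"])
print(value)
```

The printed value could then be exported from a sourced shell script in the same way as the quick fix above.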