Hello Pants community, first time poster here! My ...
# general
c
Hello Pants community, first-time poster here! My team and I currently have 2 problems after switching to Pants as our monorepo-everything tool. Hope you can help us shed some light on our issues!
pants 2.7.2
python 3.8.10
poetry
1. Pathing on a new instance. mlm_training.py can't find dsutils under libs (see folder and our setup script below)
2. Dependency resolving with poetry
   - We have an issue where we have 2 different projects that use scikit-learn [0.23.1, 1.0.1].
   - In each project folder we have pyproject.toml specifying this.
   - Pants either (a) ignores one of the versions or (b) if specified as a dependency in BUILD (for both projects) returns the message "could not resolve constraints"

data-science
├── libs
│   └── dsutils
│       ├── BUILD
│       ├── clean.py
│       ├── config.py
│       ├── data.py
│       ├── __init__.py
│       ├── path.py
│       ├── pyproject.toml
│       ├── README.md
│       ├── tests
│       │   ├── test_clean.py
│       │   ├── test_path.py
│       │   └── test_tokenizer.py
│       ├── tokenizer.py
│       └── utils.py
├── projects
│   └── language_model
│       ├── BUILD
│       ├── config.py
│       ├── data
│       ├── data.dvc
│       ├── __init__.py
│       ├── mlm_training.py
│       ├── models
│       ├── models.dvc
│       ├── params.yaml
│       ├── pyproject.toml
│       ├── README.md
│       └── utils.py
├── constraints.txt
├── mypy.ini
├── pants
├── pants.ci.toml
├── pants.toml
├── pyproject.toml
└── README.md

Relevant contents of
pants.toml
[source]
root_patterns = [
  '/libs',
  '/projects',
]
Contents of relevant
setup.sh
script (some steps before this):
echo "Installing pip"
"${PIP}" install pip --upgrade
echo "Installing poetry"
"${PIP}" install poetry
# Install all our requirements.txt, and also any 3rdparty
# dependencies specified outside requirements.txt, e.g. via a
# handwritten python_requirement_library target.
echo "Poetry update"
poetry update
echo "Installing dependencies"
"${PIP}" install -r <(poetry export --dev --without-hashes) -r <(./pants dependencies --type=3rdparty ::)
echo "Remove lockfile"
rm poetry.lock
echo "Generating constraints.txt"
rm constraints.txt
pip uninstall pkg-resources==0.0.0 -y # Have tried commenting this out.
pip list --format freeze  >> constraints.txt
h
👋 Apologies that it will take longer than usual to reply due to the Thanksgiving holiday in the US, but we will get to this as soon as we can!
c
No worries, thank you for the fast reply 😁
h
Re #1: How is
mlm_training.py
importing from
dsutils
? And what Pants command are you running, and what error message are you seeing?
And re #2: Can you post the contents of your BUILD files?
And also, Pants command + error message?
c
In reply to #1: the issue arises when I run in my remote Kubernetes pod with an
ubuntu-20.04-cuda
base image. However, I suspect that this would also happen if a new developer joined our team and tried the same steps. Below are the steps to reproduce the error.
$ git clone <our repository>
$ cd <our root repository>
$ bash scripts/setup.sh
$ source .venv/bin/activate
$ cd projects/language_model
$ python mlm_training.py
with the error message
ModuleNotFoundError: No module named 'dsutils'
mlm_training.py
imports like this
from dsutils.data import read_json
json_data = read_json(file_name)
language_model/BUILD
poetry_requirements()

pyproject(name="pyproject")

python_library()
dsutils/BUILD
poetry_requirements()

pyproject(name="pyproject")

python_library(
    sources=["**/*.py", "!tests/test_*.py"],
    dependencies=[
        'libs/dsutils:torch',
        'libs/dsutils:spacy',
        'libs/dsutils:pandas',
    ]
)

python_tests(
    name="tests",
    sources=["tests/test_*.py"],
    timeout=120,
)
scripts/setup.sh
#!/usr/bin/env bash

set -euo pipefail

# You can change these constants.
PYTHON_BIN=python3.8
VIRTUALENV=.venv
PIP="${VIRTUALENV}/bin/pip"
CONSTRAINTS_FILE=constraints.txt

"${PYTHON_BIN}" -m venv "${VIRTUALENV}"

source .venv/bin/activate

echo "Installing pip"
"${PIP}" install pip --upgrade
echo "Installing poetry"
"${PIP}" install poetry
# Install all our requirements.txt, and also any 3rdparty
# dependencies specified outside requirements.txt, e.g. via a
# handwritten python_requirement_library target.
echo "Poetry update"
poetry update
echo "Installing dependencies"
"${PIP}" install -r <(poetry export --dev --without-hashes) -r <(./pants dependencies --type=3rdparty ::)
echo "Remove lockfile"
rm poetry.lock
echo "Generating constraints.txt"
rm constraints.txt
pip uninstall pkg-resources==0.0.0 -y
pip list --format freeze  >> constraints.txt
h
So re #1, I don't see Pants being invoked anywhere?
You're running
python mlm_training.py
directly?
If you do that, you need to make sure your source roots are on the PYTHONPATH.
E.g., from the repo root:
PYTHONPATH=libs:projects python projects/language_model/mlm_training.py
But Pants is designed to take care of this sort of thing for you, via:
./pants run projects/language_model/mlm_training.py
But you'll need a
pex_binary
target in
projects/language_model/BUILD
with
mlm_training.py
as an entry point
./pants tailor
should generate these for you, assuming that
mlm_training.py
has an
if __name__ == "__main__":
block, so tailor can identify it as an entry point
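To make the PYTHONPATH point concrete, here is a minimal sketch, assuming a throwaway directory that mimics the libs/dsutils layout from the tree in the first post. The file contents and the body of `read_json` are made up for illustration; only the directory and module names come from the thread. It shows that `dsutils` only becomes importable once the source root (`libs`) is on `sys.path`, which is what setting `PYTHONPATH` does at interpreter startup.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the layout: <tmp>/libs/dsutils/data.py
repo_root = Path(tempfile.mkdtemp())
package_dir = repo_root / "libs" / "dsutils"
package_dir.mkdir(parents=True)
(package_dir / "__init__.py").write_text("")
# Hypothetical stand-in for the real dsutils.data module.
(package_dir / "data.py").write_text(
    "import json\n"
    "def read_json(file_name):\n"
    "    with open(file_name) as f:\n"
    "        return json.load(f)\n"
)


def can_import(module_name: str) -> bool:
    """Return True if module_name is importable with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


# Without the source root on sys.path, the import fails as in the thread
# (assuming no package named dsutils is installed in this interpreter).
found_before = can_import("dsutils.data")

# Putting the source root (libs) on sys.path mirrors PYTHONPATH=libs.
sys.path.insert(0, str(repo_root / "libs"))
found_after = can_import("dsutils.data")

print(found_before, found_after)
```

Note that the entry added to the path is `libs`, not `libs/dsutils`: the import system looks for a `dsutils` package inside each path entry.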
c
mlm_training.py
has the
if __name__ == "__main__":
block. I also expected Pants to take care of this. Yes, we are running the file directly, but I also tried
./pants tailor
and I could see that it created a
pex_binary
for
mlm_training.py
then
./pants run projects/language_model/mlm_training.py
But this resulted in Pants not being able to see
config.py
which is a file in the language_model project.
ModuleNotFoundError: No module named 'config'
mlm_training.py
from config import DATA_DIR, LM_DIR, MODELS_DIR
config.py
from dsutils.path import get_file_path
I think you are right that it is the
PYTHONPATH
that is the issue. Should we add it manually in our setup script, or how would you say we should proceed?
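One possible reading of the `config` error, sketched under the assumption that Pants puts only the source roots from `pants.toml` on the import path: `projects/language_model/config.py` would then be importable as `language_model.config`, not as a top-level `config`. Running `python mlm_training.py` from inside the project folder finds `config` for a different reason, because Python adds the script's own directory to `sys.path`. The helper `module_name_for` below is hypothetical and not a Pants API; it only mimics stripping the source root prefix.

```python
from pathlib import PurePosixPath
from typing import Sequence


def module_name_for(file_path: str, source_roots: Sequence[str]) -> str:
    """Map a repo-relative .py path to a dotted module name.

    The first source root that contains the file is stripped from the
    front of the path; the remainder becomes the module name.
    """
    path = PurePosixPath(file_path)
    for root in source_roots:
        root_path = PurePosixPath(root.lstrip("/"))
        try:
            relative = path.relative_to(root_path)
        except ValueError:
            continue  # file is not under this root, try the next one
        return ".".join(relative.with_suffix("").parts)
    raise ValueError(f"{file_path} is not under any source root")


# Source roots as posted in pants.toml earlier in the thread.
SOURCE_ROOTS = ["/libs", "/projects"]

config_module = module_name_for("projects/language_model/config.py", SOURCE_ROOTS)
path_module = module_name_for("libs/dsutils/path.py", SOURCE_ROOTS)
print(config_module)
print(path_module)
```

If that reading is right, `from language_model.config import DATA_DIR, LM_DIR, MODELS_DIR` would be the import that resolves under `./pants run`. This was not confirmed in the thread.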
I changed our setup script to include
DS_PATH=$(pwd)
export PYTHONPATH="${DS_PATH}/libs":"${DS_PATH}/projects"
And this was a quick fix to our issue. However, I would like to know if there is a way to do it with Pants 🙂 To anyone else having trouble with this, remember to use
source script.sh
instead of
bash script.sh
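The `source` versus `bash` tip follows from how environment variables work: a child process receives a copy of the environment, and changes made in the child never flow back to the parent. Here is a minimal sketch of that behavior, using a child Python process as a stand-in for `bash script.sh`. The variable name `DEMO_PYTHONPATH` and the paths are made up for the demo.

```python
import os
import subprocess
import sys

VAR = "DEMO_PYTHONPATH"  # hypothetical variable name for the demo
os.environ.pop(VAR, None)

# The child sets the variable in its own environment, much like an
# `export` inside a script started with `bash script.sh`.
child_code = (
    "import os\n"
    f"os.environ[{VAR!r}] = '/repo/libs:/repo/projects'\n"
    f"print(os.environ[{VAR!r}])\n"
)
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)
seen_in_child = result.stdout.strip()

# Back in the parent, the variable is still unset.
seen_in_parent = os.environ.get(VAR)

print(seen_in_child, seen_in_parent)
```

`source script.sh` avoids this by running the script's commands in the current shell, so the `export` applies to the session that later runs `python`.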