bumpy-noon-80834
04/02/2024, 3:23 PM
graphql-models:
find ../../graphql -type f -exec cat {} \; \
| datamodel-codegen ...
I am trying to use Pants "the right way", so I tried with `shell_command`:
python_requirement(
    name = "datamodel-codegenerator",
    requirements = ["datamodel-code-generator[graphql]>=0.25.5"],
)
shell_command(
    name = "generate_graphql_models",
    command = "bash ./scripts/generate-models.sh",
    execution_dependencies = [
        "graphql:schema",
        "./scripts:generate_graphql_models_source",
        ":datamodel-codegenerator",
    ],
    output_files = ["models.py"],
    tools = [
        "bash",
        "cat",
        "find",
        "python3",
    ],
)
However, when running `pants export-codegen python/data:generate_graphql_models`, the generated file is stored in `dist/codegen/...`. Is there a way to "move" it to "python/data/src/...."?
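For the "move" question, one low-tech option is a small post-step that copies the exported file back into the source tree after `pants export-codegen` has run. A minimal sketch, assuming a hypothetical helper `copy_generated` and caller-supplied paths (the real `dist/codegen/...` and `python/data/src/....` locations are elided above, so they stay as parameters here):

```python
import shutil
from pathlib import Path


def copy_generated(src: Path, dst: Path) -> Path:
    """Copy a generated file into the source tree, creating parent dirs.

    `src` and `dst` are placeholders for the exported file and its
    intended location in the repository; both are assumptions.
    """
    if not src.is_file():
        raise FileNotFoundError(f"Generated file not found: {src}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    return dst
```

This only copies bytes; it does not tell Pants about the new file, so it is a workaround rather than a replacement for a proper codegen target.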
I also tried using `adhoc_tool`, which looks better suited to what I am trying to achieve:
python_requirement(
    name = "datamodel-codegenerator",
    requirements = ["datamodel-code-generator[graphql]>=0.25.5"],
)
system_binary(
    name = "bash",
    binary_name = "bash",
)
system_binary(
    name = "python3",
    binary_name = "python3",
)
system_binary(
    name = "cat",
    binary_name = "cat",
)
system_binary(
    name = "find",
    binary_name = "find",
)
adhoc_tool(
    name = "generate_graphql_models",
    args = ["./scripts/generate-models.sh"],
    execution_dependencies = [
        "graphql:schema",
        "./scripts:generate_graphql_models_source",
        ":datamodel-codegenerator",
    ],
    output_files = ["src/data/models.py"],
    runnable = ":bash",
    runnable_dependencies = [
        ":python3",
        ":cat",
        ":find",
    ],
)
But for some reason I can't get the Python dependency I need:
$ pants export-codegen python/data:generate_graphql_models
23:18:32.26 [INFO] Completed: Running the `adhoc_tool` at python/data:generate_graphql_models
23:18:32.26 [ERROR] 1 Exception encountered:
Engine traceback:
in `export-codegen` goal
ProcessExecutionFailure: Process 'the `adhoc_tool` at python/data:generate_graphql_models' failed with exit code 1.
stdout:
stderr:
/usr/bin/python3: No module named datamodel_code_generator
/usr/bin/find: 'cat' terminated by signal 13
/usr/bin/find: 'cat' terminated by signal 13
/usr/bin/find: 'cat' terminated by signal 13
/usr/bin/find: 'cat' terminated by signal 13
/usr/bin/find: 'cat' terminated by signal 13
/usr/bin/find: 'cat' terminated by signal 13
Use `--keep-sandboxes=on_failure` to preserve the process chroot for inspection.
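The repeated `'cat' terminated by signal 13` lines look like a side effect rather than the root cause: signal 13 is SIGPIPE on Linux, raised when a process writes to a pipe whose reader has already exited (here presumably `python3`, which quit after the `No module named` error). A minimal sketch reproducing that behaviour, assuming a POSIX system with the `yes` utility on the PATH; the helper name `writer_exit_code` is hypothetical:

```python
import signal
import subprocess
import sys


def writer_exit_code() -> int:
    """Pipe an endless writer into a reader that exits immediately."""
    writer = subprocess.Popen(["yes"], stdout=subprocess.PIPE)
    reader = subprocess.Popen([sys.executable, "-c", "pass"], stdin=writer.stdout)
    # Close our copy of the read end so the reader holds the only one.
    writer.stdout.close()
    reader.wait()
    # Once the reader is gone, the writer's next write triggers SIGPIPE.
    return writer.wait()


if __name__ == "__main__":
    code = writer_exit_code()
    # subprocess reports "killed by signal N" as the return code -N.
    print(code, signal.Signals(-code).name)
```

Fixing the import error should make these messages disappear as well, since the reader would then consume the piped schema.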
What am I missing? Any help appreciated! 🙂
gorgeous-winter-99296
04/02/2024, 3:33 PM
`--keep-sandboxes=on_failure` and then inspect the sandbox and debug it in-place. Without seeing `generate-models.sh` it does seem like an issue with that script and how it invokes the `datamodel-codegenerator`. I'd imagine you need to build it as a `pex_binary` and then run that.
bumpy-noon-80834
04/03/2024, 12:42 AM
#!/bin/bash -eux
PYTHON_VERSION=$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
find ../../graphql -type f -exec cat {} \; |
    python3 -m datamodel_code_generator \
        --input-file-type graphql \
        --output models.py \
        --output-model-type pydantic_v2.BaseModel \
        --use-union-operator \
        --target-python-version "$PYTHON_VERSION" \
        --use-schema-description \
        --custom-file-header '# pyright: reportIncompatibleVariableOverride=false'
The issue occurs because, despite adding ":datamodel-codegenerator" to the execution_dependencies, the datamodel_code_generator package is not made available to the Python interpreter. I'm just surprised I don't have that issue when using shell_command, as I run the exact same script successfully with it.
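A quick way to confirm this from inside the preserved sandbox is to ask the interpreter the script resolves (`/usr/bin/python3` in the error above) whether it can see the package. A minimal sketch using only the standard library; the helper name `module_available` is hypothetical:

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Shows which interpreter is running and whether it sees the package.
    print("interpreter:", sys.executable)
    print("datamodel_code_generator importable:",
          module_available("datamodel_code_generator"))
```

If the package is reported as not importable, the script is most likely running under the system interpreter without the requirement installed for it.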
BTW, what is the right way of generating my models.py file?
bumpy-noon-80834
04/03/2024, 2:42 AM
python_requirement(
    name="datamodel-code-generator",
    requirements=["datamodel-code-generator[graphql]>=0.25.5"],
)
python_source(
    name="generate_graphql_models_py",
    dependencies=[":datamodel-code-generator"],
    source="generate_graphql_models.py",
)
pex_binary(
    name="generate_graphql_models",
    dependencies=[
        ":generate_graphql_models_py",
        "graphql:schema",
    ],
    entry_point="generate_graphql_models.py",
)
generate_graphql_models.py:
from pathlib import Path

from datamodel_code_generator import (
    DataModelType,
    InputFileType,
    PythonVersion,
    generate,
)


def read_schema(schema_dir: Path) -> str:
    if not schema_dir.is_dir():
        raise Exception(f"Invalid schema_dir: {schema_dir}")
    return "".join(file.read_text() for file in schema_dir.glob("**/*.graphql"))


def generate_pydantic_models(schema_str: str, models_path: Path) -> None:
    generate(
        input_=schema_str,
        input_file_type=InputFileType.GraphQL,
        output=models_path,
        output_model_type=DataModelType.PydanticV2BaseModel,
        use_union_operator=True,
        use_schema_description=True,
        target_python_version=PythonVersion.PY_312,
        custom_file_header="# pyright: reportIncompatibleVariableOverride=false",
    )


if __name__ == "__main__":
    schema = read_schema(Path())
    generate_pydantic_models(schema, Path("python/data/src/timequest_data/models.py"))
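One thing to watch in `read_schema`: `Path.glob` does not guarantee any ordering, and joining with an empty string fuses two files together when one lacks a trailing newline. A possible variant, assuming stable, newline-separated output is wanted (the name `read_schema_sorted` is hypothetical and this is not the author's code):

```python
from pathlib import Path


def read_schema_sorted(schema_dir: Path) -> str:
    """Concatenate all .graphql files under schema_dir in sorted path order."""
    if not schema_dir.is_dir():
        raise NotADirectoryError(f"Invalid schema_dir: {schema_dir}")
    # Sorting makes the concatenation order independent of the filesystem.
    files = sorted(schema_dir.glob("**/*.graphql"))
    # A newline separator keeps adjacent files from running into each other.
    return "\n".join(f.read_text() for f in files)
```

A stable input order should also keep the generated models.py from changing between runs when the schema itself has not changed.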
Now I can run `pants run python/data:generate_graphql_models` and it generates the `models.py` file in the right place.
It still doesn't feel like the right way to do this, but I guess the right way would be to write a codegen plugin for `datamodel_code_generator`. So in the short term I'll probably settle for this solution.
bumpy-noon-80834
04/03/2024, 10:02 AM
from pathlib import Path

from datamodel_code_generator import (
    DataModelType,
    InputFileType,
    PythonVersion,
    generate,
)


def read_schema(schema_dir: Path) -> str:
    if not schema_dir.is_dir():
        raise Exception(f"Invalid schema_dir: {schema_dir}")
    return "".join(file.read_text() for file in schema_dir.glob("**/*.graphql"))


def generate_pydantic_models(schema_str: str, models_path: Path) -> None:
    generate(
        input_=schema_str,
        input_file_type=InputFileType.GraphQL,
        output=models_path,
        output_model_type=DataModelType.PydanticV2BaseModel,
        use_union_operator=True,
        use_schema_description=True,
        target_python_version=PythonVersion.PY_312,
        custom_file_header="# pyright: reportIncompatibleVariableOverride=false",
    )


if __name__ == "__main__":
    schema = read_schema(Path("../../../graphql"))
    generate_pydantic_models(schema, Path("timequest_models.py"))
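The two versions of the script differ only in where the schema is read from and where the output is written. A minimal sketch of passing both on the command line instead of hard-coding them, assuming hypothetical `--schema-dir` and `--output` flags (not part of the original script) whose defaults mirror the values above:

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the (hypothetical) schema and output locations."""
    parser = argparse.ArgumentParser(
        description="Generate Pydantic models from a GraphQL schema."
    )
    parser.add_argument(
        "--schema-dir",
        type=Path,
        default=Path("../../../graphql"),
        help="Directory searched recursively for .graphql files.",
    )
    parser.add_argument(
        "--output",
        type=Path,
        default=Path("timequest_models.py"),
        help="Path of the generated models file.",
    )
    return parser.parse_args(argv)
```

The `__main__` block would then call `read_schema(args.schema_dir)` and `generate_pydantic_models(schema, args.output)`; with `adhoc_tool`, the flags could presumably be supplied through its `args` field.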
BUILD:
python_requirement(
    name="datamodel-code-generator",
    requirements=["datamodel-code-generator[graphql]>=0.25.5"],
)
python_source(
    name="codegen_py",
    dependencies=[
        ":datamodel-code-generator",
        "graphql:schema",
    ],
    source="codegen.py",
)
adhoc_tool(
    name="codegen",
    output_files=["timequest_models.py"],
    runnable=":codegen_py",
)
experimental_wrap_as_python_sources(
    name="models",
    inputs=[":codegen"],
)
Then I can simply depend on the "models" target.