Hi, I have a question about how to package the con...
# general
r
Hi, I have a question about how to package config/static files with a Python distribution. Right now I add them as a
resources
dependency explicitly in the
python_distribution
target, but they still don't get shipped with the source code. What is the right way to achieve this?
e
One way to think about this that makes more general sense of it all: a Python library should be self-contained in the root
python_sources
(formerly called python_library) target that defines the root of the library. The
python_distribution
target then just points to that
python_sources
target. This makes sense because, presumably, the library should be usable without ever being published: you should be able to use it locally in a REPL session and have it load its resources successfully. So a
python_distribution
target should just contain the extra metadata needed to publish a library and nothing more.
So, concretely, move your resources dependency to the
python_sources
target owning the files that actually load or use the resources.
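A minimal sketch of that layout, assuming a hypothetical library mylib whose YAML files sit next to its Python modules. All target names, paths, and the version string are illustrative and not taken from your repo:

```python
# BUILD (hypothetical layout: mylib/__init__.py and mylib/defaults.yaml side by side)

resources(
    name="data",
    sources=["mylib/**/*.yaml"],
)

python_sources(
    name="lib",
    sources=["mylib/**/*.py"],
    # The code that loads the files is what depends on them.
    dependencies=[":data"],
)

python_distribution(
    name="dist",
    # The distribution only points at the library; it adds publishing metadata.
    dependencies=[":lib"],
    provides=setup_py(name="mylib", version="0.0.1"),
)
```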
h
Good reminder that we really should log a warning about this! @refined-addition-53644 would you mind filing an issue at https://www.github.com/pantsbuild/pants/issues ?
👍 1
r
@enough-analyst-54434 I am still not able to package the static resources, even after adding all of them as explicit dependencies of the
python_sources
target. This is what the Python package looks like:
pyfleet-vehicle-spec
├── BUILD
├── config/
├── database/
├── pyfleet_vehicle_spec/
├── pyproject.toml
├── README.md
├── resources/
├── tests/
The static resources are inside config/, database/, and resources/, and the Python source code is inside
pyfleet_vehicle_spec
. I have checked the generated tar by unzipping it, and it includes everything which is inside
pyfleet_vehicle_spec
but nothing from any of the resource directories. This is what the BUILD file looks like:
resources(
    name="req_config",
    sources=["config/**/*.py", "config/**/*.yaml"]
)

resources(
    name="req_resources",
    sources=["resources/**/*.py", "resources/**/*.csv"]
)

resources(
    name="req_db",
    sources=["database/**/*.py", "database/**/*.db"]
)

resources(
    name="req_version",
    sources=["pyfleet_vehicle_spec/VERSION"]
)

python_sources(
    name="lib",
    sources=["pyfleet_vehicle_spec/**/*.py"],
    dependencies=[":req_config", ":req_db", ":req_resources", ":req_version"],
)

resources(
    name="test_resources",
    sources=["tests/resources/**/*.py", "tests/resources/**/*.csv"]
)

python_tests(
    name="tests",
    sources=["tests/**/test_*.py"],
    dependencies=[":test_resources"],
    runtime_package_dependencies=[":pyfleet-vehicle-spec-dist"])

# Needed together for building the distribution
python_distribution(
    name="pyfleet-vehicle-spec-dist",
    dependencies=[":lib"],
    wheel=True,
    sdist=True,
    provides=setup_py(
        name="pyfleet-vehicle-spec",
        description="Vehicle spec",
        include_package_data=True,
    ),
    generate_setup=True,
)
e
Thanks for the detailed data @refined-addition-53644. I can reproduce this, and it exposes a bug in our calculation of package_data for the generated
setup.py
. We ensure that resource files live in a package. The idea is correct, but the implementation is flawed: we only treat a directory as a package if it contains .py files owned by
python_sources
targets. This need not be the case as this example demonstrates:
$ tree
.
├── config
│   ├── config.yaml
│   └── __init__.py
└── pyfleet_vehicle_spec
    └── source.py

2 directories, 3 files
$ cat config/config.yaml 
hello: world
$ cat pyfleet_vehicle_spec/source.py 
import pkgutil

print(pkgutil.get_data("config", "config.yaml"))
$ python -c 'import pyfleet_vehicle_spec.source'
b'hello: world\n'
Here the complete contents of the config/ directory could be owned by a
resources
target as they likely are in your case.
I'll file an issue shortly.
@happy-kitchen-89482 & @hundreds-father-404 do you agree it's perfectly fine to have a
resources
target own any file type, even if we think of certain file types as generally being owned by other targets? In this case, a
resources
target owning a resource file + an
__init__.py
file? We can certainly force a user to type out two targets here and then add a manual dependency on both instead of just the resources target, but that seems a bit much to me. And really, to harp on a concept: do we agree this is perfectly fine in the context of the package goal? Target types almost never make sense. The use of metadata by a specific verb is what makes sense of that metadata. In the case of the package goal, the Python implementation should only care that packages come from any directory with at least one .py file, not which target owns that .py file.
So, @refined-addition-53644, pending the bug's status, you can work around this by splitting each of your resources targets in two: one
resources
target to own non-python resource files and one
python_sources
target to own the python files in the resource directory. Then just add both to your explicit dependency list for
lib
. Does that make sense?
FWIW, the bug - if we agree this is a bug - is here: https://github.com/pantsbuild/pants/blob/0f6b523eebda15315c0f92bf4e5e1e1421c084ac/src/python/pants/backend/python/goals/setup_py.py#L848-L861 That packages set is only formed from
python_source(s)
targets.
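To make the suspected bug concrete, here is a toy model of it in plain Python. This is not the Pants implementation: package_dirs and shipped_resources are hypothetical helpers, and the rule "a resource ships only if one of its parent directories is a known package" is my simplification of what the linked code does. The file lists mirror the small example above.

```python
from pathlib import PurePosixPath


def package_dirs(py_files):
    """Return the directories that contain at least one .py file."""
    return {str(PurePosixPath(f).parent) for f in py_files if f.endswith(".py")}


def shipped_resources(resource_files, packages):
    """Keep only the resources that live in, or under, a known package directory."""
    kept = []
    for f in resource_files:
        parents = {str(p) for p in PurePosixPath(f).parents}
        if parents & packages:
            kept.append(f)
    return sorted(kept)


# Files owned by each target type in the small example above.
python_sources_files = ["pyfleet_vehicle_spec/source.py"]
resources_files = ["config/__init__.py", "config/config.yaml"]

# Behaviour described in this thread: packages come only from python_sources targets.
buggy_packages = package_dirs(python_sources_files)
buggy = shipped_resources(resources_files, buggy_packages)

# Behaviour argued for: any directory with a .py file counts, whichever target owns it.
fixed_packages = package_dirs(python_sources_files + resources_files)
fixed = shipped_resources(resources_files, fixed_packages)

print(buggy)  # nothing under config/ is kept
print(fixed)  # both files under config/ are kept
```

In this model the workaround of moving the .py files into a python_sources target has the same effect as the second variant: config becomes a known package, so config.yaml is kept.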
r
Do you mean something like this, @enough-analyst-54434?
python_sources(
    name="lib_config",
    sources=["config/**/*.py"]
)
resources(
    name="req_config",
    sources=["config/**/*.yaml"]
)

python_sources(
    name="lib_resources",
    sources=["resources/**/*.py"]
)
resources(
    name="req_resources",
    sources=["resources/**/*.csv"]
)

python_sources(
    name="lib_db",
    sources=["database/**/*.py"]
)
resources(
    name="req_db",
    sources=["database/**/*.db"]
)

resources(
    name="req_version",
    sources=["pyfleet_vehicle_spec/VERSION"]
)

python_sources(
    name="lib",
    sources=["pyfleet_vehicle_spec/**/*.py"],
    dependencies=[
        ":lib_config",
        ":req_config",
        ":lib_db",
        ":req_db",
        ":lib_resources",
        ":req_resources",
        ":req_version"],
)

resources(
    name="test_resources",
    sources=["tests/resources/**/*.py", "tests/resources/**/*.csv"]
)

python_tests(
    name="tests",
    sources=["tests/**/test_*.py"],
    dependencies=[":test_resources"],
    runtime_package_dependencies=[":pyfleet-vehicle-spec-dist"])

# Needed together for building the distribution
python_distribution(
    name="pyfleet-vehicle-spec-dist",
    dependencies=[":lib"],
    wheel=True,
    sdist=True,
    provides=setup_py(
        name="pyfleet_vehicle_spec",
        description="Provides vehicle specification and CO2 emissions.",
        include_package_data=True,
    ),
)
e
Yes. Try that out. It could be simplified, but that's a personal choice and we can talk about that after you confirm the workaround works.
Oh, two things: 1.
setup_py
needs to have
version
- you should be getting an error without that. 2. Remove
include_package_data
r
1. It still doesn't work the way you suggested, with Python files and resources separated. 2. I had modified the
setup_py
to read the version from a local VERSION file using an internal plugin. I just disabled the plugin and ran with the default
setup_py
and it still doesn't work. I had also removed
include_package_data
. I see no non-Python files being included. I also tried adding README.md as a resource, which doesn't work either.
e
I cannot reproduce this. Here is my setup. What is substantially different from yours?
$ tree .
.
├── pants
├── pants_from_sources
├── pants.toml
└── pyfleet-vehicle-spec
    ├── BUILD
    ├── config
    │   ├── config.yaml
    │   └── random.py
    ├── pyfleet_vehicle_spec
    │   └── source.py
    └── resources
        ├── __init__.py
        └── resource.csv

4 directories, 9 files
$ cat pants.toml 
[GLOBAL]
pants_version = "2.8.0"

backend_packages = [
  "pants.backend.python",
]

[anonymous-telemetry]
enabled = false

[source]
root_patterns = ["/pyfleet-vehicle-spec"]
$ cat pyfleet-vehicle-spec/BUILD 
resources(
    name="req_config",
    sources=["config/**/*.yaml"]
)

python_sources(
    name="req_config_srcs",
    sources=["config/**/*.py"]
)

resources(
    name="req_resources",
    sources=["resources/**/*.csv"]
)

python_sources(
    name="req_resources_srcs",
    sources=["resources/**/*.py"]
)

python_sources(
    name="lib",
    sources=["pyfleet_vehicle_spec/**/*.py"],
    dependencies=[
        ":req_config",
        ":req_config_srcs",
        ":req_resources",
        ":req_resources_srcs",
    ],
)

python_distribution(
    name="pyfleet-vehicle-spec-dist",
    dependencies=[":lib"],
    wheel=True,
    sdist=True,
    provides=setup_py(
        name="pyfleet-vehicle-spec",
        version="0.0.1",
    ),
    generate_setup=True,
)
$ ./pants package ::
11:27:45.17 [INFO] Wrote dist/pyfleet-vehicle-spec-0.0.1.tar.gz
11:27:45.17 [INFO] Wrote dist/pyfleet_vehicle_spec-0.0.1-py3-none-any.whl
$ tar -tzf dist/pyfleet-vehicle-spec-0.0.1.tar.gz
pyfleet-vehicle-spec-0.0.1/
pyfleet-vehicle-spec-0.0.1/setup.cfg
pyfleet-vehicle-spec-0.0.1/PKG-INFO
pyfleet-vehicle-spec-0.0.1/setup.py
pyfleet-vehicle-spec-0.0.1/backend_shim.py
pyfleet-vehicle-spec-0.0.1/MANIFEST.in
pyfleet-vehicle-spec-0.0.1/resources/
pyfleet-vehicle-spec-0.0.1/resources/resource.csv
pyfleet-vehicle-spec-0.0.1/resources/__init__.py
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec.egg-info/
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec.egg-info/top_level.txt
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec.egg-info/namespace_packages.txt
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec.egg-info/dependency_links.txt
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec.egg-info/SOURCES.txt
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec.egg-info/PKG-INFO
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec/
pyfleet-vehicle-spec-0.0.1/pyfleet_vehicle_spec/source.py
pyfleet-vehicle-spec-0.0.1/config/
pyfleet-vehicle-spec-0.0.1/config/random.py
pyfleet-vehicle-spec-0.0.1/config/config.yaml
N.B. @refined-addition-53644 I definitely do reproduce the problem before splitting my two resources targets into resources / python_sources pairs.
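If you'd rather check the archive from Python than eyeball the tar listing, something like the sketch below could work. It only uses the standard library tarfile module. non_python_members and build_demo_sdist are names I made up for illustration, and the in-memory archive is a stand-in that mimics part of the listing above so the snippet runs without Pants.

```python
import io
import tarfile


def non_python_members(archive):
    """Return the non-.py regular files in an sdist, without the top-level directory."""
    names = []
    for member in archive.getmembers():
        if member.isfile() and not member.name.endswith(".py"):
            # Drop the leading "<name>-<version>/" path component.
            names.append(member.name.split("/", 1)[-1])
    return sorted(names)


def build_demo_sdist():
    """Build a tiny in-memory tar.gz that mimics part of the listing above."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in (
            "pyfleet-vehicle-spec-0.0.1/config/config.yaml",
            "pyfleet-vehicle-spec-0.0.1/config/random.py",
            "pyfleet-vehicle-spec-0.0.1/resources/resource.csv",
        ):
            data = b"demo\n"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_sdist(), mode="r:gz") as tar:
    found = non_python_members(tar)

print(found)
```

For a real check, open the freshly built file instead, e.g. tarfile.open("dist/pyfleet-vehicle-spec-0.0.1.tar.gz"), and confirm that config/config.yaml and resources/resource.csv show up in the result.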
r
That was very stupid of me. I was unzipping an old dist 🤦 With python_sources and resources separated, it works. Thank you for your time and patience! 🙏
e
You're welcome. I did not take the time to file an issue, because I'm not sure if my bug assessment will be agreed upon, but this is the fix to be debated: https://github.com/pantsbuild/pants/pull/13878
🙌 1