# general
w
Hi, all. I seem to be missing something pretty simple, but I've not been able to find solutions from searching previous chats here or in online help. Please pardon me if I have failed in my googling. Anyway... How can I get a python_distribution target to include a README.md and LICENSE file at the top level of the resulting tar.gz file? I've tried file() and resource() targets and included them as dependencies for the python_distribution, but nothing is working for me. Any suggestions?
g
How are you configuring it? Manual setup.py, generating a setup.py with pants, or pyproject.toml?
w
Hello, Tom, and thanks for your response. Pants is producing the setup.py using BUILD file directives.
g
Ack! I can have a poke in the code later today, I've personally never used it. I know it works with the pyproject.toml approach but that's mostly driven by PEP517/518/639
w
Thanks! I pondered doing a manual setup.py approach, but with everything else working so beautifully with the auto-generated setup.py, I have been hesitant to change that.
g
https://github.com/tgolsson/pants-repros/tree/main/dist-license -- as far as I can tell; it "just works".
$ pants package ::
22:30:33.04 [INFO] Wrote dist/package-0.1.0-py3-none-any.whl
22:30:33.04 [INFO] Wrote dist/package-0.1.0.tar.gz

$ tar tvf dist/package-0.1.0.tar.gz
drwxr-xr-x ts/ts             0 2025-06-02 22:26 package-0.1.0/
-rw-r--r-- ts/ts             7 2025-06-02 22:26 package-0.1.0/LICENSE.txt
-rw-r--r-- ts/ts            12 2025-06-02 22:26 package-0.1.0/MANIFEST.in
-rw-r--r-- ts/ts           144 2025-06-02 22:26 package-0.1.0/PKG-INFO
-rw-r--r-- ts/ts             6 2025-06-02 22:26 package-0.1.0/README.md
-rw-r--r-- ts/ts           762 2025-06-02 22:26 package-0.1.0/backend_shim.py
drwxr-xr-x ts/ts             0 2025-06-02 22:26 package-0.1.0/mylib/
-rw-r--r-- ts/ts             0 2025-06-02 22:26 package-0.1.0/mylib/__init__.py
-rw-r--r-- ts/ts             0 2025-06-02 22:26 package-0.1.0/mylib/hello.py
drwxr-xr-x ts/ts             0 2025-06-02 22:26 package-0.1.0/package.egg-info/
-rw-r--r-- ts/ts           144 2025-06-02 22:26 package-0.1.0/package.egg-info/PKG-INFO
-rw-r--r-- ts/ts           255 2025-06-02 22:26 package-0.1.0/package.egg-info/SOURCES.txt
-rw-r--r-- ts/ts             1 2025-06-02 22:26 package-0.1.0/package.egg-info/dependency_links.txt
-rw-r--r-- ts/ts             1 2025-06-02 22:26 package-0.1.0/package.egg-info/namespace_packages.txt
-rw-r--r-- ts/ts             6 2025-06-02 22:26 package-0.1.0/package.egg-info/top_level.txt
-rw-r--r-- ts/ts            38 2025-06-02 22:26 package-0.1.0/setup.cfg
-rw-r--r-- ts/ts           423 2025-06-02 22:26 package-0.1.0/setup.py

$ zipinfo dist/package-0.1.0-py3-none-any.whl
Archive:  dist/package-0.1.0-py3-none-any.whl
Zip file size: 1716 bytes, number of entries: 8
-rw-r--r--  2.0 unx        0 b- defN 25-Jun-02 20:26 mylib/__init__.py
-rw-r--r--  2.0 unx        0 b- defN 25-Jun-02 20:26 mylib/hello.py
-rw-r--r--  2.0 unx        7 b- defN 25-Jun-02 20:26 package-0.1.0.dist-info/LICENSE.txt
-rw-r--r--  2.0 unx      144 b- defN 25-Jun-02 20:26 package-0.1.0.dist-info/METADATA
-rw-r--r--  2.0 unx       91 b- defN 25-Jun-02 20:26 package-0.1.0.dist-info/WHEEL
-rw-r--r--  2.0 unx        1 b- defN 25-Jun-02 20:26 package-0.1.0.dist-info/namespace_packages.txt
-rw-r--r--  2.0 unx        6 b- defN 25-Jun-02 20:26 package-0.1.0.dist-info/top_level.txt
-rw-rw-r--  2.0 unx      624 b- defN 25-Jun-02 20:26 package-0.1.0.dist-info/RECORD
8 files, 873 bytes uncompressed, 606 bytes compressed:  30.6%
(I'm unsure if I should expect a README.md in the wheel, but I don't think so... Setuptools only mentions sdists containing it.)
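For anyone who wants to script the check instead of reading the tar listing by eye, here is a minimal sketch. The function names are made up for illustration, and the default file names simply follow the listing above.

```python
import tarfile


def top_level_files(sdist: tarfile.TarFile) -> set[str]:
    """Return names of regular files sitting directly under the sdist root dir."""
    names = set()
    for member in sdist.getmembers():
        parts = member.name.split("/")
        # An sdist has one root dir, e.g. "package-0.1.0/README.md" -> 2 parts.
        if member.isfile() and len(parts) == 2:
            names.add(parts[1])
    return names


def missing_top_level(sdist_path: str, required=("README.md", "LICENSE.txt")) -> list[str]:
    """List the required files that are NOT at the top level of the sdist."""
    with tarfile.open(sdist_path, "r:gz") as sdist:
        present = top_level_files(sdist)
    return [name for name in required if name not in present]
```

Calling missing_top_level("dist/package-0.1.0.tar.gz") should return an empty list when both files are where the listing above shows them.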
w
Excellent. Can you share your BUILD file contents?
Oh, just saw the link. Checking it now.
Well, looks like my BUILD is effectively identical. However, whenever I try to include
long_description_path="README.md"
pants blows up with an unmatched glob error. This is weird because when I run
pants peek README.md
I get nice output suggesting all is well, and the same for peek on the python_distribution target. The relevant error message is:
Unmatched glob from the long_description_path field of python/server/v2_1/launchers:server: "README.md"
It is strange because that file is there and "peek" sees it.
I'm on pants 2.26.0, fwiw
I cloned your repo, and it works as expected. Whatever might be different is really not obvious, but you have given me the framework to find an answer. Thanks for your help. I'll report back what I learn.
I tried copying the dist-license sample into my pants repo, and when I do, it stops working. My pants.toml backend_packages include all three listed in your example pants.toml. I'm using the same pants version. The only real difference is that my root_patterns are different. I tried --keep-sandboxes=always, but the only /tmp/pants-sandbox-* folders generated are empty. I guess pants package doesn't use sandboxes for python_distribution targets? I've spent many hours since Friday trying to figure out this curious puzzle. I think it's time for me to admit defeat and simply repackage the tarball with the necessary files by hand. My guess is that I have a repo structure that pants doesn't expect. I forked your repo and made a few modifications to mirror how our repo is structured. If you run pants package :: in the something/python/dist-license folder, you'll see that the resulting tar.gz file does not include the LICENSE.txt or README.md files. You can find my fork here: https://github.com/stormfish-sci/pants-repros/tree/main
g
Awesome repro, ok. Re empty pants-sandbox folders: use --no-local-cache as well, and blow away all sandboxes between runs. It also helps to pkill pantsd. All in all:
pkill pantsd; rm -rf /tmp/pants-sandbox-*; pants --keep-sandboxes=always --no-local-cache ...
I found your issue as well: your root patterns are one level too far up, so during the sandbox generation we put the LICENSE.txt in a subdir.
/tmp/pants-sandbox-C0mr0p/chroot/dist-license
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/LICENSE.txt
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/README.md
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/src
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/src/py
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/src/py/mylib
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/src/py/mylib/__init__.py
/tmp/pants-sandbox-C0mr0p/chroot/dist-license/src/py/mylib/hello.py
Replacing the python root pattern with the actual root dir of the package:
/tmp/pants-sandbox-GET9Wn/chroot/LICENSE.txt
/tmp/pants-sandbox-GET9Wn/chroot/MANIFEST.in
/tmp/pants-sandbox-GET9Wn/chroot/README.md
<snip>
/tmp/pants-sandbox-GET9Wn/chroot/src
/tmp/pants-sandbox-GET9Wn/chroot/src/py
/tmp/pants-sandbox-GET9Wn/chroot/src/py/mylib
/tmp/pants-sandbox-GET9Wn/chroot/src/py/mylib/__init__.py
/tmp/pants-sandbox-GET9Wn/chroot/src/py/mylib/hello.py
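The two sandbox listings above differ only in how much of each path gets stripped as the source root. A rough sketch of that idea follows. It is a simplified model and not Pants' actual implementation: the function name is hypothetical, and it treats roots as literal directories, whereas Pants matches root patterns.

```python
from pathlib import PurePosixPath


def strip_source_root(path: str, source_roots: list[str]) -> str:
    """Return `path` relative to the longest source root that contains it.

    Simplified model: roots are literal directories, not patterns.
    """
    p = PurePosixPath(path)
    best = None
    for root in source_roots:
        r = PurePosixPath(root)
        # A root applies if it is an ancestor of the file (or the path itself).
        if r == p or r in p.parents:
            if best is None or len(r.parts) > len(best.parts):
                best = r
    return str(p.relative_to(best)) if best is not None else path


# Root one level too far up: the file keeps its "dist-license/" prefix.
print(strip_source_root("something/python/dist-license/LICENSE.txt", ["something/python"]))
# Root at the package dir: the file lands at the top of the chroot.
print(strip_source_root("something/python/dist-license/LICENSE.txt", ["something/python/dist-license"]))
```

The directory names come from the fork described earlier. With the root one level up, setuptools sees LICENSE.txt under dist-license/ and no longer at the top of the chroot.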
w
Tom, thanks for digging into this and for all the great recommendations! Our repo is designed using the "Multiple top-level projects" (MTLP) structure from the pantsbuild docs: https://www.pantsbuild.org/2.24/docs/using-pants/key-concepts/source-roots#multiple-top-level-projects. This has worked very well for us. Does your statement about "root patterns are one level too far up" suggest that the problem is related to our use of the MTLP repo structure?
g
No, we should be able to make it work there as well. Can you explain, just in text or as a dir tree, how 2-3 projects would look with your disk structure?
Specific reason I ask is I'm unsure how you'd arrange multiple licenses and multiple readmes
If you only build one python distribution then this doesn't really matter, but for my own clarity :)
w
Sure. I will try to get you some insights after I get through a few meetings.
Here is a conceptual layout that mirrors how our monorepo is structured:
g
Is "python code here" a dir or files?
w
"python code here" are 1..n python files and, depending on the complexity of the software, could be files and/or sub-directories of files. So perhaps keyproduct_1/python/keyproduct_1/server/v2 is where v2 of our commercial version lives with a proprietary license. v2_community_edition would have an open source license.
g
It's a bit icky but it looks correct when building the wheel when reverting the root and putting this into the BUILD file:
# something/python/dist-license/BUILD
python_sources(name="src")
resources(name="package_data", sources=["README.md", "LICENSE.txt"])

python_distribution(
    name="dist-license",
    dependencies=[
        ":package_data",
        ":src",
    ],
    provides=python_artifact(
        name="package",
        version="0.1.1",
        long_description_content_type="markdown",
        license_files=["dist-license/LICENSE.txt"],
    ),
    generate_setup=True,
    long_description_path="dist-license/README.md",
)
In the sdist this ends up keeping the README.md and LICENSE.txt in the dist-license dir, not in the root (hence - icky), but it does generate correct METADATA and LICENSE.txt entries in dist-info. It is a bit odd to specify the dir name in long_description_path and license_files, but it's necessary AFAICT to make both pants and setuptools happy. I think the other alternative would be to move the python_distribution, as well as the license and readme, up one level, but then you'd need one shared README/license for each top-level project.
w
Thanks for your experimentation with this. It seems as though there are some aspects of pants inner workings which impose constraints that are not obvious. If nothing else I feel a bit better knowing that it wasn't just me misreading the docs (unless it was - haha). I didn't put it in the dir structure I sent, but our pants.toml file lives in the root folder. Are your updates available in your repo so I can take a holistic look at the changes you made to get it working?
g
Updated here: https://github.com/tgolsson/pants-repros/tree/main/dist-license-mtlp
Good to call out the pants.toml; adapting that a bit showed an issue that had been hidden because the dist-license directory still existed in the root as well. The long_description_path should be relative to the build root, so I used the build_file_dir helper to make it a bit more resilient. The reason the pants.toml is in a subdir is that I keep all my repros in one repository, one "project" per directory.
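To make the "relative to build root" point concrete, here is a small sketch of why a bare README.md fails to match while a path prefixed with the BUILD file's directory does. The function names are hypothetical and this only models the lookup; it is not how Pants resolves globs internally. In a BUILD file the build_file_dir helper would supply the directory, whereas here it is passed in as a plain string.

```python
from pathlib import Path, PurePosixPath


def long_description_candidate(build_file_dir: str, filename: str = "README.md") -> str:
    """Join the BUILD file's directory and the file name into a build-root-relative path."""
    return str(PurePosixPath(build_file_dir) / filename)


def matches_from_build_root(build_root: Path, relative_path: str) -> bool:
    """True if `relative_path`, resolved against the build root, is an existing file."""
    return (build_root / relative_path).is_file()
```

For the target in the earlier error message, long_description_candidate("python/server/v2_1/launchers") gives python/server/v2_1/launchers/README.md, which exists when resolved from the build root even though a bare README.md does not.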
w
Thank you. I'm intrigued by your multiple pants.toml files. Does this prevent you from referencing across projects? I.e., if you updated code in a project used by others with different pants.toml files, pants cannot test all projects dependent on it, right? I don't intend to suggest one is "right" vs "wrong", just hoping to grow my understanding of pros/cons of different approaches 🙂
g
I think you're overthinking this part. 🙂 They're fully isolated; it's not a Pants thing beyond "pants searches for a parent pants.toml and then stops". So while yes, it does prevent me from referencing across projects etc... the core reason is just that it's easier to do each repro in vitro as opposed to in a larger system.
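The "searches for a parent pants.toml and then stops" behaviour can be pictured with a short sketch. This is an illustration of the idea only, with a hypothetical function name, and is not taken from the Pants source.

```python
from pathlib import Path
from typing import Optional


def find_build_root(start: Path, marker: str = "pants.toml") -> Optional[Path]:
    """Walk upward from `start` and return the first directory containing `marker`."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / marker).is_file():
            # Stop at the nearest match; anything above it is never considered.
            return candidate
    return None
```

Because the walk stops at the nearest pants.toml, each repro directory with its own pants.toml behaves as an isolated project, even inside a larger repository.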
w
That makes sense. Thanks!
Hi Tom. I tried your example dist-license-mtlp (commit 466cbe6), and I'm not getting the tar file contents as expected. Here's what I see:
$ tar tzf ~/git/3rd-party/pants-repros/dist-license-mtlp/dist/dist_license-0.1.1.tar.gz 
dist_license-0.1.1/
dist_license-0.1.1/MANIFEST.in
dist_license-0.1.1/PKG-INFO
dist_license-0.1.1/backend_shim.py
dist_license-0.1.1/dist-license/
dist_license-0.1.1/dist-license/LICENSE.txt
dist_license-0.1.1/dist-license/README.md
dist_license-0.1.1/dist-license/__init__.py
dist_license-0.1.1/dist-license/hello.py
dist_license-0.1.1/dist_license.egg-info/
dist_license-0.1.1/dist_license.egg-info/PKG-INFO
dist_license-0.1.1/dist_license.egg-info/SOURCES.txt
dist_license-0.1.1/dist_license.egg-info/dependency_links.txt
dist_license-0.1.1/dist_license.egg-info/namespace_packages.txt
dist_license-0.1.1/dist_license.egg-info/top_level.txt
dist_license-0.1.1/setup.cfg
dist_license-0.1.1/setup.py
Do you get this result or does yours include the LICENSE.txt and README.md at top level of tar ball?
g
That is what I mentioned being icky. As far as I can tell it's fine to have an sdist with this layout, just unusual.
w
Ah.. Meaning that it is fine to have the license and readme deeper in the structure?
g
Yes, I believe so. The metadata has the correct contents and points at the valid license. When you install the sdist the license ends up in the correct place.
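One way to check the metadata claim yourself is to read PKG-INFO straight out of the sdist. This is a sketch that assumes the licenses are recorded as License-File headers; the function names are invented for the example.

```python
import tarfile
from email.parser import Parser


def read_pkg_info(sdist_path: str):
    """Parse the top-level PKG-INFO of an sdist into an email-style message."""
    with tarfile.open(sdist_path, "r:gz") as sdist:
        for member in sdist.getmembers():
            parts = member.name.split("/")
            if member.isfile() and len(parts) == 2 and parts[1] == "PKG-INFO":
                raw = sdist.extractfile(member).read().decode("utf-8")
                return Parser().parsestr(raw)
    raise FileNotFoundError("no top-level PKG-INFO in " + sdist_path)


def license_files(sdist_path: str) -> list[str]:
    """Return every License-File entry (assumed header name) from PKG-INFO."""
    return read_pkg_info(sdist_path).get_all("License-File") or []
```

If the header is present, each returned entry should be a path that also appears in the tar listing, such as dist-license/LICENSE.txt in the layout above.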
w
That works for me then. Your assistance resolved the glob errors I got from long description usage. I have applied your suggestions and I'm testing everything now.
Nice! Looks like this worked! All the package metadata appears as expected. Thanks for all your assistance, Tom!