# general
l
Hey all - I think this may be obvious but I haven't been able to figure it out: If I have some kind of package or archive, and a python_source that does something with the archive (like uploads it to s3 or something like that), is there a way I can set things up so that the python_source has a dependency on generating the archive file?
s
f
I would cautiously say this is currently unsupported. The most obvious workaround is a two-step setup:
# cheeseshop/BUILD

python_sources()

file(
    name="project-version",
    source="VERSION",
)

archive(
    name="my-archive",
    files=[":project-version"],
    format="zip",
    output_path="uploads/s3.zip"
)
now
$ rm -rf dist/; pants package cheeseshop:my-archive
10:34:04.36 [INFO] Wrote dist/uploads/s3.zip

$ tree dist                          
dist
└── uploads
    └── s3.zip

2 directories, 1 file
now
# cheeseshop/upload_archive.py
from zipfile import ZipFile
print(ZipFile("dist/uploads/s3.zip").namelist())
assert ZipFile("dist/uploads/s3.zip").namelist() == ["cheeseshop/VERSION"]
do
$ pants run cheeseshop/upload_archive.py
['cheeseshop/VERSION']
so with this your workflow would be a shell script:
$ pants package cheeseshop:my-archive
$ pants run cheeseshop/upload_archive.py
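The same two-step workflow could also be driven from Python rather than a shell script. This is only a minimal sketch: `workflow_commands` and `run_workflow` are made-up names, the target address and script path are the ones from the example above, and it assumes `pants` is on the PATH and the driver is started from the buildroot.

```python
import subprocess

# Target address and script path taken from the example above.
ARCHIVE_TARGET = "cheeseshop:my-archive"
UPLOAD_SCRIPT = "cheeseshop/upload_archive.py"


def workflow_commands(archive_target=ARCHIVE_TARGET, script=UPLOAD_SCRIPT):
    """Return the two pants invocations, in the order they must run."""
    return [
        ["pants", "package", archive_target],
        ["pants", "run", script],
    ]


def run_workflow(runner=subprocess.run):
    """Run each command in turn; check=True aborts on the first failure."""
    for cmd in workflow_commands():
        runner(cmd, check=True)
```

Calling `run_workflow()` packages the archive first and only runs the upload script if packaging succeeded. The `runner` parameter exists so the sequence can be exercised without actually invoking Pants.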
what you want to do IIUC is to be able to just run `pants run cheeseshop/upload_archive.py` and whatever runtime dependencies it has (explicitly declared manually) should be satisfied: in this case, it's an `archive` target `my-archive`. I don't think you can do this with `python_sources`. This is something that is supported for `python_tests` however, see https://www.pantsbuild.org/2.19/reference/targets/python_tests#runtime_package_dependencies.
Perhaps it's not that dumb to have this field for `python_sources`, too; e.g. you may indeed require having certain artifacts produced before running a script (uploading archives is a good use case, same for pushing system packages to a repository etc). Before we have support for this (if at all), a somewhat cleaner way to document that your sources depend on some packages (archives in your case) is to add those dependencies manually with `overrides`:
python_sources(
    overrides={"upload_archive.py": {"dependencies": [":my-archive-1", ":my-archive-2"]}},
)

archive(
    name="my-archive-1",
    files=[":project-version"],
    format="zip",
    output_path="uploads/s3-1.zip"
)

archive(
    name="my-archive-2",
    files=[":project-version"],
    format="zip",
    output_path="uploads/s3-2.zip"
)
so now first package only those archives that are needed:
$ pants dependencies cheeseshop/upload_archive.py | xargs pants --filter-target-type=archive package
10:42:10.57 [INFO] Wrote dist/uploads/s3-1.zip
10:42:10.57 [INFO] Wrote dist/uploads/s3-2.zip
and then
$ pants run cheeseshop/upload_archive.py
having all those prerequisites packaged
l
This was an excellent run-down, thank you! You have indeed captured what I'm trying to do, and some of the things I've tried that don't quite accomplish this.
Another facet of it is that I think this line in your example:
print(ZipFile("dist/uploads/s3.zip").namelist())
depends on running the command from the pants buildroot. One reason I would like to do this using dependencies is to take advantage of the sandbox environment, so that I don't need a way for the script to find the archive at runtime regardless of the context in which it's run.
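Short of real sandbox support, one way to drop the "run from the buildroot" assumption is to locate the buildroot at runtime. A minimal sketch, assuming the buildroot is the nearest ancestor directory that contains a `pants.toml`; `find_buildroot` and `archive_path` are hypothetical helpers, not Pants APIs, and this still relies on the archive having been packaged into `dist/` beforehand.

```python
from pathlib import Path


def find_buildroot(start: Path) -> Path:
    """Walk upward from `start` to the first directory containing pants.toml."""
    for candidate in [start, *start.parents]:
        if (candidate / "pants.toml").is_file():
            return candidate
    raise FileNotFoundError(f"no pants.toml found above {start}")


def archive_path(start: Path, relative: str = "dist/uploads/s3.zip") -> Path:
    """Resolve the packaged archive relative to the discovered buildroot."""
    return find_buildroot(start) / relative
```

In the script this would be used as `ZipFile(archive_path(Path.cwd()))`, so it works from any subdirectory of the repository. It does not provide the isolation a sandbox would.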
I was exploring doing this with an adhoc_tool, but it's the first time I've tried to write one of those, so I got a bit lost. My idea was to have an archive or pex_binary target as an execution_dependency, and then expose the archive file as an output file, and then depend on that in a python_source. Does that flow even make sense from the perspective of what an adhoc_tool can do? If not, I'll stop fiddling with it...
Is there a reason `runtime_package_dependencies` only exists for test targets? I think that does exactly what I want to do, but on a "runnable" target. Or is it just "nobody has wanted this badly enough to implement it yet"?
f
> My idea was to have an archive or pex_binary target as an execution_dependency, and then expose the archive file as an output file, and then depend on that in a python_source.
I am not sure I follow you here, I am afraid. I haven't used `adhoc_tool` yet, though, so can't really help much with this.
> Is there a reason `runtime_package_dependencies` only exists for test targets?
I was able to find one thread that seems to be related to the discussion we have, but it's fairly old, so I am not sure how relevant it is.
> Or is it just "nobody has wanted this badly enough to implement it yet"?
I think there is a concern that certain types of packages will start depending on other types of packages where it doesn't make sense which may lead to confusion or obscure bugs (e.g. a pex binary depending on another pex binary, but also include the sources of that binary as dependencies, that kind of thing). If you're interested, perhaps you could file an issue to advocate having this field (or in some other way be able to have packages at runtime) for the `python_source` target? Then other maintainers could chip in?
🙏 1
l
> I am not sure I follow you here, I am afraid. I haven't used `adhoc_tool` yet, though, so can't really help much with this.
No worries at all, thank you for your engagement on this!
> If you're interested, perhaps you could file an issue to advocate having this field (or in some other way be able to have packages at runtime) for the `python_source` target?
Yes, thank you! I've been wanting to exhaust other options first, but yeah, I think that's probably the right path forward.
l
@lemon-oxygen-2929 I ran into a similar issue and I think I've solved it here https://pantsbuild.slack.com/archives/C046T6T9U/p1724828935438999?thread_ts=1724752064.431699&cid=C046T6T9U
🙏 1
l
Interesting! I don't totally grok it, but it does seem to do what I want!
👍 1