finally giving in and asking for some plugin help....
# plugins
g
finally giving in and asking for some plugin help. hopefully this doesn't waste anyone's time. trying to create a plugin that extends the `publish` goal with a new target, `airflow_composer_dag`. essentially our workflow with airflow is:
• write a CLI tool in python
• containerize it (docker)
• push to GCR (google container registry) or GAR (google artifact registry)
• rsync our dag files/directories (gsutil/gcloud) to GCS
  ◦ those dags reference and utilize the container from the previous steps
• profit

so, first step was a macro that does the `python_source`, `pex_binary`, and `docker_image` with just one definition. but then i need to "publish" the dag definitions (rsync some python and directories). so, i think a plugin is the next logical step. started writing a plugin to extend `publish` and boy am i lost. got it to where i can add the custom target `airflow_composer_dag` to a `BUILD` and `pants publish src/py/thing:dag` doesn't complain, but the `publish` also doesn't do anything. never seems to call my `rule`. source is in the thread. 🧵
register.py
so, first followup question: is this a good candidate for a plugin? second followup is i wonder if there is already a target that can "copy" stuff around. i'm thinking something that can use fsspec, or similar, and then a `pants publish src/py/thing:dag` could be a `file()` target or something similar.
g
OK, starting with "not being called": I think you need both a PublishFieldSet and a PublishRequest UnionRule, and potentially even a Package setup, as all publishes happen after a package. I'm not sure if it's strictly required, but the package output feeds your rule so I assume so. This is what I have for rules:
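from pants.core.goals.publish import PublishFieldSet, PublishRequest
from pants.engine.rules import collect_rules
from pants.engine.unions import UnionRule


def rules():
    return [
        *collect_rules(),
        # Register the plugin's field set and request type with the publish goal.
        UnionRule(PublishFieldSet, PublishImageFieldSet),
        UnionRule(PublishRequest, PublishImageRequest),
    ]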
With the following definitions:
@dataclass(frozen=True)
class PublishImageRequest(PublishRequest):
    pass


@dataclass(frozen=True)
class PublishImageFieldSet(PublishFieldSet):
    publish_request_type = PublishImageRequest
    required_fields = (
        ImageRepository,
        ImageTag,
    )

    repository: ImageRepository
    tag: ImageTag

    def get_output_data(self) -> PublishOutputData:
        return PublishOutputData(
            {
                "publisher": "skopeo",
                **super().get_output_data(),
            }
        )
Similarly, for packaging:
def rules():
    return [
        *collect_rules(),
        UnionRule(PackageFieldSet, ImageFieldSet),
    ]
And
@dataclass(frozen=True)
class ImageFieldSet(PackageFieldSet):
    required_fields = (ImageRepository,)

    repository: ImageRepository
    tag: ImageTag

    output_path: OutputPathField

    digest: ImageDigest
All taken from my OCI plugin, which is at least working well enough for our production uses. https://github.com/tgolsson/pants-backends/tree/main/pants-plugins/oci/pants_backend_oci/goals
There's a lot of boilerplate to get started with goals (often of varying formats between goals), but once that is done you'll be much more free to define your rules however you see fit.
As for whether it's useful or not, I think you're onto something: this is a thingy-mover, not necessarily airflow-specific. I haven't seen such a plugin, and I generally do `package` followed by `bash` scripts. I've personally had uses for a GCS syncer, so if that's what this would be I'd also be interested in using it. 🙂
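To make the "thingy-mover" idea a bit more concrete, here is a rough, stdlib-only sketch of the one piece such a GCS syncer would need regardless of how it is wired into Pants: building the `gsutil rsync` command line. The function name `gcs_rsync_argv`, the bucket name, and the `dags` prefix are all made up for illustration; nothing here comes from an existing plugin.

```python
from typing import List


def gcs_rsync_argv(
    source_dir: str,
    bucket: str,
    prefix: str = "dags",  # hypothetical default; Composer buckets usually keep DAGs under dags/
    delete_extra: bool = False,
) -> List[str]:
    """Build a `gsutil rsync` command that mirrors source_dir into gs://bucket/prefix."""
    destination = f"gs://{bucket}/{prefix.strip('/')}"
    # -m runs operations in parallel, -r recurses into subdirectories.
    argv = ["gsutil", "-m", "rsync", "-r"]
    if delete_extra:
        # -d deletes remote files that no longer exist locally; off by default to be safe.
        argv.append("-d")
    argv += [source_dir, destination]
    return argv
```

Outside of Pants this could be run with `subprocess.run(gcs_rsync_argv("src/py/thing/dags", "my-composer-bucket"), check=True)`. Inside a plugin, the same argv would presumably be what the publish rule hands back as the process to execute, but that wiring is not shown here.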
I'm about to head to bed, but I'll happily give you some more pointers tomorrow if you have more questions, as will the other folks around here (who might be in your timezone too ;-)). Plugin development can be quite hard at the start because there's a lot to learn, but everyone is welcome to ask and learn 🙂
g
awesome. thanks @gorgeous-winter-99296. i definitely didn't start this as a "make a target that can just move stuff around" but that might be best, and then i can just use a macro to do everything i need. 🙂 will explore that for sure. i'll keep tinkering and see if i can get that rule to fire. maybe it's perma-cached or something...
g
Yeah. It might be helpful to add `--no-pantsd --no-local-cache` to avoid those pitfalls. Now off for real. :D
g
also, that OCI plugin seems awesome. love `skopeo` and stuff.