# general
r
Hey! We have a Python repository where Pants could help us quite a bit, but we are struggling to come up with the right usage given our (complex) repository structure. In abstract terms the repository looks like this:
src/
   plugins/foo/schema.py
   plugins/bar/schema.py
   plugin_registry.py
   webserver.py
test/
   plugins/foo/test_foo.py
   plugins/bar/test_bar.py
The challenge is that `plugin_registry.py` loads those plugins dynamically using `importlib`. Any usage of a plugin goes through the registry as a facade. In practice this means that there is no import of `foo/schema.py` in `test_foo.py`; we only import the registry and have it load the right plugin for us.
In some integration tests it gets even more complicated: the registry and plugins are loaded in another process and the tests only call a REST API, without any import of the registry or plugins in the test code. This usage does not play well with the Pants auto-discovery of Python import dependencies. We can think of two ways out of this, but both have downsides (see thread). Is there any option we have missed? I am curious if you have further ideas for us.
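For readers following along, here is a minimal sketch of the kind of dynamic loading described above. The function name `load_plugin` and the `package` parameter are assumptions for illustration; the real registry may look quite different.

```python
import importlib
from types import ModuleType


def load_plugin(name: str, package: str = "plugins") -> ModuleType:
    """Import `<package>.<name>.schema` at runtime (hypothetical registry helper)."""
    # The module path only exists as a string assembled at runtime, so the
    # source contains no `import plugins.foo.schema` statement that a static
    # import scanner could pick up.
    return importlib.import_module(f"{package}.{name}.schema")
```

A caller such as `test_foo.py` would then only import the registry and call something like `load_plugin("foo")`, which is why no dependency on the plugin is visible in the test file itself.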
Option A: List plugins as registry dependencies
This is what we have tried so far, in the BUILD file of the registry:
python_sources(
    sources=["!plugin_registry.py", "*.py"]
)

target(
    name="resource-plugins",
    description="List of all resources dynamically loaded by the registry",
    dependencies=[
        "src/plugins/foo",
        "src/plugins/bar",
    ],
)

python_sources(
    name="registry",
    sources=["plugin_registry.py"],
    dependencies=[":resource-plugins"],
)
• Advantage: This works from a Pants perspective, as dependencies are correctly detected, and it is rather easy to maintain.
• Disadvantage: Dependencies are very coarse-grained, leading to many unnecessary invalidations. For example, if I change `bar`, all tests for `foo` are executed as well, even though this is unnecessary.
Option B: Explicitly list plugins as dependencies where they are needed
This would be similar to the solution above, but instead of having the registry depend on the plugins, we would have each plugin depend on the registry. In addition, we would need an explicit dependency wherever we expect a plugin to be present. For example, for `test_foo.py` we would need to add `dependencies=["src/plugins/foo"]` in the associated BUILD file. The same dependency would also need to be added for the multiple binaries we ship. So far these places have just pulled in the registry.
• Advantage: Dependencies are more fine-grained, which should help fine-grained invalidation and thus test performance.
• Disadvantage: I expect pushback from the other developers here. Instead of just dropping a new plugin into a folder, we suddenly have to adjust multiple places to correctly declare a dependency on any new plugin. Any mistake we make here might mean we forget to run tests even though relevant plugins have changed.
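To make Option B concrete, the BUILD file next to `test_foo.py` might look roughly like this. It is only a sketch: the target name `tests` is an assumption, and the address `src/plugins/foo` is taken from the Option A snippet.

```python
# test/plugins/foo/BUILD (sketch; the target name "tests" is an assumption)
python_tests(
    name="tests",
    # test_foo.py only talks to the registry, so the plugin it needs
    # has to be declared by hand instead of being inferred from imports.
    dependencies=["src/plugins/foo"],
)
```

Every binary that expects `foo` to be present would need the same explicit entry, which is the maintenance cost mentioned under "Disadvantage".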
If you have any other idea, I am all ears. Thanks :)
h
How does `test_foo.py` tell the registry "I need `foo` but not `bar`"?
Is there some way to tell this statically? By naming alone even?
r
I have been thinking about this a bit. You are probably right that we need to express things statically (e.g., by importing a specific type or module for each plugin). That way devs just need to care about Python, and Pants will do the rest in the backend based on the inferred dependencies. I will give it a shot and then try to get back to you. Thanks for the nudge!
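A rough illustration of why a static import would help: the sketch below uses Python's `ast` module to list the imports that are visible without running the code. This is only an approximation of the idea; Pants uses its own parser, and `plugin_registry.load` is a hypothetical call, not the real registry API.

```python
from __future__ import annotations

import ast


def static_imports(source: str) -> set[str]:
    """Return the module names that appear in import statements of `source`."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module)
    return found


# Today: the test only imports the registry (hypothetical `load` call).
DYNAMIC_TEST = '''
import plugin_registry
schema = plugin_registry.load("foo")
'''

# Idea from the thread: additionally import the plugin module statically.
STATIC_TEST = '''
import plugin_registry
import plugins.foo.schema  # noqa: F401
schema = plugin_registry.load("foo")
'''

if __name__ == "__main__":
    print(sorted(static_imports(DYNAMIC_TEST)))
    print(sorted(static_imports(STATIC_TEST)))
```

Only the second variant exposes `plugins.foo.schema` to a static scan, so only there could a dependency on the `foo` plugin be inferred from the test file alone.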
h
I was wondering about either using a macro that can compute the dep statically (e.g., based on directory names), or if not, writing a plugin to customize the dep inference behavior.
But of course if you can simplify and have existing dep inference work, that might be best
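As a sketch of the macro idea mentioned above, assuming the test tree mirrors the source tree (`test/plugins/foo` next to `src/plugins/foo`): the helper name `plugin_dep_for` and the macro name `plugin_tests` are made up for illustration. The helper sticks to plain string operations because Pants macro files cannot contain import statements.

```python
def plugin_dep_for(test_dir, test_root="test", src_root="src"):
    """Map a test directory to the matching plugin address by naming alone.

    Example mapping: "test/plugins/foo" -> "src/plugins/foo".
    """
    prefix = test_root + "/"
    if not test_dir.startswith(prefix):
        raise ValueError("expected a path under " + repr(prefix) + ", got " + repr(test_dir))
    return src_root + "/" + test_dir[len(prefix):]


# Hypothetical macro wrapping python_tests. It would live in a prelude file
# listed under [GLOBAL].build_file_prelude_globs in pants.toml:
#
# def plugin_tests(test_dir, **kwargs):
#     python_tests(dependencies=[plugin_dep_for(test_dir)], **kwargs)
```

A plugin's test BUILD file would then call `plugin_tests("test/plugins/foo")` instead of `python_tests(...)`, so adding a new plugin would not require hand-written dependency lists.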