# general
p
question about how to inform pants of some dependencies
In the old Makefile, in one of the targets that ran before the tests, it looped over `contrib/runners/*` and ran `python setup.py develop --no-deps` (deps were already installed in another step). How do I do something similar? i.e., whenever running any of the tests under https://github.com/st2sandbox/st2/tree/pants/st2common/tests, I need to make sure that each of the packages under https://github.com/st2sandbox/st2/tree/pants/contrib/runners is installed.
I tried adding `dependencies=["contrib/runners/*"]` in st2common/tests/BUILD's `python_library()` declaration, but that resulted in:
```
DifferingFamiliesError: Expected AddressMaps to share the same parent directory 'contrib/runners/*', but received: 'contrib/runners/action_chain_runner/BUILD'
```
Same for `contrib/runners/*_runner/*_runner`. But depending on `contrib/runners` complains that there are no BUILD files in that dir (which there aren't).
e
So, is there a need to use a dist install (`python setup.py develop ...`) instead of just having the tests depend on the code directly?
p
Oh I want to get rid of that dist_utils cruft
So, we should pretend that setup.py and dist_utils.py don't exist 😄
e
Yeah - unless you're testing that the distribution infrastructure works (setup.py is well formed, etc.), there should be no need to use `python setup.py develop` in a Pants repo.
e
Ok, but are you planning on handing that over to Pants? We handle that for you.
You should be able to remove all setup.py from the repo.
p
I need to somehow tell pants that each of the directories in contrib/runners provides that `st2common.runners.runner` entrypoint, and then have the tests include all of the directories that have that entrypoint.
We're using stevedore for extensions, so the tests that depend on the extensions are not going to be immediately visible to pants as the dep is via entrypoints.
> You should be able to remove all setup.py from the repo.
I plan to, but not until pants is merged into the main repo.
Aha - ok. Rewinding and thinking about your original question then...
For one, we do not allow wildcard dependencies at all.
p
That would be why wildcards failed miserably 😄
e
Yeah. OK, so your minimal requirement, IIUC, is entrypoint metadata on the PYTHONPATH.
p
I think so
e
You get that today via `python setup.py develop`, but other means might be fine.
p
yes
does `python_distribution` inform anything with dep inference? Or some kind of target I can depend on?
e
No. It just uses dep inference. We have a strong bias towards treating in-repo code as just code, and not packaging it as a dist just to install it again for local use in the repo itself.
You need a bridge here at the very least.
p
bridge = pants plugin?
e
Yeah, likely. I'm still trying to come up with something clever short of that.
Without a custom goal you'll need to leverage this: https://www.pantsbuild.org/docs/python-test-goal#depending-on-packages And then add a conftest.py hook that does the package installing.
p
hmm. That would mean packages would be required even for unit tests... that could get expensive, right?
e
Packages are only created once; Pants is smart.
The expense comes in the tests that do the install via the conftest.py hook. Since we run each test file in isolation in a sandbox, each such test will run the `python setup.py develop` all over again. If that step is slow, that's a problem.
p
some of the runners have significant deps, so it could. How is `runtime_package_dependencies` a conftest.py hook?
e
It is not - they're two separate things.
So `runtime_package_dependencies` just gets you packages built and deposited at $CWD.
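For reference, the wiring on the test side might look like this BUILD sketch (the target name and address are assumptions, not the repo's real ones):

```python
# st2common/tests/BUILD -- sketch; addresses are made up.
python_tests(
    name="tests",
    runtime_package_dependencies=[
        "contrib/runners/action_chain_runner:action_chain_runner",
    ],
)
```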
p
oh
e
Now, if you have a test depend on conftest.py, then pytest magic finds that conftest.py file and runs any lifecycle magic functions you have there. There are several that deal with test session startup.
You hook one of those with code that does a subprocess.run ... and does the setup.py develop stuff. You then add the target of the setup.py develop stuff to sys.path in the conftest.py hook.
Make sense?
`pytest_sessionstart` is probably appropriate for this: https://docs.pytest.org/en/6.2.x/reference.html#bootstrapping-hooks
Forgive the links if you're already conftest.py / pytest fluent in this corner. And conftest.py: https://docs.pytest.org/en/6.2.x/writing_plugins.html#localplugin
p
So, how might it look going the plugin route? A
```python
stevedore(name="st2common.runners.runner", entry_points=["action-chain = action_chain_runner.action_chain_runner"])
```
target that defines the entrypoints (in each of the runners), and then add
```python
python_library(stevedore_dependencies=["st2common.runners.runner"])
```
which would trigger a rule that adds the python files as dependencies and somehow injects the entrypoint metadata.
(I am trying to eliminate a painful world of Makefile hacks, so if I can avoid adding new hacks that obfuscate the deps, that would be great.)
e
So, this is hard to answer well without understanding your code - which I don't yet. IIUC you don't just use stevedore extensions in a Pants repo - that presumably just works. You implement stevedore extensions in a Pants repo and then you want to have stevedore find those extensions via its entrypoint metadata plugin mechanism - is that right?
p
Yes. We both consume and provide the stevedore extensions in the same repo.
Theoretically, others could add their own extensions, but that doesn't matter for our tests, which only use the extensions defined in-repo.
e
I'm reading the stevedore docs to see if they have a manual registration method. That would be one path in tests. ExtensionManager(...).register(test_this).register(test_that_too)... etc.
It does not look like that is supported though. Apparently entry point metadata or bust.
p
Maybe not in the docs, but I just found one in the code: https://opendev.org/openstack/stevedore/src/branch/master/stevedore/extension.py#L140
And I guess that method is on a variety of objects, not just the ExtensionManager: https://docs.openstack.org/stevedore/ocata/managers.html
So, the pants plugin wouldn't have to inject actual setuptools entry_points info. Instead, it could build some kind of index file that can be ingested in a pytest fixture to populate the ExtensionManager.
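For instance (hypothetically), that index could be plain JSON mapping each namespace to its name-to-target entries; the fixture-side parsing might look like this sketch, where the file shape and function name are made up:

```python
import json


def parse_entry_point_index(text: str) -> list[tuple]:
    """Parse a hypothetical JSON index of {namespace: {name: "module:attr"}}
    into (namespace, name, module, attr) tuples for manual registration."""
    entries = []
    for namespace, eps in json.loads(text).items():
        for name, target in eps.items():
            module, _, attr = target.partition(":")
            entries.append((namespace, name, module, attr))
    return entries


# A pytest fixture could then import each module, wrap the plugin object in a
# stevedore Extension, and feed those to ExtensionManager.make_test_instance(...).
```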
e
Yup - that would work.
p
That feels better than having pytest install stuff in a fixture.
e
Lots of ways to go. Could be a codegen hook that takes setup.py (preferably setup.cfg, since it's declarative) and outputs `dist-name-dist-version.dist-info/entry_points.txt`.
Then the normal stevedore extension loading via importlib.metadata.entry_points() would work just like in production.
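Sketching just the rendering half of that codegen idea (the input mapping shape and function name are assumptions; the `[namespace]` / `name = spec` layout is the standard entry_points.txt format):

```python
def render_entry_points_txt(namespaces: dict) -> str:
    """Render setuptools-style entry_points.txt content from a
    {namespace: {name: "module:attr"}} mapping (shape is an assumption)."""
    lines = []
    for namespace, eps in namespaces.items():
        lines.append(f"[{namespace}]")  # one INI-style section per namespace
        for name, spec in eps.items():
            lines.append(f"{name} = {spec}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)
```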
p
output that where? Inside one of the various pex files used for pytest?
e
A codegen rule would output that as the output of a Process. You'd specify that output be captured and then you'd access it by its Digest. That Digest would get added in to downstream test sandboxes.
So, in other words, codegen just produces files from other files and the codegenned files get added to downstream rule Process sandboxes.
As normal files / code, just generated on the fly.
p
hmm. So it still needs to be a separate `Process`; I can't do it from within the plugin itself.
e
Any real work in a plugin needs to be a separate `Process`.
Again, it's super confusing that the rule language is Python. You can't really do anything in Python in the rule body except execute `Process`es, manipulate Digests, and look at user options or Target metadata.
p
Cool. I'll try my hand and come back with questions 🙂 Sounds like we could use a special linter (or custom flake8 rules?) for pants plugins to warn when they do things other than use Process.
e
It would probably be better to install one of these: https://www.python.org/dev/peps/pep-0578/. I imagine it slows things down, so if we had it, we'd want to flag it on just for testing plugins, not in prod.
For now, code review is the defense.
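As a rough illustration of what such a PEP 578 audit hook could flag (the event names come from CPython's audit-event table; the policy itself is made up):

```python
import sys

violations = []


def _flag_plugin_io(event, args):
    # Record filesystem / subprocess audit events that plugin rule code
    # should instead be delegating to a Pants Process (a made-up policy).
    if event in {"open", "subprocess.Popen"}:
        violations.append(event)


sys.addaudithook(_flag_plugin_io)  # hooks cannot be removed once added
open(__file__).close()  # raises the "open" audit event
```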
p
I'm reusing or copying and slightly modifying a lot of the Pex EntryPoint related code for this stevedore_extensions pants plugin.
I made a `stevedore_extension` target that defines the entry_point metadata (using a lot of logic lifted from pex stuff). Now I need to generate that entry_points.txt file, but to do that, I need all of the `stevedore_extension` targets in a BUILD file, not just the one that triggers the `GenerateSourcesRequest`. Is there a good idiom for generating sources for multiple targets in the same rule? I imagine I could do some kind of `Get()` to get all of those targets, but then I'm concerned that each of the `GenerateSourcesRequest`s would conflict because they generate the same file (though the contents should be the same). Is that the way to go?
```python
@rule(desc="Generate entry_points.txt from stevedore_extension target metadata", level=LogLevel.DEBUG)
async def generate_entry_points_txt_from_stevedore_extension(
    request: GenerateEntryPointsTxtFromStevedoreExtensionRequest,
) -> GeneratedSources:
    # similar to relocate_files, this isn't standard codegen.
    # It uses the metadata on targets as source instead of source files.

    sibling_targets = await Get(
        Targets, AddressSpecs([SiblingAddresses(request.protocol_target.address)]),
    )
    stevedore_targets = [
        tgt for tgt in sibling_targets if tgt.has_field(StevedoreEntryPointsField)
    ]
    resolved_entry_points = await MultiGet(
        Get(ResolvedStevedoreEntryPoints, ResolveStevedoreEntryPointsRequest(tgt.entry_points))
        for tgt in stevedore_targets
    )

    entry_points_txt_path = ""  # TODO: where does this go?
    entry_points_txt_contents = ""

    stevedore_extension: StevedoreExtension
    for i, stevedore_extension in enumerate(stevedore_targets):
        namespace: StevedoreNamespaceField = stevedore_extension[StevedoreNamespaceField]
        entry_points: StevedoreEntryPointsField = resolved_entry_points[i].val
        if not entry_points:
            continue

        entry_points_txt_contents += f"[{namespace.value}]\n"
        for entry_point in entry_points.value:
            # TODO: I need the resolved entry points here...
            entry_points_txt_contents += f"{entry_point.name} = {entry_point.value.spec}\n"
        entry_points_txt_contents += "\n"

    entry_points_txt_contents = entry_points_txt_contents.encode("utf-8")

    snapshot = await Get(
        Snapshot,
        Digest,
        CreateDigest([FileContent(entry_points_txt_path, entry_points_txt_contents)]),
    )
    return GeneratedSources(snapshot)
```
So, I didn't use `Process` here because I wasn't working with files, just target metadata... wdyt? Also, where do I put the `dist-name-dist-version.dist-info/entry_points.txt` so that it gets included in the pexes of sources that get built for pytest? Or does it just need to be alongside the code so that, by virtue of being on the PYTHONPATH, the dist-info gets picked up?
e
Yeah, target metadata is fair game in rules. I was just trying to underscore no reading files via open(), etc. with the Process thing. As to the latter, yes - the idea is to get the metadata on the PYTHONPATH. It would really serve you well to figure this out / get it working in a manual tmp dir sandbox you hand-construct first. Then you can write rule code with confidence that you have a working mechanism to aim for.
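That manual sandbox check might look something like this sketch (the dist name, version, and entry point are invented for illustration):

```python
import sys
import tempfile
from importlib.metadata import entry_points
from pathlib import Path

# Hand-construct the .dist-info layout in a temp dir, then confirm that
# stock entry-point discovery sees it once the dir is on sys.path.
tmp = Path(tempfile.mkdtemp())
dist_info = tmp / "st2_fake_runner-1.0.dist-info"
dist_info.mkdir()
(dist_info / "METADATA").write_text(
    "Metadata-Version: 2.1\nName: st2-fake-runner\nVersion: 1.0\n"
)
(dist_info / "entry_points.txt").write_text(
    "[st2common.runners.runner]\n"
    "action-chain = action_chain_runner.action_chain_runner\n"
)
sys.path.insert(0, str(tmp))

eps = entry_points()
try:  # Python 3.10+: select by group
    found = list(eps.select(group="st2common.runners.runner"))
except AttributeError:  # 3.8/3.9: entry_points() returns a plain dict
    found = list(eps.get("st2common.runners.runner", []))
```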
I'm out here presently and for the next 8 days; so someone else will need to step in if you have more questions.
p
ok. Thanks for all your help!