# general
p
question about how to inform pants of some dependencies
In the old Makefile, in one of the targets that ran before the tests, it looped over `contrib/runners/*` and ran `python setup.py develop --no-deps` (deps were already installed in another step). How do I do something similar? i.e., whenever running any of the tests under https://github.com/st2sandbox/st2/tree/pants/st2common/tests, I need to make sure that each of the packages under https://github.com/st2sandbox/st2/tree/pants/contrib/runners is installed.
I tried adding `dependencies=["contrib/runners/*"]` in st2common/tests/BUILD's `python_library()` declaration, but that resulted in:
```
DifferingFamiliesError: Expected AddressMaps to share the same parent directory 'contrib/runners/*', but received: 'contrib/runners/action_chain_runner/BUILD'
```
Same for `contrib/runners/*_runner/*_runner`. But depending on `contrib/runners` complains that there are no BUILD files in that dir (which there aren't).
e
So, is there a need to use a dist install (`python setup.py develop ...`) instead of just having the tests depend on the code directly?
p
Oh I want to get rid of that dist_utils cruft
So, we should pretend that setup.py and dist_utils.py don't exist 😄
e
Yeah - unless you're testing that the distribution infrastructure works (setup.py is well formed, etc.), there should be no need to use `python setup.py develop` in a Pants repo.
e
Ok, but are you planning on handing that over to Pants? We handle that for you.
You should be able to remove all setup.py from the repo.
p
I need to somehow tell pants that each of the directories in contrib/runners provides that `st2common.runners.runner` entrypoint, and then have the tests include all of the directories that have that entrypoint.
We're using stevedore for extensions, so the tests that depend on the extensions are not going to be immediately visible to pants as the dep is via entrypoints.
> You should be able to remove all setup.py from the repo.
I plan to, but not until pants is merged into the main repo.
Aha - ok. Rewinding and thinking about your original question then...
For one, we do not allow wildcard dependencies at all.
p
That would be why wildcards failed miserably 😄
e
Yeah. OK, so your minimal requirement, IIUC, is entrypoint metadata on the PYTHONPATH.
p
I think so
e
You get that today via `python setup.py develop`, but other means might be fine.
p
yes
does `python_distribution` inform anything with dep inference? Or some kind of target I can depend on?
e
No. It just uses dep inference. We have a strong bias towards treating in-repo code as just code, and not packaging it as a dist just to install it again for local use in the repo itself.
You need a bridge here at the very least.
p
bridge = pants plugin?
e
Yeah, likely. I'm still trying to come up with something clever short of that.
Without a custom goal you'll need to leverage this: https://www.pantsbuild.org/docs/python-test-goal#depending-on-packages And then add a conftest.py hook that does the package installing.
p
hmm. That would mean packages would be required even for unit tests... that could get expensive, right?
e
Packages are only created once; Pants is smart.
The expense comes in the tests that do the install via the conftest.py hook. Since we run each test file in isolation in a sandbox, each such test will run the `python setup.py develop` all over again. If that step is slow, that's a problem.
p
some of the runners have significant deps, so it could. How is `runtime_package_dependencies` a conftest.py hook?
e
It is not - they're two separate things.
So `runtime_package_dependencies` just gets you packages built and deposited at $CWD.
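For reference, the wiring on the test side might look like this BUILD sketch (the target name and address are assumptions, not the repo's real ones):

```python
# st2common/tests/BUILD -- sketch; addresses are made up.
python_tests(
    name="tests",
    runtime_package_dependencies=[
        "contrib/runners/action_chain_runner:action_chain_runner",
    ],
)
```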
p
oh
e
Now, if you have a test depend on conftest.py, then pytest magic finds that conftest.py file and runs any lifecycle magic functions you have there. There are several that deal with test session startup.
You hook one of those with code that does a subprocess.run ... and does the setup.py develop stuff. You then add the target of the setup.py develop stuff to sys.path in the conftest.py hook.
Make sense?
`pytest_sessionstart` is probably appropriate for this: https://docs.pytest.org/en/6.2.x/reference.html#bootstrapping-hooks
Forgive the links if you're already conftest.py / pytest fluent in this corner. And conftest.py: https://docs.pytest.org/en/6.2.x/writing_plugins.html#localplugin
p
So, how might it look going the plugin route? A
```python
stevedore(name="st2common.runners.runner", entry_points=["action-chain = action_chain_runner.action_chain_runner"])
```
target that defines the entrypoints (in each of the runners), and then add
```python
python_library(stevedore_dependencies=["st2common.runners.runner"])
```
which would trigger a rule that adds the python files as dependencies and somehow injects the entrypoint metadata.
(I am trying to eliminate a painful world of Makefile hacks, so if I can avoid adding new hacks that obfuscate the deps, that would be great.)
e
So, this is hard to answer well without understanding your code - which I don't yet. IIUC you don't just use stevedore extensions in a Pants repo - that presumably just works. You implement stevedore extensions in a Pants repo and then you want to have stevedore find those extensions via its entrypoint metadata plugin mechanism - is that right?
p
Yes. We both consume and provide the stevedore extensions in the same repo.
Theoretically, others could add their own extensions, but that doesn't matter for our tests, which only use the extensions defined in-repo.
e
I'm reading the stevedore docs to see if they have a manual registration method. That would be one path in tests. ExtensionManager(...).register(test_this).register(test_that_too)... etc.
It does not look like that is supported though. Apparently entry point metadata or bust.
p
Maybe not in the docs, but I just found one in the code: https://opendev.org/openstack/stevedore/src/branch/master/stevedore/extension.py#L140
And I guess that method is on a variety of objects, not just the ExtensionManager: https://docs.openstack.org/stevedore/ocata/managers.html
So, the pants plugin wouldn't have to inject actual setuptools entry_points info. Instead, it could build some kind of index file that can be ingested in a pytest fixture to populate the ExtensionManager.
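For instance (hypothetically), that index could be plain JSON mapping each namespace to its name-to-target entries; the fixture-side parsing might look like this sketch, where the file shape and function name are made up:

```python
import json


def parse_entry_point_index(text: str) -> list[tuple]:
    """Parse a hypothetical JSON index of {namespace: {name: "module:attr"}}
    into (namespace, name, module, attr) tuples for manual registration."""
    entries = []
    for namespace, eps in json.loads(text).items():
        for name, target in eps.items():
            module, _, attr = target.partition(":")
            entries.append((namespace, name, module, attr))
    return entries


# A pytest fixture could then import each module, wrap the plugin object in a
# stevedore Extension, and feed those to ExtensionManager.make_test_instance(...).
```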
e
Yup - that would work.
p
That feels better than having pytest install stuff in a fixture.
e
Lots of ways to go. Could be a codegen hook that takes setup.py (preferably setup.cfg, since it's declarative) and outputs `dist-name-dist-version.dist-info/entry_points.txt`.
Then the normal stevedore extension loading via importlib.metadata.entry_points() would work just like in production.
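Sketching just the rendering half of that codegen idea (the input mapping shape and function name are assumptions; the `[namespace]` / `name = spec` layout is the standard entry_points.txt format):

```python
def render_entry_points_txt(namespaces: dict) -> str:
    """Render setuptools-style entry_points.txt content from a
    {namespace: {name: "module:attr"}} mapping (shape is an assumption)."""
    lines = []
    for namespace, eps in namespaces.items():
        lines.append(f"[{namespace}]")  # one INI-style section per namespace
        for name, spec in eps.items():
            lines.append(f"{name} = {spec}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)
```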
p
output that where? Inside one of the various pex files used for pytest?
e
A codegen rule would output that as the output of a Process. You'd specify that output be captured and then you'd access it by its Digest. That Digest would get added in to downstream test sandboxes.
So, in other words, codegen just produces files from other files and the codegenned files get added to downstream rule Process sandboxes.
As normal files / code, just generated on the fly.
p
hmm. So it still needs to be a separate `Process`; I can't do it from within the plugin itself.
e
Any real work in a plugin needs to be a separate `Process`.
Again, it's super confusing that the rule language is Python. You can't really do anything in Python in the rule body except execute `Process`es, manipulate Digests, and look at user options or Target metadata.
p
Cool. I'll try my hand and come back with questions 🙂 Sounds like we could use a special linter (or custom flake8 rules?) for pants plugins to warn when they do things other than use Process.
e
It would probably be better to install one of these: https://www.python.org/dev/peps/pep-0578/. I imagine it slows things down, so if we had it, we'd want to flag it on just for testing plugins, not in prod.
For now, code review is the defense.
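As a rough illustration of what such a PEP 578 audit hook could flag (the event names come from CPython's audit-event table; the policy itself is made up):

```python
import sys

violations = []


def _flag_plugin_io(event, args):
    # Record filesystem / subprocess audit events that plugin rule code
    # should instead be delegating to a Pants Process (a made-up policy).
    if event in {"open", "subprocess.Popen"}:
        violations.append(event)


sys.addaudithook(_flag_plugin_io)  # hooks cannot be removed once added
open(__file__).close()  # raises the "open" audit event
```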
p
I'm reusing or copying and slightly modifying a lot of the Pex EntryPoint related code for this stevedore_extensions pants plugin.
I made a `stevedore_extension` target that defines the entry_point metadata (using a lot of logic lifted from pex stuff). Now I need to generate that entry_points.txt file, but to do that, I need all of the `stevedore_extension` targets in a BUILD file, not just the one that triggers the `GenerateSourcesRequest`. Is there a good idiom for generating sources for multiple targets in the same rule? I imagine I could do some kind of `Get()` to get all of those targets, but then I'm concerned that each of the `GenerateSourcesRequest`s would conflict because they generate the same file (though the contents should be the same). Is that the way to go?
```python
@rule(desc="Generate entry_points.txt from stevedore_extension target metadata", level=LogLevel.DEBUG)
async def generate_entry_points_txt_from_stevedore_extension(
    request: GenerateEntryPointsTxtFromStevedoreExtensionRequest,
) -> GeneratedSources:
    # similar to relocate_files, this isn't standard codegen.
    # It uses the metadata on targets as source instead of source files.

    sibling_targets = await Get(
        Targets, AddressSpecs([SiblingAddresses(request.protocol_target.address)]),
    )
    stevedore_targets = [
        tgt for tgt in sibling_targets if tgt.has_field(StevedoreEntryPointsField)
    ]
    resolved_entry_points = await MultiGet(
        Get(ResolvedStevedoreEntryPoints, ResolveStevedoreEntryPointsRequest(tgt.entry_points))
        for tgt in stevedore_targets
    )

    entry_points_txt_path = ""  # TODO: where does this go?
    entry_points_txt_contents = ""

    stevedore_extension: StevedoreExtension
    for i, stevedore_extension in enumerate(stevedore_targets):
        namespace: StevedoreNamespaceField = stevedore_extension[StevedoreNamespaceField]
        entry_points: StevedoreEntryPointsField = resolved_entry_points[i].val
        if not entry_points:
            continue

        entry_points_txt_contents += f"[{namespace.value}]\n"
        for entry_point in entry_points.value:
            # TODO: I need the resolved entry points here...
            entry_points_txt_contents += f"{entry_point.name} = {entry_point.value.spec}\n"
        entry_points_txt_contents += "\n"

    entry_points_txt_contents = entry_points_txt_contents.encode("utf-8")

    snapshot = await Get(
        Snapshot,
        Digest,
        CreateDigest([FileContent(entry_points_txt_path, entry_points_txt_contents)]),
    )
    return GeneratedSources(snapshot)
```
So, I didn't use `Process` here because I wasn't working with files, just target metadata... wdyt? Also, where do I put the `dist-name-dist-version.dist-info/entry_points.txt` so that it gets included in the pexes of sources that get built for pytest? Or does it just need to be alongside the code so that, by virtue of being on the PYTHONPATH, the dist-info gets picked up?
e
Yeah, target metadata is fair game in rules. I was just trying to underscore no reading files via open(), etc. with the Process thing. As to the latter, yes - the idea is to get the metadata on the PYTHONPATH. It would really serve you well to figure this out / get it working in a manual tmp dir sandbox you hand-construct first. Then you can write rule code with confidence that you have a working mechanism to aim for.
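That manual sandbox check might look something like this sketch (the dist name, version, and entry point are invented for illustration):

```python
import sys
import tempfile
from importlib.metadata import entry_points
from pathlib import Path

# Hand-construct the .dist-info layout in a temp dir, then confirm that
# stock entry-point discovery sees it once the dir is on sys.path.
tmp = Path(tempfile.mkdtemp())
dist_info = tmp / "st2_fake_runner-1.0.dist-info"
dist_info.mkdir()
(dist_info / "METADATA").write_text(
    "Metadata-Version: 2.1\nName: st2-fake-runner\nVersion: 1.0\n"
)
(dist_info / "entry_points.txt").write_text(
    "[st2common.runners.runner]\n"
    "action-chain = action_chain_runner.action_chain_runner\n"
)
sys.path.insert(0, str(tmp))

eps = entry_points()
try:  # Python 3.10+: select by group
    found = list(eps.select(group="st2common.runners.runner"))
except AttributeError:  # 3.8/3.9: entry_points() returns a plain dict
    found = list(eps.get("st2common.runners.runner", []))
```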
I'm out here presently and for the next 8 days; so someone else will need to step in if you have more questions.
p
ok. Thanks for all your help!