I have a Python monorepo that uses pants We have tests regis Pants #general

I have a Python monorepo that uses pants. We have ...

dry-orange-3870

06/02/2023, 10:26 PM

I have a Python monorepo that uses pants. We have tests registered with

python_tests

that run locally. We're planning on writing tests that run on a remote host. This remote host environment doesn't directly support

pex

files. We don't have the ability to use the remote execution currently built into pants as that requires us to install tools that are incompatible with the remote host (it's a long story). Our current plan is to extend the test task with a custom target in an internal plugin that looks something like this: • For each resolve ◦ Figure out which tests need to run based on Pants cache misses ◦ Figure out if there's a remote host running that already has the external python libraries we need. If not, start a remote host and install the libraries we need. ◦ Ship a

.pex

file to the remote host ◦ Unzip the

pex

and add the source files to the path ◦ From a

@rule

run

pytest

on the remote host and return a Pants compatible

TestResult

How I should go about doing this? We've made a pretty good dent in this with our current approach. But today, we tried using

register_plugin_field

as well as extending

PythonTestTarget

and in both cases we ran into issues where our tests ran twice--once for the built-in pytest runners and once for our remote pytest runner (which isn't what we want). Any tips on what we're potentially doing wrong here would also be appreciated

happy-kitchen-89482

06/03/2023, 5:39 AM

So that

.pex

file would only contain sources, no third-party deps?

happy-kitchen-89482

06/03/2023, 5:40 AM

The third-party deps would be provided by the runtime on the remote host?

happy-kitchen-89482

06/03/2023, 5:40 AM

In fact, it wouldn't even need to be a

.pex

file, it sounds like, just any old .zip file would do, but

.pex

happens to be convenient?

dry-orange-3870

06/03/2023, 9:28 AM

Yep, that's accurate

dry-orange-3870

06/03/2023, 9:29 AM

We could just use a

.zip

dry-orange-3870

06/03/2023, 9:31 AM

We've made a pretty good dent in this with our current approach. But today, we tried using
register_plugin_field
as well as extending
PythonTestTarget
and in both cases we ran into issues where our tests ran twice--once for the built-in pytest runners and once for our remote pytest runner (which isn't what we want). Any tips on what we're potentially doing wrong here would also be appreciated

Any input on this part? I'll try to clean things up into a tidy code snippet on Monday if that helps.

dry-orange-3870

06/03/2023, 1:37 PM

Copy code

from dataclasses import dataclass
from typing import Any
from pants.backend.python.goals.pytest_runner import PyTestRequest
from pants.backend.python.target_types import PythonTestTarget, PythonTestsGeneratorTarget, PythonTestsGeneratingSourcesField, PythonTestsOverrideField, PythonFilesGeneratorSettingsRequest, SkipPythonTestsField
from pants.backend.python.subsystems.pytest import PythonTestFieldSet, PyTest
from pants.backend.python.subsystems.setup import PythonSetup
from pants.backend.python.subsystems.python_tool_base import PythonToolBase
from pants.backend.python.target_types import PythonTestSourceField, PythonTestsDependenciesField, _PYTHON_TEST_MOVED_FIELDS
from pants.core.goals.test import Partitions, TestFieldSet, TestRequest, TestResult, TestSubsystem, ShowOutput, TestDebugAdapterRequest, TestDebugRequest
from pants.engine.fs import EMPTY_FILE_DIGEST
from pants.engine.rules import rule, collect_rules
from pants.engine.target import (
    COMMON_TARGET_FIELDS,
    Dependencies,
    BoolField,
    IntField,
    SingleSourceField,
    Target,
    TargetFilesGenerator
)
from pants.engine.unions import UnionRule

from pants.option.option_types import SkipOption
from pants.option.subsystem import Subsystem

# adding one field to differentiate our target from the python_test and python_tests target
class RunOnRemoteField(BoolField):
     alias = "run_on_remote"
     help = "Whether to run the test on remote."
     default = False
     required = True

# extending target instead of PythonTestTarget as a step to try and elminate issues
class RemotePythonTestTarget(Target):
    alias = "remote_python_test"
    help = "Tests run on remote."
    core_fields = (
        *COMMON_TARGET_FIELDS,
        *_PYTHON_TEST_MOVED_FIELDS,
        PythonTestsDependenciesField,
        PythonTestSourceField,
        RunOnRemoteField
    )

class RemotePythonTestsGeneratorTarget(TargetFilesGenerator):
    alias = "remote_python_tests"
    core_fields = (
        *COMMON_TARGET_FIELDS,
        PythonTestsGeneratingSourcesField,
        PythonTestsOverrideField,
        RunOnRemoteField
    )
    generated_target_cls = RemotePythonTestTarget
    copied_fields = (*COMMON_TARGET_FIELDS,RunOnRemoteField,)
    moved_fields = _PYTHON_TEST_MOVED_FIELDS
    settings_request_cls = PythonFilesGeneratorSettingsRequest
    help = "Generate a `remote_python_test` target for each file in the `sources` field."

# Our field set is the same as PythonTestFieldSet, but with one additional field.
@dataclass(frozen=True)
class RemotePythonTestFieldSet(PythonTestFieldSet):
    required_fields = (PythonTestSourceField, RunOnRemoteField)
    run_on_remote: RunOnRemoteField

    @classmethod
    def opt_out(cls, tgt: Target) -> bool:
        return tgt.get(SkipPythonTestsField).value or not tgt.get(RunOnRemoteField).value

# We're not going to be running pytest locally, so we extend the default Subsystem instead of the Pytest subsystem (although we tried that and it too didn't make a difference)
class RemotePythonTestSubsystem(Subsystem):
    name = "Remote Python Tests"
    options_scope = "example-test"
    help = "Tests run on remote."
    skip = SkipOption("skip_remote_tests")


# Define our test request
@dataclass(frozen=True)
class RemotePythonTestRequest(TestRequest):
    field_set_type = RemotePythonTestFieldSet
    tool_subsystem = RemotePythonTestSubsystem

# somehow, this gets called twice
# It gets called once with our valid RemotePythonTestFieldSet
# but it also gets called a second time with a PythonTestFieldSet entry. This shouldn't happen because RemotePythonTestFieldSet doesn't
# inherit from PythonTestFieldSet
@rule
async def run_remote_tests(
    batch: RemotePythonTestRequest.Batch[RemotePythonTestFieldSet, Any],
) -> TestResult:

    print(f"batch is: {batch}")

    return TestResult(
        exit_code=None,
        stdout="",
        stderr="",
        stdout_digest=EMPTY_FILE_DIGEST,
        stderr_digest=EMPTY_FILE_DIGEST,
        addresses=tuple(field_set.address for field_set in batch.elements),
        output_setting=ShowOutput.ALL,
        result_metadata=None,
        partition_description=batch.partition_metadata.description,
    )


def rules():
    return [
        # Originally, we tried just re-using the existing targets by adding a field, but abandoned that approach
        # when we saw run_remote_tests was getting called twice
        # PythonTestTarget.register_plugin_field(RunOnRemoteField),
        # PythonTestsGeneratorTarget.register_plugin_field(RunOnRemoteField),
        # UnionRule(TestFieldSet, RemotePythonTestFieldSet),
        *RemotePythonTestRequest.rules(),
        *collect_rules(),
    ]

dry-orange-3870

06/03/2023, 1:39 PM

That's

stuff.py

where we're defining our rules and targets.

register.py

looks like this:

Copy code

from remote_cluster_plugin.remote_integration_test import stuff


def rules():
    return [ *stuff.rules()]


def target_types():
    return [stuff.RemotePythonTestTarget, stuff.RemotePythonTestsGeneratorTarget]

dry-orange-3870

06/03/2023, 1:41 PM

We have one project with a

remote_python_tests

target in its

BUILD

. It has one test file in

integration_tests

, but somehow

run_remote_tests

is being called twice for that target. In one case, the batch has the aforementioned target as the sole element as type

RemotePythonTestFieldSet

. The second time the rule is called, it's the same address, but it's of type

PythonTestFieldSet

which shouldn't be possible

dry-orange-3870

06/03/2023, 1:41 PM

FWIW, we're using

2.16.0rc4

. Any ideas on what's causing this bug?

enough-analyst-54434

06/03/2023, 2:25 PM

I think you can step away from all the details of Pants and just focus on the word set here. The fundamental problem is the builtin rules operate on a certain set. Your new stuff creates a superset. Unless you can come up with a disjoint set you're doomed without bigger surgery to how things work.

enough-analyst-54434

06/03/2023, 2:26 PM

That's my guess anyhow.

enough-analyst-54434

06/03/2023, 2:30 PM

Disjoint or subset. Basically if you subtract one field the builtin rules need, then they shouldn't run, just yours will. The problem may be that you really do need all those fields. Perhaps take a simple one and duplicate it with a new name. That would be enough I'd think.

dry-orange-3870

06/03/2023, 6:18 PM

That makes sense, I'll give that a shot as a work around. Out of curiosity though, why is my

run_remote_tests

rule being called with elements of type

PythonTestFieldSet

when it clearly requires

RemotePythonTestFieldSet

? Is the type hint ignored? (I seem to recall seeing something about that in the docs) I assumed that the type

RemotePythonTestRequest.Batch

would be sufficient to avoid this

enough-analyst-54434

06/03/2023, 7:00 PM

That I have no clue about. I haven't been in the details here in a while.

happy-kitchen-89482

06/03/2023, 10:09 PM

Hmm that is strange, even if this were purely duck-typed, that

RunOnRemoteField

is not available on

PythonTestFieldSet

, so it shouldn't trigger

happy-kitchen-89482

06/03/2023, 10:10 PM

Do you know how to run Pants from sources for debugging?

dry-orange-3870

06/05/2023, 11:16 PM

Do you know how to run Pants from sources for debugging?

No, but I don't think that will be necessary.

PythonTestFieldSet

has one required field

PythonTestSourceField

(or something like that). We made our own

PythonSourceField

class which is just like

PythonTestSourceField

, but with a different name. Then we made our own sources target which uses our source field. This allows the

pants test

goal to still run our

@rule

as part of testing, but prevents the standard pytest rule from picking it up

dry-orange-3870

06/05/2023, 11:17 PM

Hmm that is strange, even if this were purely duck-typed, that
RunOnRemoteField
is not available on
PythonTestFieldSet
, so it shouldn't trigger

Yeah, I agree. I think this is a pants bug. We've worked around it for now. Now that I think I understand things better, this is what we should have done initially, but 1. that wasn't immediately clear 2. this bug made us think we were simply doing something wrong

happy-kitchen-89482

06/06/2023, 12:33 PM

Still, we'd like to fix the bug, if it is one!

2 Views

Open in Slack

Previous Next