I'm trying to write a formatter that does some very simple a... (Pants #plugins)

bitter-ability-32190

02/23/2022, 5:39 PM

I'm trying to write a formatter that does some very simple addition (copyright header) to the file contents. With the following code, Pants always says my subsystem made changes (likely my output digest isn't right)

source_files = await Get(
    SourceFiles,
    SourceFilesRequest(field_set.source for field_set in request.field_sets),
)
source_files_snapshot = (
    source_files.snapshot
    if request.prior_formatter_result is None
    else request.prior_formatter_result
)
input_digest = source_files_snapshot.digest
digest_contents = await Get(DigestContents, Digest, input_digest)
output_digest = await Get(
    Digest,
    CreateDigest(
        FileContent(path=file_content.path, content=maybe_add_copyright(file_content.content))
        for file_content in digest_contents
    ),
)
return FmtResult(
    input=input_digest,
    output=output_digest,
    stdout="",
    stderr="",
    formatter_name=request.name,
)

hundreds-father-404

02/23/2022, 5:42 PM

maybe a difference in `\n`? Indeed, it decides whether anything changed by comparing input vs output. So you could materialize to memory w/ `DigestContents` and compare that way

fast-nail-55400

02/23/2022, 5:44 PM

You may want to use these debug helpers I wrote for myself a while ago to see what the difference is:

import difflib

# Print a unified diff of each file between the formatter's input and output digests.
def diff_fmt_result(rule_runner: RuleRunner, fmt_result: FmtResult) -> None:
    input_digest_contents = {fc.path: fc for fc in rule_runner.request(DigestContents, [fmt_result.input])}
    output_digest_contents = {fc.path: fc for fc in rule_runner.request(DigestContents, [fmt_result.output])}
    for path, input_fc in input_digest_contents.items():
        output_fc = output_digest_contents[path]
        input_content = input_fc.content.decode().splitlines()
        output_content = output_fc.content.decode().splitlines()
        unidiff = "\n".join(difflib.unified_diff(input_content, output_content, lineterm=""))
        print(f"DIFF for {path}:\n{unidiff}")

# More verbose variant: also prints the raw contents, entries, and file names of each digest.
def diff_fmt_result(rule_runner: RuleRunner, fmt_result: FmtResult) -> None:
    input_digest_contents = {
        fc.path: fc.content for fc in rule_runner.request(DigestContents, [fmt_result.input])
    }
    input_digest_entries = rule_runner.request(DigestEntries, [fmt_result.input])
    print(f"input_digest_contents = {input_digest_contents}")
    print(f"input entries = {input_digest_entries}")
    print(f"input files = {','.join(sorted(input_digest_contents.keys()))}")

    output_digest_contents = {
        fc.path: fc.content for fc in rule_runner.request(DigestContents, [fmt_result.output])
    }
    output_digest_entries = rule_runner.request(DigestEntries, [fmt_result.output])
    print(f"output_digest_contents = {output_digest_contents}")
    print(f"output entries = {output_digest_entries}")
    print(f"output files = {','.join(sorted(output_digest_contents.keys()))}")

    for path, input_fc in input_digest_contents.items():
        output_fc = output_digest_contents[path]
        input_content = input_fc.decode().splitlines()
        output_content = output_fc.decode().splitlines()
        unidiff = "\n".join(difflib.unified_diff(input_content, output_content, lineterm=""))
        print(f"DIFF for {path}:\n{unidiff}")

🙌 1

bitter-ability-32190

02/23/2022, 5:45 PM

logger.error(f"{[file_content.path for file_content in digest_contents if file_content.content != maybe_add_copyright(file_content.content)]}")

bitter-ability-32190

02/23/2022, 5:45 PM

It has empty lists each time

fast-nail-55400

02/23/2022, 5:46 PM

also what does your `maybe_add_copyright` function look like?

bitter-ability-32190

02/23/2022, 5:46 PM

def maybe_add_copyright(content: bytes) -> bytes:
    if not has_copyright(content):
        return COPYRIGHT_HEADER.encode() + content
    return content

bitter-ability-32190

02/23/2022, 5:46 PM

🙂

hundreds-father-404

02/23/2022, 5:47 PM

try printing the input digest and output digest. Check whether the hash and/or the size are different

hundreds-father-404

02/23/2022, 5:47 PM

I also wonder if is_executable is at play, like perms of the file

fast-nail-55400

02/23/2022, 5:48 PM

and the output from the `diff_fmt_result` that I pasted might be useful (although you need to add `QueryRule`s to your `RuleRunner` for the calls it makes).

bitter-ability-32190

02/23/2022, 5:55 PM

It's either a cache issue or exec perms as Eric said. Running with the input's contents has the same output

hundreds-father-404

02/23/2022, 5:56 PM

have you tried running on a single trivial file? that reduces the risk of exec perms being the issue

bitter-ability-32190

02/23/2022, 5:56 PM

Yup, always passes 🙂

bitter-ability-32190

02/23/2022, 5:56 PM

So likely a cache issue?

bitter-ability-32190

02/23/2022, 5:57 PM

At the risk of nuking my PEXes, which dir holds the rule cache?

fast-nail-55400

02/23/2022, 5:59 PM

is pantsd enabled? if so, just do `--no-pantsd` to avoid the rule memoization

hundreds-father-404

02/23/2022, 5:59 PM

run with `--no-local-cache --no-pantsd`

bitter-ability-32190

02/23/2022, 6:01 PM

ooooh no cache has the same behavior!

bitter-ability-32190

02/23/2022, 6:01 PM

spicy

🌶️ 1

bitter-ability-32190

02/23/2022, 6:09 PM

is_executable=os.stat(file_content.path).st_mode % 2 == 1

bitter-ability-32190

02/23/2022, 6:09 PM

That worked

fast-nail-55400

02/23/2022, 6:10 PM

except that it directly accesses the filesystem, outside the view of the engine's core rules

👍 1
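
A side note on the `st_mode % 2 == 1` check above, as a minimal sketch: that expression tests only the lowest permission bit (execute for "others"), so a file that is executable only by its owner would be reported as non-executable. The helper names below are hypothetical and use only the standard `stat` module; nothing here is Pants API.

```python
import stat


def is_executable_by_others(mode: int) -> bool:
    """Same test as `mode % 2 == 1`: only the 'others' execute bit."""
    return mode % 2 == 1


def is_executable_by_anyone(mode: int) -> bool:
    """True if any of the user, group, or other execute bits is set."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


# Example modes (regular files): owner-only rwx, rwxr-xr-x, and rw-r--r--.
OWNER_ONLY = 0o100700
EVERYONE = 0o100755
NOT_EXECUTABLE = 0o100644
```

The two checks agree for `EVERYONE` and `NOT_EXECUTABLE` but disagree for `OWNER_ONLY`. Either way, this reads the real filesystem rather than the digest, which is the concern raised above.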

fast-nail-55400

02/23/2022, 6:11 PM

maybe just copy `is_executable` over from the original `FileContent`

fast-nail-55400

02/23/2022, 6:12 PM

dataclasses.replace(file_content, content=maybe_add_copyright(file_content.content))

🙌 2

fast-nail-55400

02/23/2022, 6:12 PM

that will preserve all fields of `file_content` except `content`
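
To make the `dataclasses.replace` suggestion concrete, here is a minimal sketch that runs without Pants. `FakeFileContent` is a hypothetical stand-in that assumes `FileContent` is a frozen dataclass with `path`, `content`, and an `is_executable` field defaulting to `False`; the header text and `has_copyright` are also made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeFileContent:
    """Hypothetical stand-in for Pants' FileContent (assumed field layout)."""

    path: str
    content: bytes
    is_executable: bool = False


COPYRIGHT_HEADER = "# Copyright (placeholder header)\n"  # assumption for the demo


def has_copyright(content: bytes) -> bool:
    return content.startswith(COPYRIGHT_HEADER.encode())


def maybe_add_copyright(content: bytes) -> bytes:
    if not has_copyright(content):
        return COPYRIGHT_HEADER.encode() + content
    return content


original = FakeFileContent("run.sh", COPYRIGHT_HEADER.encode() + b"echo hi\n", is_executable=True)

# Rebuilding from path and content alone silently resets is_executable to its default.
rebuilt = FakeFileContent(path=original.path, content=maybe_add_copyright(original.content))

# replace() copies every field and overrides only the ones named.
replaced = replace(original, content=maybe_add_copyright(original.content))
```

Here the file already has the header, so the content is untouched, yet `rebuilt != original` because the executable flag was dropped, while `replaced == original`. That mirrors the symptom in this thread: unchanged bytes, but a different digest.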

bitter-ability-32190

02/23/2022, 6:14 PM

oh duh lol

bitter-ability-32190

02/23/2022, 6:17 PM

AWESOME!

❤️ 1

bitter-ability-32190

02/23/2022, 7:40 PM

Admittedly, this is a bit of a thorn though. Why does the exec flag matter in this context? (I assume it doesn't but does matter if a plugin is trying to exec some file in the chroot?)

hundreds-father-404

02/23/2022, 7:41 PM

If the input digest said something was executable, and your output digest is now saying the file is not executable, that matters. They're different things. If you didn't handle this the right way, it would be like Pants is running `chmod -x` when calling `Workspace.write_digest()`

✅ 1

fast-nail-55400

02/23/2022, 7:42 PM

recall that the Remote Execution API is content-addressed, which means anything that changes the hash of a protobuf changes the digest

👍 1

fast-nail-55400

02/23/2022, 7:43 PM

`FileContent` is eventually turned into a REAPI `FileNode`, which has `is_executable` as a field. different value => different digest
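
A toy illustration of "different value => different digest", assuming nothing about the real wire format: the record below is serialized as JSON and hashed with SHA-256. This is NOT the REAPI protobuf encoding, and `node_fingerprint` is a hypothetical helper. It only shows that in a content-addressed scheme, flipping one metadata field changes the fingerprint even when the file bytes are identical.

```python
import hashlib
import json


def node_fingerprint(name: str, content: bytes, is_executable: bool) -> str:
    """Hash a simplified file-node record (illustrative, not the REAPI encoding)."""
    record = {
        "name": name,
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "is_executable": is_executable,
    }
    # sort_keys gives a deterministic serialization, so equal records hash equally.
    serialized = json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(serialized).hexdigest()


same_bytes = b"echo hi\n"
executable = node_fingerprint("run.sh", same_bytes, is_executable=True)
not_executable = node_fingerprint("run.sh", same_bytes, is_executable=False)
```

`executable` and `not_executable` differ although `same_bytes` is shared, which is why comparing file contents alone (as in the `logger.error` check earlier) showed no changes.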

bitter-ability-32190

02/23/2022, 7:45 PM

Out of curiosity, why does the rest of the mode not matter?

fast-nail-55400

02/23/2022, 7:46 PM

because the proto only stores `is_executable` and none of the rest of the mode

fast-nail-55400

02/23/2022, 7:46 PM

https://github.com/pantsbuild/pants/blob/6037e699b881011f845b5f7ad109fa8eae39cc3b/[…]ote-apis/build/bazel/remote/execution/v2/remote_execution.proto

fast-nail-55400

02/23/2022, 7:50 PM

(technically Pants could store the full file mode in the `NodeProperties` proto, but Pants does not do that)
