I think I broke `lint` for formatters :thread:
# development
b
I think I broke `lint` for formatters 🧵
Still stewing on it, but `autoflake` is our first formatter, and if it makes changes, you'll see it "fail" in `lint` after converting to a `LintResult`. I'm guessing because the inputs are the same (`None` snapshot) the rule is getting memoized (whereas previously the snapshot would differ since `fmt` changed the file). ...I think. For whatever reason I'm feeling a bit fuzzy on the memoization today
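A toy sketch of the memoization point above (this is not the real Pants engine; `LintRequest` and its fields are hypothetical stand-ins). If the memo key is built only from the request's values, and the snapshot field is `None` both times, the second run never actually executes:

```python
# Toy model of rule memoization (NOT the real Pants engine):
# results are keyed on the request's values, so two requests with
# identical fields -- e.g. both carrying snapshot=None -- hit the cache.
from dataclasses import dataclass
from typing import Optional, Tuple

_memo: dict = {}
runs = []  # records actual executions

@dataclass(frozen=True)  # frozen -> hashable -> usable as a memo key
class LintRequest:
    paths: Tuple[str, ...]
    snapshot: Optional[str]  # None means "no snapshot captured"

def run_autoflake(request: LintRequest) -> str:
    if request in _memo:
        return _memo[request]
    runs.append(request)
    result = f"checked {request.paths}"
    _memo[request] = result
    return result

# The fmt pass and the lint pass build identical requests,
# so the rule body only runs once.
r1 = run_autoflake(LintRequest(paths=("app.py",), snapshot=None))
r2 = run_autoflake(LintRequest(paths=("app.py",), snapshot=None))
```

With a real snapshot in the request, the post-`fmt` request would differ and the memo lookup would miss.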
h
but `autoflake` is our first formatter, and if it makes changes, you'll see it "fail" in `lint` after converting to a `LintResult`.
we want that, right?
b
I should say I'm running `fmt lint ::`
h
Oh. So I think the issue is the inputs to the Autoflake rule aren't changing, maybe? After `fmt` does its thing, we would expect the inputs to have changed from when Autoflake ran for `fmt`; thus, it reruns for `lint`
b
Bingo. I think that's because the type of the input went from a `FieldSet` (which I'm guessing gets invalidated? Due to maybe the source field?) to a filepath
That's the fuzzy part
h
Hm, I'm not sure why the `FieldSet` would be any different; all we have is the raw source glob in the `FieldSet`
Maybe try logging the inputs to the Autoflake rule before and after? You'll want to log inside `lint.py`, I think, not the autoflake rule. Use `--only=autoflake`
b
The `FieldSet` has to be the magic, otherwise the lint rules would be memoized between runs when you touch a file, right?
h
I might need backup here. @witty-crayon-22786 why did that work before?
👍 1
b
As for the solution, I think we have to bring back the snapshot as an unconditionally already-populated field.
That, or maybe mark `SubPartition` as uncacheable, making the rule run, but its `Get`s are rightfully cached
w
the `FieldSet` doesn’t “contain” the file: instead, consuming it and requesting sources (capturing a `Snapshot` of the sources) depends on the files on disk. when they change, the dependees are invalidated
there isn’t any magic involved in `lint`/`fmt` around caching, except that the Result types are marked uncacheable for the purposes of their log messages
but yea. i don’t fully understand the issue, but it doesn’t sound like you “broke” the actual run: you just might not see the logged side-effect of `autoflake` running twice (in `fmt` and then in `lint`), because it only actually runs once (in `fmt`)
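The `Snapshot`-capture point above can be illustrated with a simplified stand-in: a content digest of the sources ties a cached result to what is actually on disk, so when `fmt` rewrites the file, the digest changes and anything keyed on the old digest is stale. (This is a sketch, not Pants' actual fingerprinting.)

```python
# Sketch of invalidation-by-content: a simplified stand-in for how
# capturing a Snapshot ties cached results to on-disk file contents.
import hashlib
import os
import tempfile

def snapshot_digest(path: str) -> str:
    """Content hash standing in for a Snapshot's fingerprint."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "app.py")
    with open(src, "w") as f:
        f.write("import os\n")      # an unused import, pre-fmt
    before = snapshot_digest(src)

    with open(src, "w") as f:       # "fmt" rewrites the file
        f.write("")
    after = snapshot_digest(src)

# different digests -> any node keyed on `before` is stale,
# so its dependees must re-run
```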
b
But since it changed the sources, it should ideally re-run in `lint` and succeed
w
how does it get the new sources in `lint`?
is it given a snapshot of the new sources as part of the request?
b
Nope, just file paths. And I think that's the crux of the issue
w
shouldn’t be. using filepaths via `PathGlobs` will trigger invalidation
between `fmt` and `lint`, files will have changed on disk, and nodes should be invalidated
if you’re not seeing that, it would be a bug.
b
Yeah, there are a lot of moving parts. Try it out. I think just adding an unused import and then running `fmt lint` on that file (and optionally using `--only`). `lint` should succeed after rerunning autoflake
w
yea, you’re right. i see the invalidation happening though:
```
+ autoflake made changes.
13:48:06.10 [INFO] Filesystem changed during run: retrying `Lint` in 500ms...
✕ autoflake failed.
```
so: what is being written to disk?
…the right thing is being written to disk. interesting.
@bitter-ability-32190: do you want to revert until you’re back from vacation?
b
I think making the `SubPartition` not cacheable might do it?
w
inputs can’t really be marked uncacheable (or, they can, but only if they’re marked `SideEffecting`)
i’m not going to have time to investigate this in the next few days, so we should probably revert until someone can
h
ditto that I won't have time to investigate, unfortunately
b
Oh, I see. Because the file changes, the `PathGlobs` invalidate, and so the subgraph's `Get`s should be invalidated before the memoization happens for the `fmt` `SubPartition` rule
w
yea
b
Hmm
w
and so the subgraph gets should be invalidated before the memoization happens for the fmt SubPartition rule
i mean, between `fmt` and `lint`, the writing of the files to disk will invalidate a bunch of nodes out of the graph, such that they need to re-run
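The invalidation flow being described can be modeled with a toy dependency graph (an illustration only; the node names are made up, and the real engine's graph is far richer): dirtying the file node transitively dirties everything that depends on it.

```python
# Toy dependency graph: when a file node is invalidated, everything
# that (transitively) depends on it is marked dirty and must re-run.
# Node names here are hypothetical, for illustration.
deps = {
    "lint_result": ["autoflake_run"],
    "autoflake_run": ["sources_snapshot"],
    "sources_snapshot": ["file:app.py"],
}

def invalidate(node: str, graph: dict) -> set:
    """Return the set of nodes dirtied by invalidating `node`."""
    dirty = {node}
    changed = True
    while changed:  # propagate until a fixed point is reached
        changed = False
        for dependee, dependencies in graph.items():
            if dependee not in dirty and any(d in dirty for d in dependencies):
                dirty.add(dependee)
                changed = True
    return dirty

# fmt writes app.py to disk -> the file node is invalidated, so the
# snapshot, the autoflake run, and the lint result all need to re-run.
dirty = invalidate("file:app.py", deps)
```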
b
So I'm "around" for an hour or so each day. I can take a look over the next few days. If that's too long, feel free to revert (but I'd like the reverter to be sure it "fixes" the issue and that it wasn't always underlying)
c
I don't think your changes caused this; I tried this on the commit before yours and I still get that `./pants fmt lint` will have a formatter make changes, fail its lint, and write the correct output to disk.
oh, wait, this other commit did it :(
w
Thanks! Yea, a bisect would be super super helpful
b
I wanna say I'm surprised, but also what the hell 😂
c
Doing a git bisect within the PR is a bit rough; most of the early commits hit an import error. But the issue is somewhere in these changes
b
I'm guessing the graph isn't being invalidated like we expect. And this worked before because we were using a snapshot, which would have a different value regardless of invalidation
Hmm, could it be that the graph is being invalidated, but we never get to the invalid subgraph? Since we're passing file paths which come from `FieldSet` sources across the rule boundary, we haven't yet hit a `PathGlobs` request. Everything up until the rule that runs the formatter is static. Either way, the fix is simple: we collect snapshots in `lint` for formatters and pass them to the `SubPartition` construction
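The proposed fix can be sketched roughly like this (the `SubPartition` fields and digest values shown are hypothetical; the real class lives in Pants and this only illustrates the equality argument): once the request itself carries the captured snapshot, the pre-`fmt` and post-`fmt` requests are no longer equal, so they cannot share a memo entry.

```python
# Hypothetical sketch of the proposed fix: carry the captured snapshot
# (here reduced to a digest string) inside the SubPartition request,
# so requests built before and after fmt's edits compare unequal
# and therefore get separate cache entries.
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class SubPartition:
    files: Tuple[str, ...]
    snapshot_digest: str  # populated unconditionally, never None

# Same file list, but the on-disk contents (and hence digests) differ.
before_fmt = SubPartition(files=("app.py",), snapshot_digest="digest-before")
after_fmt = SubPartition(files=("app.py",), snapshot_digest="digest-after")

# unequal requests -> separate memo keys -> lint re-runs the rule
```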