# general
h
@rhythmic-glass-66959 I think Pants doesn't generate reports by default, so presumably you're setting cli args for bandit and pylint (and flake8 for that matter) to make them emit reports? Can you share your config?
r
Sure:
```toml
[bandit]
args = ["-f", "json", "-o", "reports/bandit-report.json"]

[flake8]
args = ["--output-file=reports/flake8-report.txt"]

[pylint]
args = ["--output-format=text:reports/pylint-report.txt"]
```
h
That seems right to me, and Pants creates the `reports` subdir in the sandbox and captures its output into dist.
So the problem is that we have two different partitions with the same description
Why is this being partitioned at all, I wonder
partitioning should be by interpreter constraints, but these partitions use the same constraints, it looks like
r
Yep.
```toml
[python]
interpreter_constraints = ["==3.9.13"]
```
s
@happy-kitchen-89482 this is happening because we have a max batch size on lint processes. So even though there's only 1 partition, the files within that partition get divided up again.
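A rough sketch of that spill-over, assuming simple fixed-size chunking (Pants's real batching logic is more involved, this just illustrates why one partition can yield several lint processes):

```python
# Illustrative only: split one partition's file list into batches of at
# most `batch_size` files, the way a max-batch-size limit forces a single
# partition to run as multiple tool invocations.
def chunk_into_batches(files, batch_size):
    """Return consecutive slices of `files`, each at most `batch_size` long."""
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]

files = [f"src/module_{n}.py" for n in range(300)]
batches = chunk_into_batches(files, 128)
print(len(batches))  # 3 batches for 300 files at batch size 128
```

Each batch then runs the linter separately, and with the `-o reports/...` args above, each invocation tries to write the same report file.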
h
Oh, this may be because of batching
s
coke
h
Yeah, we should be generating a single report for all batches in a partition, if that is possible
s
in the `test` goal we avoid this situation by naming any generated report files after a target in the generating batch (taking as an invariant that each target will be in 1 batch only)
h
Or at least, that is what the user expects
that may not be possible
Yeah, so to mitigate we need to give more sensible names to the partitions I guess
To the batches, rather
or at least, number them, instead of adding that `_` suffix
s
if we can make the individual report files uniquely-named across batches, then (I believe) we could write them all into a directory named after the partition (minus any `_`s)
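Sketching that layout with assumed names (the partition string, tool name, and `reports/` root are all illustrative, not Pants's actual scheme):

```python
# Hypothetical layout: one directory per partition (underscores stripped),
# with a numeric per-batch suffix so report files never collide.
import os

def report_path(partition_name, batch_index, tool="bandit", ext="json"):
    """Build a unique per-batch report path under the partition's directory."""
    safe = partition_name.replace("_", "")  # drop the `_`s, as suggested above
    return os.path.join("reports", safe, f"{tool}-batch{batch_index}.{ext}")

print(report_path("py39_default", 0))
print(report_path("py39_default", 1))
```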
you’d still end up with a bunch of files though…
h
True true!
But some tools at least may have a way to merge them
like codecov does for coverage files
That is probably the way to go - unique filenames within the partition's dir
r
Hm, I'm wondering: we recently added a git submodule, could this cause the issue?
h
I like that
s
@rhythmic-glass-66959 I don’t think so
h
That may have added enough new files that it caused you to spill over into a second batch
s
the partitioning logic for `lint` was refactored internally in 2.15.x
so if you’re upgrading from 2.14.x that would explain the behavior difference to me
r
👍
h
You could work around this by increasing the default batch size I suppose 🙂
r
Trying right now...
Setting `[lint].batch_size = 4096` seems to fix the problem.
Are there any potential issues running such a large batch_size? I'm far from the default (128)...
h
I don't foresee problems, it's a performance tradeoff
👍 1
r
Oh, I can see that. With default batch size, the lint goal takes 26s to complete. With 4096, it takes 2m18s 😞
h
Hmm, I wouldn't have expected that big a discrepancy, interesting
r
I think I had something in cache the first time. It takes ~1m now for both scenarios (128 and 4096).
Yeah, definitely something with the cache, ~10s now for both! 🤯
h
Excellent!