# general
h
@rhythmic-glass-66959 I think Pants doesn't generate reports by default, so presumably you're setting cli args for bandit and pylint (and flake8 for that matter) to make them emit reports? Can you share your config?
r
Sure:
```toml
[bandit]
args = ["-f", "json", "-o", "reports/bandit-report.json"]

[flake8]
args = ["--output-file=reports/flake8-report.txt"]

[pylint]
args = ["--output-format=text:reports/pylint-report.txt"]
```
h
That seems right to me, and Pants creates the `reports` subdir in the sandbox and captures its output into dist.
So the problem is that we have two different partitions with the same description
Why is this being partitioned at all, I wonder
partitioning should be by interpreter constraints, but these partitions use the same constraints, it looks like
r
Yep.
```toml
[python]
interpreter_constraints = ["==3.9.13"]
```
s
@happy-kitchen-89482 this is happening because we have a max batch size on lint processes. So even though there's only 1 partition, the files within that partition get divided up again.
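A rough sketch of that spill-over, assuming simple fixed-size chunking (Pants's real batching logic is more involved, this just illustrates why one partition can yield several lint processes):

```python
# Illustrative only: split one partition's file list into batches of at
# most `batch_size` files, the way a max-batch-size limit forces a single
# partition to run as multiple tool invocations.
def chunk_into_batches(files, batch_size):
    """Return consecutive slices of `files`, each at most `batch_size` long."""
    return [files[i:i + batch_size] for i in range(0, len(files), batch_size)]

files = [f"src/module_{n}.py" for n in range(300)]
batches = chunk_into_batches(files, 128)
print(len(batches))  # 3 batches for 300 files at batch size 128
```

Each batch then runs the linter separately, and with the `-o reports/...` args above, each invocation tries to write the same report file.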
h
Oh, this may be because of batching
s
coke
h
Yeah, we should be generating a single report for all batches in a partition, if that is possible
s
in the `test` goal we avoid this situation by naming any generated report files after a target in the generating batch (taking as an invariant that each target will be in 1 batch only)
h
Or at least, that is what the user expects
that may not be possible
Yeah, so to mitigate we need to give more sensible names to the partitions I guess
To the batches, rather
or at least, number them, instead of adding that `_` suffix
s
if we can make the individual report files uniquely-named across batches, then (I believe) we could write them all into a directory named after the partition (minus any `_`s)
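Sketching that layout with assumed names (the partition string, tool name, and `reports/` root are all illustrative, not Pants's actual scheme):

```python
# Hypothetical layout: one directory per partition (underscores stripped),
# with a numeric per-batch suffix so report files never collide.
import os

def report_path(partition_name, batch_index, tool="bandit", ext="json"):
    """Build a unique per-batch report path under the partition's directory."""
    safe = partition_name.replace("_", "")  # drop the `_`s, as suggested above
    return os.path.join("reports", safe, f"{tool}-batch{batch_index}.{ext}")

print(report_path("py39_default", 0))
print(report_path("py39_default", 1))
```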
you’d still end up with a bunch of files though…
h
True true!
But some tools at least may have a way to merge them
like codecov does for coverage files
That is probably the way to go - unique filenames within the partition's dir
r
Hm, I'm wondering: we recently added a git submodule, could this cause the issue?
h
I like that
s
@rhythmic-glass-66959 I don’t think so
h
That may have added enough new files that it caused you to spill over into a second batch
s
the partitioning logic for `lint` was refactored internally in 2.15.x
so if you’re upgrading from 2.14.x that would explain the behavior difference to me
r
👍
h
You could work around this by increasing the default batch size I suppose 🙂
r
Trying right now...
Setting `[lint].batch_size = 4096` seems to fix the problem.
Are there any potential issues running such a large batch_size? I'm far from the default (128)...
h
I don't foresee problems, it's a performance tradeoff
👍 1
r
Oh, I can see that. With default batch size, the lint goal takes 26s to complete. With 4096, it takes 2m18s 😞
h
Hmm, I wouldn't have expected that big a discrepancy, interesting
r
I think I had something in cache the first time. It takes ~1m now for both scenarios (128 and 4096).
Yeah, definitely something with the cache, ~10s now for both! 🤯
h
Excellent!