I think I’m seeing a perf regression for `lint` in...
# development
s
I think I’m seeing a perf regression for `lint` in 2.15.0a0 but I’m unsure how to measure it for a report - anyone have tips?
f
since you are likely using toolchain buildsense, the workunit trace uploaded to buildsense would have performance data
s
unfortunately the impact of the regression is that `lint` freezes up / doesn’t finish 😬
will try to find one where trace data still gets uploaded - so far everything’s had no data because it gets OOM(?) killed before upload
f
it's uploaded by the toolchain plugin. if you disable the plugin, does the freeze still occur?
s
yes
when things lock up, my console shows a lot of `Run Black on _ files` and `Lint using Pylint` messages as if those are WIP, but `ps ax | grep python` shows nothing running
I see some zombie `python3` processes with pantsd as the parent 🤔
hmmmm
`py-spy` points at the `RequirementsPexRequest` made by `pylint`
f
if you restart pantsd (just kill it), does the freeze occur on the next run?
s
yeah, it keeps happening
I dropped my lint batch-size down and that got black and isort to run
but pylint continues to get stuck
f
I'm not remembering the name of the option, but I recall there is an option that sets whether the full resolve pex is used or whether Pants subsets it.
I'd like to know if it is set or not.
Assuming I can remember the name.
s
I’m pretty sure we are using the subsetting
the only non-idle thread according to py-spy has current stack:
Thread 0x700008C41000 (active+gil)
    <genexpr> (engine/target.py:538)
    _find_registered_field_subclass (engine/target.py:535)
    _maybe_get (engine/target.py:550)
    get (engine/target.py:597)
    <genexpr> (backend/python/util_rules/python_sources.py:91)
    __init__ (core/util_rules/source_files.py:42)
    new_init (util/meta.py:164)
    prepare_python_sources (backend/python/util_rules/python_sources.py:90)
    native_engine_generator_send (engine/internals/selectors.py:593)
f
ah, its name is `--python-run-against-entire-lockfile`
s
confirmed we aren’t setting that
f
can you add some debugging code to the lint batching code? I'd like to know how big of a batch it is trying to make.
s
yeah I can give that a try - I suspect a huge batch because our code is so tangled and pylint wants the transitive deps
`lease_files_in_graph (engine/internals/scheduler.py)` is pretty high in the `py-spy top` output
s
before I kill the running pantsd to add the debugging, last observation: looks like a lot of time being spent here: https://github.com/pantsbuild/pants/blob/15516a474153f8818869b58791b036567438aadf/src/python/pants/core/util_rules/source_files.py#L51-L61
f
I can imagine that taking a while if the source `Digest`s to merge together hold a lot of data.
s
yeah
f
or some other traversal taking a while
s
right now the batching code allows up to 4x the configured `[lint].batch_size` to land in a single lint run - maybe we should tighten that up
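For illustration only, a strict cap could look like the sketch below. This is not the Pants batching code; `batches_with_hard_cap` is a hypothetical helper that simply never lets a batch exceed the configured size.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches_with_hard_cap(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches that never exceed batch_size (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Step through the sequence in fixed-size strides; only the last batch can be smaller.
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


if __name__ == "__main__":
    for batch in batches_with_hard_cap(list(range(10)), 4):
        print(len(batch), batch)
```

A fixed-stride split like this gives up whatever the looser 4x limit is buying, so it is only a starting point for the discussion.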
f
`Digest`s should be relatively cheap in and of themselves since they just reference other digests
I'm curious to know how many elements are in `request.sources_fields` in `determine_source_files`
s
me 2 - getting that set up now, have to fix some bitrot with my `pants_from_sources` setup
s
not seeing any crazy-large batches, biggest one so far is 385
(385 at the top level of the `run_pytest` rule - still waiting for execution to hit the log I added in `determine_source_files` 😬)
ah
11:47:55.74 [INFO] Merging 11010445 source fields
6 batches of ^^^ so far
so 11010445 elements in `request.sources_fields` in multiple concurrent pylint runs
f
I would love to know if those were all unique. can you de-dupe that list?
fields should be hashable so I imagine `set(request.sources_fields)` will do just fine
(and maybe use `OrderedSet` so that the de-dupe doesn't introduce non-determinism)
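A minimal sketch of an order-preserving de-dupe, using only the standard library rather than an ordered-set class. The helper name and the string stand-ins for source fields are assumptions for illustration; the real elements would be the hashable field objects in `request.sources_fields`.

```python
from typing import Hashable, Iterable, Tuple, TypeVar

T = TypeVar("T", bound=Hashable)


def dedupe_preserving_order(items: Iterable[T]) -> Tuple[T, ...]:
    """Drop repeated items while keeping each first occurrence in place."""
    # dicts keep insertion order (Python 3.7+), so the result is deterministic,
    # unlike iterating over a plain set.
    return tuple(dict.fromkeys(items))


# Hypothetical stand-ins for source fields.
fields = ["a.py", "b.py", "a.py", "c.py", "b.py"]
unique = dedupe_preserving_order(fields)
print(f"Merging {len(fields)} source fields ({len(unique)} unique)")
```

Logging both counts side by side makes it obvious how much duplication there is before anything gets merged.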
s
deduping gives:
12:19:50.41 [INFO] Merging 12728 source fields
f
Any improved performance?
s
no, still struggling
oh wait - I only deduped for the logs 🤦‍♂️
trying deduped everywhere…
yeah, deduped still slow / locked up
🤔 I have an idea for why the perf has changed from 2.14->2.15, will keep playing with changes
yeah, with my local change `./pants lint ::` actually completes on my repo again 🙂 will set up a PR