# general
f
Is there a way to globally shut off python dependency inference? I thought there was one, but I only see it available https://www.pantsbuild.org/docs/reference-python-infer
b
The `imports` option controls it globally (at this point it's a bit of a misnomer, and likely should be renamed), IIRC
f
`[python-infer].imports`?
b
Yeah double check the code, but I think that's it
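For reference, disabling it in pants.toml would look something like this (a sketch assuming the option name from the `[python-infer]` scope on the docs page above; double-check against your Pants version):

```toml
# pants.toml -- sketch; assumes [python-infer].imports is the option
# that controls import-based dependency inference globally.
[python-infer]
imports = false
```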
f
I tried that and running `pants dependencies ::` got at least 2x slower
We're hitting some kinda performance bug that isn't related to actually scanning the modules I think
lmao
Executed in  120.79 secs      fish           external
   usr time  699.66 millis  513.00 micros  699.14 millis
   sys time  138.12 millis  134.00 micros  137.99 millis
I think there's a scheduler problem haha
120 sec but < 2 sec of actual work?
b
Can I ask why you wanna turn it off? Just curious
f
I don't, I'm trying to isolate why we get such abysmal performance with it
and I think I need to use `--no-pantsd` to get accurate metrics with `time`
❯ time ./pants --no-pantsd --no-python-infer-imports dependencies ::
...
________________________________________________________
Executed in  130.08 secs    fish           external
   usr time  171.09 secs  914.00 micros  171.09 secs
   sys time   21.74 secs    0.00 micros   21.74 secs
b
How many files do you have?
f
❯ ./pants --no-pantsd --no-python-infer-imports list :: | wc -l
7773
b
I think a `find` for `.py` might be a bit more accurate
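Something along these lines (a sketch; the path and any directories you'd want to exclude are up to you):

```shell
# Count .py files directly, instead of Pants targets, to estimate how
# many files dependency inference actually has to parse.
find . -name '*.py' | wc -l
```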
f
4465
b
Oh yeah. That's 4.5k processes then
(in the cold case)
If the cache is hot I wouldn't expect that to take 2 minutes though
f
The cache should be hot though. Without pantsd I know you lose memoization, but it should be able to read the inferred deps from the LMDB cache, right?
b
Correct
f
and turning it off shouldn't make it slower lol
b
Well I might be wrong there. It might do some other thing if off
I'd look at workunits and/or debug traces
f
how to look at those? I'm looking at `-ldebug` now, and I've grabbed some profiler data for similar issues before
b
I'm about to hop off, so I'll let someone else pick this up, like @witty-crayon-22786
w
@flat-zoo-31952: um, without being able to capture workunits, enabling `-ldebug` might be the best bet. we probably ought to add a generic workunit capture plugin, since it has all of this timing data.
f
17:40:00.11 [DEBUG] Completed: Find targets from input specs
17:40:57.82 [INFO] Long running tasks:
  60.41s        `dependencies` goal
17:41:27.84 [INFO] Long running tasks:
  90.43s        `dependencies` goal
17:41:56.75 [DEBUG] Completed: `dependencies` goal
not a lot of granular info there
w
and/or a `py-spy` trace
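A capture of the live process might look something like this (a sketch; `<PANTS_PID>` is a placeholder for the actual process ID, and py-spy has to be installed separately):

```shell
# Sample the running Pants process and write a flamegraph SVG.
py-spy record --pid <PANTS_PID> -o pants-profile.svg

# Or emit speedscope-format data instead, for https://speedscope.app
py-spy record --pid <PANTS_PID> --format speedscope -o pants-profile.json
```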
f
Yeah I think that's where we'll have to go with this. I should make a GH issue too.
cc @fresh-cat-90827
b
Does `-ltrace` give anything useful?
b
@witty-crayon-22786 I have one locally. I should just upstream it 🙈
f
`-ltrace` gave me a 170 MiB log
b
170MiB of pants gold... yeah, maybe not super useful.
c
Is there any instrumentation I can provide from `shorts` to help in comparing performance?
f
I think I'm hitting a Pants-specific bug, since it got worse with dependency inference turned off
https://github.com/pantsbuild/pants/issues/18911 ... I've included flamegraph and speedscope data
I've kept digging, but from where I'm sitting it really looks like a scheduler issue. With a cold cache it takes 4-5 min to perform inference. I'm happy to keep looking into what could be causing this, because it seems outlandishly slow