https://pantsbuild.org/ logo
#general
Title
# general
f

flat-zoo-31952

01/10/2022, 7:35 PM
Running into a weird caching issue when trying to pipe things:
Copy code
./pants dependees --transitive some/file.py | xargs ./pants filter --target-type=python_tests --granularity=file
If I run this once, it takes a while to compute, which I expect. But, on subsequent tries it looks like Pants usually needs to completely recompute the depedency graph every time I run it. (Well, using
-ldebug
shows that it's hitting cache for each module but it just takes a while to traverse the whole graph, even if it's using cache each time). My question is... why doesn't it just remember the final output of that recent
./pants dependees
command? Does this have something to do with pantsd?
w

witty-crayon-22786

01/10/2022, 8:14 PM
which version of Pants? there was a fix for bad behavior there recentlyish
f

flat-zoo-31952

01/10/2022, 8:42 PM
older, i'm on 2.7.1rc1
trying to bring an old PoC branch back to life, I'll upgrade to a newer version and see what happens
w

witty-crayon-22786

01/10/2022, 8:45 PM
https://github.com/pantsbuild/pants/pull/13028 … so it was actually backported i think
oh. shoot.. it should be in that release.
do you see an
Initializing scheduler
line…?
f

flat-zoo-31952

01/10/2022, 8:48 PM
yes
w

witty-crayon-22786

01/10/2022, 8:48 PM
hm, yea: that should have been fixed by #13028. are any other options changing between the two runs?
f

flat-zoo-31952

01/10/2022, 8:54 PM
nope, I can typically reproduce this just by pushing ⬆️ and [enter] immediately to just re-run the command
w

witty-crayon-22786

01/10/2022, 8:54 PM
i mean: between the two goals that are piped between one another
also, is there anything interesting in
.pants.d/pants.log
?
pantsd
could be restarting
Initializing scheduler..
indicates that
pantsd
restarted, whereas
*: reinitializing scheduler
indicates that options changed
f

flat-zoo-31952

01/10/2022, 8:56 PM
yes, pantsd seems to be restarting between these runs
w

witty-crayon-22786

01/10/2022, 8:57 PM
what does the log say?
f

flat-zoo-31952

01/10/2022, 8:59 PM
ahah
Copy code
15:55:08.28 [WARN] Completed: Find Python interpreter to bootstrap PEX - No bootstrap Python executable could be found from the option `interpreter_search_paths` in the `[python-setup]` scope. Will attempt to run PEXes directly.
15:55:16.62 [ERROR] The scheduler was invalidated: Exception('pantsd process 433454 was using 1026.92 MiB of memory (above the `--pantsd-max-memory-usage` limit of 1024.00 MiB).')
15:55:17.19 [ERROR] service failure for <pants.pantsd.service.scheduler_service.SchedulerService object at 0x7ef308554ee0>.
15:55:17.20 [INFO] Waiting for ongoing runs to complete before exiting...
from a run-tracker log
guess I can try bumping that memory limit... still, consuming 1 GB of memory seems like a lot
w

witty-crayon-22786

01/10/2022, 9:03 PM
hm… yea, that’s very surprising for only
dependees
+
filter
we’re working toward prioritizing https://github.com/pantsbuild/pants/issues/12662 soon, which will get some infrastructure in place to help triage this
👍🏻 1
f

flat-zoo-31952

01/10/2022, 9:06 PM
we have an enormous dep graph, so it would make sense that we would hit this
I bumped the memory limit, still got this problem... in fact it seems to get slower with each execution now
the interactive output also gets much glitchier and less informative
w

witty-crayon-22786

01/10/2022, 9:46 PM
hm. that’s unexpected. is the machine (vm/container) constrained in any way?
f

flat-zoo-31952

01/10/2022, 9:47 PM
no... I think I'm hitting a pantsd bug tbh, and it seems to be triggered by the filter command
I'm not really set up to try the run/test/package commands yet, so I can't rule them out as causes for this
fwiw i'm passing 1623 targets to filter
w

witty-crayon-22786

01/10/2022, 9:51 PM
should be fine… in `pantsbuild/pants`:
Copy code
$ time ./pants list :: | xargs ./pants filter --target-type=python_tests | wc -l
     131

real	0m3.786s
user	0m0.416s
sys	0m0.442s
$ time ./pants list :: | wc -l
    1412

real	0m0.788s
user	0m0.162s
sys	0m0.208s
so… yea, something is fishy.
dependees
is relatively expensive (it fully computes dependencies), but still shouldn’t take that long
and filter itself is relatively lightweight (does not compute dependencies)
f

flat-zoo-31952

01/10/2022, 9:53 PM
yeah in looking for reproducers it's specifically passing lots of targets to filter that's triggering the degrading pantsd behavior
even with just 100 targets it seems
Copy code
❯ wc -l magic-deps.txt
1623 magic-deps.txt
❯ ./pants --spec-files=(tail -n 100 magic-deps.txt | psub) filter --target-type=python_tests --granularity=file
...
This is enough to get the cache to stop working and pantsd looks like it gets slower an consumes more memory with each execution
w

witty-crayon-22786

01/10/2022, 10:00 PM
does it render a
reinitializing
line…?
the
--spec-files
option is potentially relevant, although it is not supposed to invalidate
f

flat-zoo-31952

01/10/2022, 10:01 PM
that's just the way I've gotten it to work here, the same behavior happens with xargs
Okay I think I've shrunk the reproducer down to a simple example:
Copy code
./pants filter --target_type=python_tests $target_that_is_not_a_test
After this, there seems to be a memory leak in pantsd and performance of the dependency graph computation is not pulled from cache and gets slower with each execution. Can be fixed by killing pantsd
w

witty-crayon-22786

01/10/2022, 10:41 PM
if you’re able to privately gist me the
-ldebug
output of running
filter
repeatedly, that would be helpful. there were some minor fixes in this area, and i’m wondering if they’re relevant. but it’s still unexpected 1) to use that much memory, 2) for things to get slower over time
f

flat-zoo-31952

01/11/2022, 12:28 AM
Is there a way to send log output to a file while preserving the interactive output on stderr?
nvm... I haven't been able to repro this since upgrading
anecdotally i'm noticing a lot more reliable/predictable performance from 2.9
w

witty-crayon-22786

01/12/2022, 6:53 PM
Awesome! Yea, there were some improvements for invalidation.
7 Views