fresh-mechanic-68429
09/28/2025, 6:06 PM
infer_js_source_dependencies is taking ~30 seconds to run when I'm running a single test (no pantsd). I think hydrate_sources is the bottleneck, but I'm having a hard time actually verifying that. My py-spy profile is mostly empty for some reason, and profiling in OSX's Instruments shows time spent in the python interpreter, but I don't know how to get pystack frames out of it.
Looking for advice on:
• How should I approach profiling this? The number of async rules being invoked makes it a bit tricky.
• Is hydrate_sources necessary? Does the python dependency inference have the same issue?
Some size info: there are ~1k infer_js_source_dependencies calls (I think that's about the size of the transitive sources for the command I ran; total sources in the repo is larger).
https://github.com/pantsbuild/pants/blob/1aeddf653e99f4b5404a0627ed3e135574210044/[…]c/python/pants/backend/javascript/dependency_inference/rules.py
wide-midnight-78598
09/28/2025, 9:41 PM
The parse_javascript_deps part seems pretty straight-through to rust though - with the python indirection. So I would have thought (without any benchmarking or debugging) that the time shown there is pretty much just parse time.
fast-nail-55400
09/28/2025, 11:55 PM
"How should I approach profiling this? The number of async rules being invoked makes it a bit tricky"
You can use https://github.com/shoalsoft/shoalsoft-pants-opentelemetry-plugin to send workunit tracing data to any OpenTelemetry-compatible system, including a local instance of Jaeger.
fresh-mechanic-68429
09/29/2025, 1:25 AM
[shoalsoft-opentelemetry]
enabled = true
exporter_traces_endpoint = "http://localhost:4318/v1/traces"
fresh-mechanic-68429
09/29/2025, 2:04 AM
With --streaming-workunits-level=trace there are ~28k generator spans, which seems like it could be related to the problem.
wide-midnight-78598
09/29/2025, 4:55 PM
_determine_import_from_candidates https://github.com/pantsbuild/pants/blob/c45acc27e3d6886e2628be3905e400134fa69f40/[…]c/python/pants/backend/javascript/dependency_inference/rules.py
There is A LOT of stuff that happens in that function, including duplicate work, once you start jumping into those awaited calls. And in the contrived example it's hit 1000 times (thus triggering who knows how many more native tasks internally). I'm trying to find our task/rule machinery to make it easier to check whether the problem is the sheer number of tasks, or whether each task is doing more than it should because the nested dependency calls cause the dep tree to spread out a lot.
Either way, I was able to see this was the problem by forcing the correct set of Addresses to be emitted on each of those calls (bypassing all of the rules below it), and I went from 18 seconds down to about 1-2.
fresh-mechanic-68429
09/29/2025, 4:58 PM
It's not an @rule because of the last argument, file_extensions.
So I feel like it's bypassing caching/memoization when maybe it needs it?
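For reference, a common way to get memoization back in a case like this is to wrap the arguments in a frozen, hashable request type and promote the helper to an @rule, so the engine can cache per unique request. A minimal sketch only, with hypothetical names (ImportCandidateRequest, determine_import_candidates), not the actual code in that file:

# Hypothetical sketch; names are illustrative, not the real Pants rule.
from dataclasses import dataclass

from pants.engine.addresses import Addresses
from pants.engine.rules import rule


@dataclass(frozen=True)
class ImportCandidateRequest:
    # Tuples keep the request hashable, which is what rule memoization keys on.
    file_imports: tuple[str, ...]
    package_imports: tuple[str, ...]
    file_extensions: tuple[str, ...]


@rule
async def determine_import_candidates(request: ImportCandidateRequest) -> Addresses:
    # Resolution logic elided; identical requests across the ~1k inference calls
    # would then be computed once and served from the engine's memo table.
    return Addresses([])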
wide-midnight-78598
09/29/2025, 5:13 PM
deno info index.js takes about 60ms
wide-midnight-78598
09/29/2025, 8:13 PM
"Does the JS code not do the same thing?" Nope. But personally, I would even think to send all the file requests in at the same time, since there's likely overlap - but that's an optimization I can't really say would be good without stats.
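To illustrate the "send all the file requests in at the same time" idea: the per-candidate path lookups could be issued together so the engine runs them concurrently, rather than awaiting them one at a time. A rough sketch only, using the older Get/MultiGet syntax (newer call-by-name code has an equivalent concurrently helper); candidates_per_import, _add_extensions, and file_extensions are names borrowed from this thread and assumed to be in scope, with _add_extensions assumed to return a PathGlobs:

# Fragment assumed to live inside an async @rule body; not the actual rule.
from pants.engine.fs import PathGlobs, Paths
from pants.engine.rules import Get, MultiGet

# Kick off every candidate's glob expansion at once; the engine schedules and
# dedupes them, instead of one blocking await per candidate.
all_paths = await MultiGet(
    Get(Paths, PathGlobs, _add_extensions(candidate.file_imports, file_extensions))
    for candidate in candidates_per_import
)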
wide-midnight-78598
09/29/2025, 9:28 PM
find_owners - kinda just noting that for myself. Looking for a quick workaround to make this less painful in the short term.
fresh-mechanic-68429
10/10/2025, 3:46 AM
paths = await path_globs_to_paths(
    _add_extensions(
        candidates.file_imports,
        file_extensions,
    )
)
local_owners = await find_owners(OwnersRequest(paths.files), **implicitly())
The main difference in the python impl is that it builds a hashmap first, AllPythonTargets -> FirstPartyPythonModuleMapping, allowing all the lookups to be done in memory rather than sequential checks to disk.
If this sounds correct, would a suitable solution here be to implement the same approach on the JS side? That can be done in just python, no rust changes required.
I've got a very rough PoC going that's ~10x faster (2s instead of 22s) for that demo repo.
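A very rough sketch of what that "build the map once, then look up in memory" approach could look like on the JS side, loosely mirroring the python backend's FirstPartyPythonModuleMapping. Everything here is hypothetical (the FirstPartyJSModuleMapping name, the field access, the keying by file path), not the actual PoC:

# Hypothetical sketch only; names, imports, and field access are assumptions.
from dataclasses import dataclass

from pants.backend.javascript.target_types import JSSourceField  # assumed field type
from pants.engine.addresses import Address
from pants.engine.rules import collect_rules, rule
from pants.engine.target import AllTargets
from pants.util.frozendict import FrozenDict


@dataclass(frozen=True)
class FirstPartyJSModuleMapping:
    """Maps a repo-relative source path to the addresses that own it."""

    mapping: FrozenDict[str, tuple[Address, ...]]

    def addresses_for(self, path: str) -> tuple[Address, ...]:
        return self.mapping.get(path, ())


@rule
async def map_first_party_js_sources(all_targets: AllTargets) -> FirstPartyJSModuleMapping:
    # Built once and memoized by the engine, so the ~1k inference calls become
    # dict lookups instead of per-import path_globs/find_owners round-trips.
    index: dict[str, list[Address]] = {}
    for tgt in all_targets:
        if not tgt.has_field(JSSourceField):
            continue
        # Assumption: file_path gives the repo-relative path of the single source.
        index.setdefault(tgt[JSSourceField].file_path, []).append(tgt.address)
    return FirstPartyJSModuleMapping(
        FrozenDict({path: tuple(addrs) for path, addrs in sorted(index.items())})
    )


def rules():
    return collect_rules()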