# development
f
I'm looking at a performance bottleneck in our js monorepo and trying to figure out what can be done about it. It appears `infer_js_source_dependencies` is taking ~30 seconds to run when I'm running a single test (no pantsd). I think `hydrate_sources` is the bottleneck but I'm having a hard time actually verifying that. My py-spy profile is mostly empty for some reason, and profiling in OSX's Instruments shows time spent in the python interpreter, but I don't know how to get Python stack frames out of it. Looking for advice on:
• How should I approach profiling this? The number of async rules being invoked makes it a bit tricky
• Is `hydrate_sources` necessary? Does the python dependency inference have the same issue?
Some size info: there are ~1k `infer_js_source_dependencies` calls (I think that's about the size of the transitive sources for the command I ran; total sources in repo is larger) https://github.com/pantsbuild/pants/blob/1aeddf653e99f4b5404a0627ed3e135574210044/[…]c/python/pants/backend/javascript/dependency_inference/rules.py
w
That's interesting, it runs that EVERY time?
Like, none of it gets cached? (Ignoring how long it takes, just for the moment)
f
Pantsd caches some of this, but I'm looking at the cold case for our infrequent users/ci runs. The local disk cache doesn't appear to be doing much afaict
w
🤯 I noticed some performance oddities the other day while trying to debug something, I assumed it was a "my machine" thing.
Assuming that parse_javascript_deps is doing what I think it's doing, it should be running tree-sitter on that code
If that's the case, I wonder if we're doing too much, or if there is some indirection we don't need. Tree sitter is absurdly fast, so seeing even 70-80ms feels like a lot for a file
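(For reference, here's a minimal standalone sketch of what the raw parse step costs outside of Pants entirely. It assumes the `tree-sitter` and `tree-sitter-javascript` pip packages and a recent py-tree-sitter API; it is not the code path Pants uses, just a way to ballpark bare tree-sitter import extraction on a single file.)

```python
# Standalone sketch (assumes: pip install tree-sitter tree-sitter-javascript, py-tree-sitter 0.23+).
# Not the Pants implementation - just bare tree-sitter pulling import sources out of one file.
import time

import tree_sitter_javascript as tsjs
from tree_sitter import Language, Parser

parser = Parser(Language(tsjs.language()))

source = b"""
import { foo } from "./foo";
import bar from "some-package/bar";
export { baz } from "./baz";
"""

start = time.perf_counter()
tree = parser.parse(source)
imports = []
for node in tree.root_node.children:
    # import_statement / export_statement carry their module path in the "source" field.
    if node.type in ("import_statement", "export_statement"):
        src = node.child_by_field_name("source")
        if src is not None:
            imports.append(src.text.decode().strip("\"'"))
elapsed_ms = (time.perf_counter() - start) * 1000

print(imports)                  # ['./foo', 'some-package/bar', './baz']
print(f"{elapsed_ms:.2f} ms")   # parse + walk time for this one file
```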
f
Yeah I believe that's right, there's a bunch of python rules that get invoked before tree-sitter though
w
Still, feels slow just off the cuff. I'd have to dig through the code. The `parse_javascript_deps` part seems pretty straight through to rust though - with the python indirection. So I would have thought (without any benchmarking or debugging) that the time shown there is pretty much just parse time
Interestingly, if we really wanted to - we could write custom grammars for only the stuff we want - and cut down that time even more. But that's not really the big problem here (though, it's surprisingly a lot)
The 10:1 is the big problem šŸ¤¦ā€ā™‚ļø
f
From my trace (first image) it's not parse_javascript_dependencies that's taking the most time though. It's hydrate sources
w
Yeah, that's what I mean. While I think the straight parsing is still kinda slow, that's not the bulk of the problem. It's pulling the file contents into memory and passing that to the rust parser(s) (I think docker, python, and javascript all work the same way). I'm surprised by that, as I thought that would have been handled in the rust code - in the event we already have cached contents, or in the event we wanted to get contents as soon as it runs.
How big are these JS files?
And if you have some reproduction repo or some code in the codebase where you can replicate the problem you see (even in the tests of our JS backend) - then I can take a look at seeing what's going on with that hydration (e.g. it makes sense that we need to hydrate, but whether we're doing it optimally - that's the question). As for your profiling question: I use mac, and trying to use Instruments with anything other than Swift/ObjC apps is such a pain - I hate it
f
I don't think they're abnormally large, but I'll look through them when I'm back at my machine to see if there are some outliers
šŸ‘ 1
f
> How should I approach profiling this? The number of async rules being invoked makes it a bit tricky
You can use https://github.com/shoalsoft/shoalsoft-pants-opentelemetry-plugin to send workunit tracing data to any OpenTelemetry-compatible system including a local instance of Jaeger.
The plugin also has an option to write traces to a jsonl file if you want to do scripting with the trace data.
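(If you go the jsonl route, even a tiny ad-hoc script can answer the "where does the time go" question. A rough sketch below - the exact field names in the plugin's jsonl output are an assumption on my part, so inspect one line of your actual file and adjust.)

```python
# Rough sketch: aggregate span durations by name from a jsonl trace dump.
# Field names ("name", "start_time_unix_nano", "end_time_unix_nano") are assumed
# OTel-style keys, not verified against the plugin's actual output.
import json
import sys
from collections import defaultdict

totals = defaultdict(lambda: [0, 0.0])  # span name -> [count, total_seconds]

with open(sys.argv[1]) as f:
    for line in f:
        span = json.loads(line)
        name = span.get("name", "<unknown>")
        start = span.get("start_time_unix_nano")
        end = span.get("end_time_unix_nano")
        if start is None or end is None:
            continue
        totals[name][0] += 1
        totals[name][1] += (int(end) - int(start)) / 1e9

# Top 20 span names by total time.
for name, (count, seconds) in sorted(totals.items(), key=lambda kv: -kv[1][1])[:20]:
    print(f"{seconds:8.2f}s  x{count:<6}  {name}")
```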
f
Thanks @fast-nail-55400 My otel setup is ~equivalent, but since yours is open source I can embed it in the reproduction more easily šŸ™ fyi I think your readme needs to be updated for the local Jaeger configuration, it was trying to send to the wrong port without this
[shoalsoft-opentelemetry]
enabled = true
exporter_traces_endpoint = "http://localhost:4318/v1/traces"
Here's a reproduction: 1000 js files takes ~17 seconds to infer deps. This is the jaeger view of the trace, similar to what I was showing in honeycomb but with fewer workunits here (I'm using a release version of pants instead of a local one I added more workunits to) https://github.com/chris-smith-zocdoc/pants-slow-js-deps-demo
🤯 1
If you set `--streaming-workunits-level=trace` there are ~28k `generator` spans, which seems like it could be related to the problem
Modifying the test script to generate the same structure with python, I see 800ms in transitive_targets, significantly better. So it appears to be something specific to the js rules
w
Yep, just ran it - it was rough
I made some changes and it still runs pretty slow - I can try to dig into this one a bit during the week.
:gratitude-thank-you: 1
Thanks for the repros - I found a problem, but I don't know if it's a JS thing - or more general... To hydrate the python files takes a couple hundred ms (max) which makes some sense. The JS files exhibit a problem, where we start off fast, progressively slow down to 5 seconds per hydrate, then "something" happens, and we go back to fast, ramp up to 5s, snap back to 0, ramp up, etc.
I'll need to take a look into how the hydration works, as it's been years - and then where the JS files are stored - to see if it's any different to python for some reason. Either that, or I might be seeing a symptom of something else - like overworking some part of engine
Yeah, might be a symptom - basically everything is getting clogged up, and all the awaits are backed up on something in the engine - should be a pretty direct thing to look into and I have some ideas
I've been absolutely nerd sniped by this problem. I love it (as much as you hate it). The test repo is chef's kiss also. We need a good performance-centric, dependency-inference generator for all backends (kinda like this repo, and what https://github.com/jriddy/clam-diggers was). Super useful in debugging something else I've been cooking up on the side.
Alright, think I've narrowed down the problem. Feels like a thundering herd-ish problem, but might be something simpler than that in the end. Fundamentally, the problem seems to be `_determine_import_from_candidates` https://github.com/pantsbuild/pants/blob/c45acc27e3d6886e2628be3905e400134fa69f40/[…]c/python/pants/backend/javascript/dependency_inference/rules.py
There is A LOT of stuff that happens in that function, including duplicate work, when you start jumping into those awaited calls. And in the contrived example, it's hit 1000 times (thus triggering who knows how many more native tasks internally). Trying to find out where our task/rule stuff is to make it easier to check whether it's related to the sheer number of tasks, or whether each of the tasks is doing more than it should need to do because of the nested dependency calls causing the dep tree to spread out a lot. Either way, I was able to see this was the problem by basically forcing the correct set of Addresses to be emitted on each of those calls (bypassing all of the rules below it), and I went from 18 seconds down to about 1-2
f
One interesting thing to note about that function: I wasn't able to annotate it with `@rule` because of the last argument, `file_extensions`. So I feel like it's bypassing caching/memoization when maybe it needs it?
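(Not the actual Pants types - just a plain-Python illustration of the point: if that extra argument is folded into a frozen, hashable request object, the whole call has a stable cache key. `functools.lru_cache` stands in for the engine's rule memoization here; the invented names are hypothetical.)

```python
# Illustration only: a frozen, hashable "request" makes the call memoizable.
# ImportCandidatesKey and resolve_candidates are invented names, not Pants APIs.
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class ImportCandidatesKey:
    file_imports: tuple[str, ...]      # tuples, not lists, so the key stays hashable
    file_extensions: tuple[str, ...]   # the argument that currently blocks @rule


@lru_cache(maxsize=None)               # stand-in for engine-level rule memoization
def resolve_candidates(key: ImportCandidatesKey) -> tuple[str, ...]:
    # Pretend this is the expensive path-glob + owners lookup.
    return tuple(f"{imp}{ext}" for imp in key.file_imports for ext in key.file_extensions)


key = ImportCandidatesKey(file_imports=("./foo", "./bar"), file_extensions=(".js", ".ts"))
resolve_candidates(key)   # computed once
resolve_candidates(key)   # identical request -> served from cache
```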
w
It's possible, but it's being called once per import - we have 5 imports per file, for the first 200 files. I don't know if it is a caching issue necessarily, because it runs the same number of times as the imports. Whether it should do that is a question mark.
Maybe there is some downstream cache being missed, but (and I can't verify this without being able to get more insight into the engine in realtime) we might just be creating a ton of tasks. If caching isn't working as we think, then yeah, 100% it'll be a problem, but we might just not need to do so much.
One other thing I was thinking about is that for each backend's dependency inference, I think it should be more monolithically completed in the engine rust code - rather than trampolining data back and forth - as there just isn't much of a need for that. I would expect to call "infer_deps" with some input(s) and then get the fully formed results back, not anything I would need to call back in for
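(Purely to make that shape concrete - the names below are invented, not an existing Pants or engine API. The point is just "one request in, finished addresses out", with no callbacks into Python mid-inference.)

```python
# Hypothetical one-round-trip interface sketch; all names here are invented.
from dataclasses import dataclass


@dataclass(frozen=True)
class InferJSDepsRequest:
    file_path: str
    file_content: bytes                    # already-hydrated source, handed over once


@dataclass(frozen=True)
class InferJSDepsResult:
    dependency_addresses: tuple[str, ...]  # fully resolved - nothing left to ask Python for


def infer_deps(request: InferJSDepsRequest) -> InferJSDepsResult:
    # In the idea being described, everything behind this call would live engine-side
    # (parse, candidate expansion, owner lookup) in a single pass.
    raise NotImplementedError
```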
As a comparison, `deno info index.js` takes about 60ms
p
> I would expect to call "infer_deps" with some input(s) and then get the fully formed results back, not anything I would need to call back in for
I think that's how the python dep inference works. A per-file request goes to the engine, that gets all of the imports and strings that could be imports or resources, and sends the whole thing back to python. There's a little bit of filtering of those possible matches on the python side based on settings (like number of dots in the string), but the extraction happens in one pass through the engine. Does the JS code not do the same thing?
w
> Does the JS code not do the same thing?
Nope. But personally, I would even think to send all the file requests in at the same time, since there's likely overlap - but that's an optimization I can't really say would be good without stats
@proud-dentist-22844 Do you know if we can insert our own dep inference via a union rule or similar? it'd be a bit greasy, but could be nice, especially when debugging. Edit: I specifically meant override the existing inference
f
We have a way to override rules in our internal plugin, I can find that code if you want. I think there's also a thread on it from me a while back
w
@fresh-mechanic-68429 If you could please - I'm doing something wrong, but I'm not sure what. My override is not running
Ah, via patching?
f
This is the thread, that should give you enough to get it working. If not I'll pull our code https://pantsbuild.slack.com/archives/C0D7TNJHL/p1750993686068529?thread_ts=1750911821.901069&cid=C0D7TNJHL
w
Ah yes, okay, I remember this one. I think this was caused because of call by name and no good way to override rules
Thanks!
f
Yeah
w
šŸ‘
p
Yeah. The ability to override rules was a casualty of the call-by-name migration (but the speed boost is worth it!).
šŸ‘ 1
w
Oh, I forgot to mention re: this problem, the bulk of the time is spent in `find_owners` - kinda just noting that for myself. Looking for a quick workaround to make this less painful in the short term
f
I took a deeper look at this and how it differs from the python implementation. My rough understanding is that the js one is slow because of the repeated calls to disk for files which may or may not actually exist
paths = await path_globs_to_paths(
    _add_extensions(
        candidates.file_imports,
        file_extensions,
    )
)
local_owners = await find_owners(OwnersRequest(paths.files), **implicitly())
The main difference in the python impl is that it builds a hashmap first, AllPythonTargets -> FirstPartyPythonModuleMapping, allowing all the lookups to be done in memory rather than sequential checks to disk. If this sounds correct, would a suitable solution here be to implement the same approach on the js side? That can be done in just python, no rust changes required. I've got a very rough poc going that's ~10x faster (2s instead of 22s) for that demo repo
šŸ‘€ 1
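(For the record, here's a sketch of that shape in plain Python - invented names and a simplified resolution scheme, not the actual PoC or the Pants rule API. The point is building the "who owns which JS file" map once, then resolving every import candidate against it in memory instead of issuing per-candidate glob/owners requests.)

```python
# Sketch of the idea only: build the path -> owner map once, resolve imports in memory.
import posixpath


def build_first_party_js_mapping(all_js_targets: dict[str, str]) -> dict[str, str]:
    """all_js_targets: source file path -> owning target address (built once, up front)."""
    return dict(all_js_targets)


def resolve_import(
    mapping: dict[str, str],
    importing_file_dir: str,
    import_path: str,
    file_extensions: tuple[str, ...] = (".js", ".jsx", ".ts", ".tsx"),
) -> str | None:
    """Pure in-memory lookup: try each candidate extension against the prebuilt map."""
    base = posixpath.normpath(posixpath.join(importing_file_dir, import_path))
    for candidate in (base, *(base + ext for ext in file_extensions)):
        if candidate in mapping:
            return mapping[candidate]
    return None


mapping = build_first_party_js_mapping(
    {"src/foo.js": "src:foo", "src/bar/index.ts": "src/bar:index"}
)
print(resolve_import(mapping, "src", "./foo"))      # -> "src:foo"
print(resolve_import(mapping, "src", "./missing"))  # -> None, and no disk I/O either way
```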
w
It sounds like a place to start, especially if it can be done on the Python side first. I'm working on some experiments to really drop that time - but I'm also looking at all of the dep inference backends to see if they can use unified dep inference code - with a drop-in per-backend set of rules. That way, we'd be able to have a more consistent inference time overall