I'm playing around with trying to profile a Pants ...
# plugins
b
I'm playing around with trying to profile Pants runs by using a WorkunitsCallback, and noticed some gaps in the workunit events that the callback sees. Namely a huge gap where `resolving requirements.pex` could be 🤔
h
Do you see it if you use `--no-local-cache` and `--no-pantsd`?
b
I can try. I would imagine the ~40 seconds it took for that resolution wasn't it pulling it from the cache 😬
I do feel like I see more "things" output on the dynamic UI than I see in the workunits callback
It's also possible my callback code is wrong 🙂
No dice.
Here's what I see when linting a single file (FWIW I use `flake8`, `pylint`, `isort`, and `yapf`. But I only see Pylint)
It's also possible my callback code is wrong
That's it 🙂
Much more colorful 🙂
Screenshot from 2021-12-22 09-40-44.png
🙌 2
However, I'm still scratching my head on how to add PID (and maybe TID if it's important) to the WorkUnitMetadata so the entries aren't overlapped
I tried adding it in Rustland, but it's always the same PID (which probably makes sense... but I don't know why 😉 )
h
Oh interesting, yeah we don't track the PID currently. What's the use case here?
c
That’s really cool stuff @bitter-ability-32190 😄
b
You can't tell from the screenshot, but there are multiple overlapping entries where multiple processes are running at the same time. I'm goofing off and trying to make a plugin (or Pants contribution) that'll output a Chrome trace JSON
📣 2
Chrome Trace JSON is pretty powerful. Right now I'm just using the simple data format, which understands PIDs and TIDs, and will render them in different swimlanes
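For anyone following along, here is a rough sketch of what that conversion could look like. It emits "complete" events (`ph: "X"`, with `ts` and `dur` in microseconds) from the Chrome trace event format. Since Pants doesn't track PIDs, this sketch fakes swimlanes by giving overlapping spans different `tid` values. The input field names (`start_secs`, `duration_secs`) and the function name are illustrative assumptions, not the exact workunit schema.

```python
import json


def to_chrome_trace(workunits):
    """Convert span dicts to Chrome trace "complete" (ph="X") events.

    Assumes each dict has hypothetical keys: name, start_secs, duration_secs.
    """
    events = []
    lane_free_at = []  # lane_free_at[i] = end time (in µs) of the last span in lane i
    for wu in sorted(workunits, key=lambda w: w["start_secs"]):
        ts = int(wu["start_secs"] * 1_000_000)
        dur = int(wu["duration_secs"] * 1_000_000)
        # Reuse the first lane that is idle by the time this span starts.
        for lane, free_at in enumerate(lane_free_at):
            if free_at <= ts:
                break
        else:
            # Every lane is busy (or there are none yet): open a new one.
            lane = len(lane_free_at)
            lane_free_at.append(0)
        lane_free_at[lane] = ts + dur
        events.append(
            {"name": wu["name"], "ph": "X", "ts": ts, "dur": dur, "pid": 1, "tid": lane}
        )
    return events


workunits = [
    {"name": "flake8", "start_secs": 0.0, "duration_secs": 2.0},
    {"name": "pylint", "start_secs": 0.5, "duration_secs": 3.0},
    {"name": "isort", "start_secs": 2.5, "duration_secs": 1.0},
]
trace = to_chrome_trace(workunits)
print(json.dumps({"traceEvents": trace}, indent=2))
```

The lane assignment is purely cosmetic: a `tid` here does not correspond to a real thread, it only keeps concurrent spans from being drawn on top of each other.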
If possible, I'd like to make the change(s) to Pants myself to help me understand the engine/architecture better. So feel free to reach out and I can pester whoever with questions 😛
💯 1
h
Cool! Stu is out for Christmas, but I can try taking a look too at where this code would go. Maybe open an issue about tracking PIDs/TIDs in workunit data? Will be helpful to centralize that convo
b
I'll keep goofing off a bit more and then put it on the shelf
h
The background is that zipkin chokes on large traces
h
Related: Rust is awesome, it's my favorite part of contributing to Pants. Lmk if you want help finding a starter issue if you want to dive into Rust
🙌 1
b
@happy-kitchen-89482 but also chrome JSON traces are great. Regardless of zipkin I'd likely prefer them since tools (like perfetto.dev) understand them
👀 1
I started with
speedscope
but that doesn't support multi-process
@hundreds-father-404 I was a C++ "guru" 🤮 at my last company a few years ago, so Rust isn't too far off from what I already know. I also know it's a better C++ and advocated that we switch, even without "knowing" it 😂
❗ 1
❤️ 1
Chrome Traces also support more interesting use cases other than event spans (nested start+end), like asynchronous events (which I've never used). So if you wanted to get really detailed, you could model Pants' asynchronous behavior pretty closely (have the resulting graph possibly show tasks that were started/stopped)
w
since this is async code, flamegraphs are unlikely to be useful. at various points in time, there will likely be thousands to tens of thousands of “threads” (they’re not threads)
🤔 1
but really glad that you’re looking at this! one useful avenue is likely to be looking at “self time” vs “cumulative time” for workunits
it’s the next thing i was planning on exploring for overall performance.
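A minimal sketch of the "self time" vs "cumulative time" idea, assuming spans are plain dicts with hypothetical keys `span_id`, `parent_id`, `start`, and `duration` (not the actual workunit schema). Cumulative time is just the span's duration; self time is the part of it not covered by any direct child. Children that overlap each other are merged so that parallel work isn't counted twice.

```python
from collections import defaultdict


def self_times(spans):
    """Return {span_id: self_time} for a tree of spans."""
    children = defaultdict(list)
    for s in spans:
        if s["parent_id"] is not None:
            children[s["parent_id"]].append(s)

    result = {}
    for s in spans:
        start, end = s["start"], s["start"] + s["duration"]
        covered, cursor = 0.0, start
        # Sweep children in start order; `cursor` marks how far coverage already reaches.
        for c in sorted(children[s["span_id"]], key=lambda c: c["start"]):
            c_start = max(c["start"], cursor)
            c_end = min(c["start"] + c["duration"], end)
            if c_end > c_start:
                covered += c_end - c_start
                cursor = c_end
        result[s["span_id"]] = (end - start) - covered
    return result


spans = [
    {"span_id": "root", "parent_id": None, "start": 0.0, "duration": 10.0},
    {"span_id": "a", "parent_id": "root", "start": 1.0, "duration": 3.0},
    {"span_id": "b", "parent_id": "root", "start": 3.0, "duration": 3.0},
    {"span_id": "g", "parent_id": "a", "start": 1.0, "duration": 1.0},
]
times = self_times(spans)
print(times)
```

Here `a` and `b` overlap between t=3 and t=4, so together they cover 5 seconds of `root`, not 6.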
the only thing that is likely to be tricky is that because workunits form a tree rather than a dag (even though the underlying structure is a dag), when the “second parent” of a node in the graph adds a dependency on something that is already running, it won’t actually end up with a child (because it wouldn’t form a tree). unfortunately that means that that “second parent” would end up with a bunch of false self-time: it’s blocked waiting for a node, but that’s not recorded in the workunit tree.
having said that: the reason that they form a tree is mostly in order to support profilers like Zipkin and Chrome. and it would be easy enough to transform a DAG into a tree. so maybe the best solution would be to migrate to actually storing workunits (and giving them to WorkunitsCallbacks) as DAGs, and to then let whoever wants to render in Zipkin/Chrome transform to a tree.
➕ 3
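One possible way to do the DAG-to-tree transform mentioned above, sketched under assumptions: the DAG is a plain dict mapping a node name to its dependency names, it is acyclic, and a shared node is simply copied under every parent. This is only an illustration of the idea, not how Pants stores or renders workunits.

```python
def unfold(dag, node):
    """Return a nested (node, [subtrees]) tuple.

    A node with several parents is duplicated under each of them, which is
    what turns the DAG into a tree. Assumes `dag` has no cycles.
    """
    return (node, [unfold(dag, child) for child in dag.get(node, [])])


# Hypothetical graph: both linters depend on the same resolved requirements.
dag = {
    "lint": ["flake8", "pylint"],
    "flake8": ["requirements.pex"],
    "pylint": ["requirements.pex"],
}
tree = unfold(dag, "lint")
print(tree)
```

With the shared node copied under both parents, neither parent shows waiting time as its own self time. The trade-off is that a heavily shared DAG can grow a lot when unfolded, and the shared node's time shows up once per parent.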
b
For user profiling (which I imagine can be less detailed than y'all's internal needs) capturing the events at the process level seems to be "good enough". I just want to be able to take a trace and go "you spent 40 seconds resolving requirements" or "you spent 10 seconds running lint on this one file", etc...
w
ah, sure. in that case you can filter to nodes of type `run_local_process`
🤔 1
➕ 1
or, if you’re interested in the time taken to do cache lookups or sandbox setup, can filter to all children of `multi_platform_process` nodes.
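Both filters can be sketched roughly like this, assuming each workunit is a dict with `name`, `span_id`, and `parent_id` keys (an assumption about the callback data). The child names `local_cache_read` and `setup_sandbox` are made up for the example.

```python
from collections import defaultdict


def by_name(workunits, name):
    """Keep only the workunits with the given name."""
    return [w for w in workunits if w["name"] == name]


def descendants_of(workunits, root_name):
    """Collect everything below any workunit named `root_name`."""
    children = defaultdict(list)
    for w in workunits:
        children[w["parent_id"]].append(w)

    result = []
    stack = by_name(workunits, root_name)
    while stack:
        node = stack.pop()
        for child in children[node["span_id"]]:
            result.append(child)
            stack.append(child)
    return result


workunits = [
    {"span_id": 1, "parent_id": None, "name": "lint"},
    {"span_id": 2, "parent_id": 1, "name": "multi_platform_process"},
    {"span_id": 3, "parent_id": 2, "name": "local_cache_read"},  # hypothetical name
    {"span_id": 4, "parent_id": 2, "name": "run_local_process"},
    {"span_id": 5, "parent_id": 4, "name": "setup_sandbox"},  # hypothetical name
]
local_runs = by_name(workunits, "run_local_process")
below_process = descendants_of(workunits, "multi_platform_process")
print([w["name"] for w in local_runs])
print(sorted(w["name"] for w in below_process))
```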
b
I think we've stepped outside my bounds of understanding 😅
But it sounds like something like this is on the radar, and obviously y'all have a much deeper and contextual understanding of the code and possible solution(s) 😉
w
mm, i think that i thought of a way to do what you’re trying to do with flamegraphs: https://github.com/pantsbuild/pants/issues/13960#issuecomment-999723884
at least for your purposes, the fact that we lie and treat the DAG as a tree is helpful (although a bit misleading)
b
FWIW filtering sub-trees using roots of `process` seemingly works
w
Neat! Yea, other potentially useful roots might be the roots for the pytest runner: https://github.com/pantsbuild/pants/blob/b89cdc18374d666c5a2b2ad60de4ec686ed3c1be/src/python/pants/backend/python/goals/pytest_runner.py#L324 ... then you could see everything that contributes time to setting up and then running the test
The `@rule(desc=...)` and rule function name end up in workunits.