I'm playing around with trying to profile a Pants ...

bitter-ability-32190

12/22/2021, 3:11 PM

I'm playing around with trying to profile Pants runs by using a WorkUnitCallback, and noticed some gaps in the workunit events that the callback sees. Namely, a huge gap where `resolving requirements.pex` could be 🤔

hundreds-father-404

12/22/2021, 3:13 PM

Do you see it if you use --no-local-cache and --no-pantsd?

bitter-ability-32190

12/22/2021, 3:15 PM

I can try. I would imagine the ~40 seconds that resolution took wasn't it pulling from the cache 😬

bitter-ability-32190

12/22/2021, 3:16 PM

I do feel like I see more "things" output on the dynamic UI than I see in the workunits callback

bitter-ability-32190

12/22/2021, 3:19 PM

It's also possible my callback code is wrong 🙂

bitter-ability-32190

12/22/2021, 3:23 PM

No dice.

bitter-ability-32190

12/22/2021, 3:24 PM

Here's what I see when linting a single file (FWIW I use `flake8`, `pylint`, `isort`, and `yapf`, but I only see Pylint)

bitter-ability-32190

12/22/2021, 3:31 PM

It's also possible my callback code is wrong

That's it 🙂

bitter-ability-32190

12/22/2021, 3:40 PM

Much more colorful 🙂

bitter-ability-32190

12/22/2021, 3:41 PM

Screenshot from 2021-12-22 09-40-44.png

🙌 2

bitter-ability-32190

12/22/2021, 3:44 PM

However, I'm still scratching my head on how to add PID (and maybe TID if it's important) to the WorkUnitMetadata so the entries aren't overlapped

bitter-ability-32190

12/22/2021, 3:45 PM

I tried adding it in Rustland, but it's always the same PID (which probably makes sense... but I don't know why 😉 )

hundreds-father-404

12/22/2021, 3:46 PM

Oh interesting, yeah we don't track the PID currently. What's the use case here?

curved-television-6568

12/22/2021, 3:47 PM

That’s really cool stuff @bitter-ability-32190 😄

bitter-ability-32190

12/22/2021, 3:47 PM

You can't tell from the screenshot, but there are multiple overlapping entries where multiple processes are running at the same time. I'm goofing off and trying to make a plugin (or Pants contribution) that'll output a Chrome trace JSON

📣 2
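One way to keep concurrent entries from overlapping without real PIDs/TIDs is to assign each span a synthetic lane. The following is a minimal sketch of that idea, not something Pants does: `assign_lanes` is a hypothetical helper, the `(name, start, end)` tuples are an assumed input shape, and the timings are made up for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def assign_lanes(spans: List[Tuple[str, float, float]]) -> Dict[str, int]:
    """Map each span name to a lane so that no two spans in a lane overlap.

    Each span is (name, start, end); names are assumed to be unique.
    """
    lanes: Dict[str, int] = {}
    busy: List[Tuple[float, int]] = []  # min-heap of (end_time, lane) for occupied lanes
    free: List[int] = []                # min-heap of lanes that have been released
    next_lane = 0
    for name, start, end in sorted(spans, key=lambda s: (s[1], s[2])):
        # Release every lane whose span ended before this one starts.
        while busy and busy[0][0] <= start:
            _, lane = heapq.heappop(busy)
            heapq.heappush(free, lane)
        if free:
            lane = heapq.heappop(free)  # reuse the lowest free lane
        else:
            lane = next_lane            # otherwise open a new lane
            next_lane += 1
        lanes[name] = lane
        heapq.heappush(busy, (end, lane))
    return lanes


# Hypothetical timings (seconds) for four linters running on one file.
spans = [("pylint", 0.0, 5.0), ("flake8", 1.0, 3.0), ("isort", 3.5, 4.0), ("yapf", 6.0, 7.0)]
lanes = assign_lanes(spans)
```

The lane index could then stand in for a TID when rendering, at the cost of no longer reflecting which OS process or thread actually did the work.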

bitter-ability-32190

12/22/2021, 3:49 PM

Chrome Trace JSON is pretty powerful. Right now I'm just using the simple data format, which understands PIDs and TIDs, and will render them in different swimlanes
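As a rough illustration of the simple format mentioned above, a workunit can be turned into a Chrome trace "complete" event (`"ph": "X"`) with microsecond `ts` and `dur` fields plus `pid` and `tid`. This is a sketch under assumptions: the workunit is modelled as a plain dict with `name`, `start_secs`, `start_nanos`, `duration_secs`, and `duration_nanos` keys, the optional `tid` key is hypothetical, and both helper names are invented here.

```python
import json
from typing import Any, Dict, Iterable


def workunit_to_trace_event(workunit: Dict[str, Any], pid: int = 1, tid: int = 1) -> Dict[str, Any]:
    """Convert one workunit dict (assumed shape) into a Chrome trace complete event."""
    # The trace format expects timestamps and durations in microseconds.
    start_us = workunit["start_secs"] * 1_000_000 + workunit["start_nanos"] // 1_000
    dur_us = workunit["duration_secs"] * 1_000_000 + workunit["duration_nanos"] // 1_000
    return {
        "name": workunit["name"],
        "ph": "X",  # "complete" event: a start timestamp plus a duration
        "ts": start_us,
        "dur": dur_us,
        "pid": pid,
        "tid": tid,
    }


def to_chrome_trace(workunits: Iterable[Dict[str, Any]]) -> str:
    """Serialize workunits as a JSON object with a top-level traceEvents list."""
    events = [workunit_to_trace_event(w, tid=w.get("tid", 1)) for w in workunits]
    return json.dumps({"traceEvents": events})


# Made-up workunit for illustration.
example = {
    "name": "process",
    "start_secs": 10,
    "start_nanos": 500_000,
    "duration_secs": 2,
    "duration_nanos": 250_000_000,
}
event = workunit_to_trace_event(example, pid=1, tid=3)
trace_json = to_chrome_trace([example])
```

Events that share a `pid` and `tid` land in the same swimlane, so distinct `tid` values are what separate concurrent work visually.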

bitter-ability-32190

12/22/2021, 3:50 PM

If possible, I'd like to make the change(s) to Pants myself to help me understand the engine/architecture better. So feel free to reach out and I can pester whoever with questions 😛

💯 1

hundreds-father-404

12/22/2021, 3:56 PM

Cool! Stu is out for Christmas, but I can try taking a look too at where this code would go. Maybe open an issue about tracking PIDs/TIDs in workunit data? Will be helpful to centralize that convo

bitter-ability-32190

12/22/2021, 3:58 PM

https://github.com/pantsbuild/pants/issues/13960

bitter-ability-32190

12/22/2021, 4:03 PM

I'll keep goofing off a bit more and then put it on the shelf

happy-kitchen-89482

12/22/2021, 4:09 PM

The background is that zipkin chokes on large traces

hundreds-father-404

12/22/2021, 4:09 PM

Related: Rust is awesome, it's my favorite part of contributing to Pants. Lmk if you want help finding a starter issue if you want to dive into Rust

🙌 1

bitter-ability-32190

12/22/2021, 4:11 PM

@happy-kitchen-89482 but also chrome JSON traces are great. Regardless of zipkin I'd likely prefer them since tools (like perfetto.dev) understand them

👀 1

bitter-ability-32190

12/22/2021, 4:11 PM

I started with `speedscope` but that doesn't support multi-process

bitter-ability-32190

12/22/2021, 4:12 PM

@hundreds-father-404 I was a C++ "guru" 🤮 at my last company a few years ago so Rust isn't too far off from what I already know. I also know it's a better C++ and advocated that we switch, even without "knowing" it 😂

❗ 1

❤️ 1

bitter-ability-32190

12/22/2021, 4:14 PM

Chrome Traces also support more interesting use cases other than event spans (nested start+end), like asynchronous events (which I've never used). So if you wanted to get really detailed, you could model Pants' asynchronous behavior pretty closely (have the resulting graph possibly show tasks that were started/stopped)

witty-crayon-22786

12/22/2021, 4:16 PM

since this is async code, flamegraphs are unlikely to be useful. at various points in time, there will likely be thousands to tens of thousands of “threads” (they’re not threads)

🤔 1

witty-crayon-22786

12/22/2021, 4:17 PM

but really glad that you’re looking at this! one useful avenue is likely to be looking at “self time” vs “cumulative time” for workunits

witty-crayon-22786

12/22/2021, 4:18 PM

it’s the next thing i was planning on exploring for overall performance.

witty-crayon-22786

12/22/2021, 4:22 PM

the only thing that is likely to be tricky is that because workunits form a tree rather than a dag (even though the underlying structure is a dag), when the “second parent” of a node in the graph adds a dependency on something that is already running, it won’t actually end up with a child (because it wouldn’t form a tree). unfortunately that means that that “second parent” would end up with a bunch of false self-time: it’s blocked waiting for a node, but that’s not recorded in the workunit tree.
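To make the self-time point concrete, here is a minimal sketch that computes self time as a span's duration minus the time covered by its recorded children. The span names, timings, and both helper functions are hypothetical; the `lint` entry plays the "second parent" that waits on `resolve` without having it as a child.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def covered(intervals: List[Interval], lo: float, hi: float) -> float:
    """Total length of the union of `intervals`, clipped to [lo, hi]."""
    total = 0.0
    cursor = lo  # everything before the cursor has already been counted
    for start, end in sorted(intervals):
        start, end = max(start, cursor), min(end, hi)
        if end > start:
            total += end - start
            cursor = end
    return total


def self_times(spans: Dict[str, Interval], parent_of: Dict[str, str]) -> Dict[str, float]:
    """Self time = cumulative time minus the time covered by recorded children."""
    children: Dict[str, List[Interval]] = {name: [] for name in spans}
    for child, parent in parent_of.items():
        children[parent].append(spans[child])
    return {
        name: (end - start) - covered(children[name], start, end)
        for name, (start, end) in spans.items()
    }


# Hypothetical spans (seconds). "lint" is blocked on "resolve" from 2.0 to 4.0,
# but "resolve" is only recorded as a child of "test".
spans = {
    "test": (0.0, 10.0),
    "resolve": (1.0, 4.0),
    "setup": (3.0, 6.0),
    "run": (8.0, 10.0),
    "lint": (2.0, 6.0),
}
parent_of = {"resolve": "test", "setup": "test", "run": "test"}
result = self_times(spans, parent_of)
```

Here `test` gets 3.0 seconds of self time because its children cover 7.0 of its 10.0 seconds, while `lint` is charged its full 4.0 seconds even though part of that was spent waiting.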

witty-crayon-22786

12/22/2021, 4:27 PM

having said that: the reason that they form a tree is mostly in order to support profilers like Zipkin and Chrome. and it would be easy enough to transform a DAG into a tree. so maybe the best solution would be to migrate to actually storing workunits (and giving them to WorkunitsCallbacks) as DAGs, and to then let whomever wants to render in Zipkin/Chrome transform to a tree.

➕ 3
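A DAG-to-tree transform for rendering could be as simple as keeping one parent per node. The sketch below uses a "first parent wins" policy, which is only one of several possible choices; the `parents` mapping (node to its parents, in the order the edges were recorded) is an assumed representation, not how Pants stores workunits, and the node names are made up.

```python
from typing import Dict, List, Optional, Tuple


def dag_to_tree(parents: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Keep only the first recorded parent of each node; roots map to None."""
    return {node: (ps[0] if ps else None) for node, ps in parents.items()}


def dropped_edges(parents: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """List the (parent, child) edges that the tree view discards."""
    return [(p, node) for node, ps in parents.items() for p in ps[1:]]


# Hypothetical graph: both "test" and "lint" depend on "resolve".
parents = {"test": [], "lint": [], "resolve": ["test", "lint"]}
tree = dag_to_tree(parents)
lost = dropped_edges(parents)
```

Keeping the dropped edges around means a consumer that cares about blocked time can still account for it, while a renderer that needs a tree can ignore them.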

bitter-ability-32190

12/22/2021, 4:28 PM

For user profiling (which I imagine can be less detailed than y'all's internal needs) capturing the events at the process level seems to be "good enough". I just want to be able to take a trace and go "you spent 40 seconds resolving requirements" or "you spent 10 seconds running lint on this one file", etc...

witty-crayon-22786

12/22/2021, 4:29 PM

ah, sure. in that case you can filter to nodes of type `run_local_process`

🤔 1

➕ 1

witty-crayon-22786

12/22/2021, 4:30 PM

or, if you’re interested in the time taken to do cache lookups or sandbox setup, can filter to all children of `multi_platform_process` nodes.
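Both kinds of filtering can be sketched over a flat list of workunits. This assumes each workunit is a dict with `span_id`, `parent_id`, and `name` keys and that the node type shows up as `name`; `descendants_of` is a hypothetical helper, and `find_binary` is just a made-up sibling for the example.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional

Workunit = Dict[str, Any]


def descendants_of(workunits: List[Workunit], root_name: str) -> List[Workunit]:
    """Return every workunit named `root_name` plus everything nested below it."""
    children: Dict[Optional[str], List[Workunit]] = defaultdict(list)
    for w in workunits:
        children[w.get("parent_id")].append(w)

    selected: List[Workunit] = []
    seen = set()
    stack = [w for w in workunits if w["name"] == root_name]
    while stack:
        w = stack.pop()
        if w["span_id"] in seen:
            continue  # guards against nested roots being visited twice
        seen.add(w["span_id"])
        selected.append(w)
        stack.extend(children[w["span_id"]])
    return selected


# Hypothetical workunits for a single lint run.
workunits = [
    {"span_id": "a", "parent_id": None, "name": "lint"},
    {"span_id": "b", "parent_id": "a", "name": "multi_platform_process"},
    {"span_id": "c", "parent_id": "b", "name": "run_local_process"},
    {"span_id": "d", "parent_id": "a", "name": "find_binary"},
]

# Only the process executions themselves.
local_runs = [w for w in workunits if w["name"] == "run_local_process"]

# Process executions plus whatever is recorded around them.
subtree = descendants_of(workunits, "multi_platform_process")
subtree_names = sorted(w["name"] for w in subtree)
```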

witty-crayon-22786

12/22/2021, 4:33 PM

those two are recorded on the rust side like so: https://github.com/pantsbuild/pants/blob/355d873899f31165a4e2ff1b84bc9aa3fb1c5575/src/rust/engine/process_execution/src/local.rs#L258-L268

👀 1

bitter-ability-32190

12/22/2021, 4:43 PM

I think we've stepped outside my bounds of understanding 😅

bitter-ability-32190

12/22/2021, 4:44 PM

But it sounds like something like this is on the radar, and obviously y'all have a much deeper and contextual understanding of the code and possible solution(s) 😉

witty-crayon-22786

12/22/2021, 4:54 PM

mm, i think that i thought of a way to do what you’re trying to do with flamegraphs: https://github.com/pantsbuild/pants/issues/13960#issuecomment-999723884

witty-crayon-22786

12/22/2021, 4:56 PM

at least for your purposes, the fact that we lie and treat the DAG as a tree is helpful (although a bit misleading)

bitter-ability-32190

12/22/2021, 6:21 PM

FWIW filtering sub-trees using roots of `process` seemingly works

witty-crayon-22786

12/22/2021, 6:32 PM

Neat! Yea, other potentially useful roots might be the roots for the pytest runner: https://github.com/pantsbuild/pants/blob/b89cdc18374d666c5a2b2ad60de4ec686ed3c1be/src/python/pants/backend/python/goals/pytest_runner.py#L324 ... then you could see everything that contributes time to setting up and then running the test

witty-crayon-22786

12/22/2021, 6:33 PM

The `@rule(desc=...)` and rule function name end up in workunits.
