# general
f
I noticed some long blocking (~2s here) on `acquire_command_runner_slot` in a pretty small repository when Pants is searching for binaries. I think this is just caused by high parallelism (searching for lots of binaries / doing other tasks) and the lack of pantsd (it's disabled in our CI and for me locally). Has there been any discussion about exempting some tasks from the slot-runner concurrency control, or making I/O-heavy tasks count for less than one CPU core?
c
Sorry, I don't have an answer to your original question, but is that screenshot from passing Pants runs off to OpenTracing?
f
Close! It's a Honeycomb plugin I put together using the workunit logger API. I thought about using OTel, but I haven't bothered yet, and it's likely to be more annoying given how their API works. https://pantsbuild.slack.com/archives/C046T6T9U/p1709072280205739?thread_ts=1709062795.435399&cid=C046T6T9U
c
👀 ! As a Honeycomb user, I'd certainly be interested in taking the plugin for a spin if it's something you could open source.
f
We’ll probably open source it in time, I just threw it together yesterday 😅 If anyone wants to do something similar, the `WorkunitLoggerCallback` provides all the scaffolding you need. You just need to change the transformations appropriately and add an HTTP call. https://github.com/pantsbuild/pants/blob/main/src/python/pants/backend/tools/workunit_logger/rules.py#L69-L70
🔥 1
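For anyone who wants a starting point, here is a minimal, hypothetical sketch of that "transform plus HTTP call" idea: flatten a finished workunit into a Honeycomb batch-API event and POST a batch. The workunit field names (`start_secs`, `duration_nanos`, `parent_ids`, ...) are assumptions about the logged JSON, not the actual plugin's code:

```python
import json
import urllib.request
from datetime import datetime, timezone

def workunit_to_event(workunit: dict) -> dict:
    """Flatten one finished workunit into a Honeycomb batch event.

    The input field names are assumptions about what the workunit
    logger records, not a documented schema.
    """
    start = workunit["start_secs"] + workunit["start_nanos"] / 1e9
    duration_ms = (workunit["duration_secs"] * 1000
                   + workunit["duration_nanos"] / 1e6)
    return {
        "time": datetime.fromtimestamp(start, tz=timezone.utc).isoformat(),
        "data": {
            "name": workunit["name"],
            # Honeycomb's conventional tracing field names.
            "trace.span_id": workunit["span_id"],
            "trace.parent_id": (workunit.get("parent_ids") or [None])[0],
            "duration_ms": duration_ms,
        },
    }

def send_batch(events: list, dataset: str, api_key: str) -> None:
    """POST a batch of events to Honeycomb's events API."""
    req = urllib.request.Request(
        f"https://api.honeycomb.io/1/batch/{dataset}",
        data=json.dumps(events).encode(),
        headers={
            "Content-Type": "application/json",
            "X-Honeycomb-Team": api_key,
        },
    )
    urllib.request.urlopen(req)
```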
b
It's not a perfect answer, but... does increasing the various concurrency numbers work? I guess it might overload the machine for CPU-heavy tasks, though.
f
I assume it would allow those tasks to run earlier, but yeah, there's real CPU-bound work they're competing with, so it's not a great solution.
👍 1
p
Too bad this isn't OTel; I would have been interested in trying it on our builds. What about the API makes this difficult? The wire protocol itself seems like it should support it without too many issues: https://opentelemetry.io/docs/concepts/signals/traces/
f
I don’t think those blobs directly correspond to the wire format. OTLP/JSON and OTLP/HTTP are the wire formats (and involve protobufs) that you’d want to send to something like the OTel Collector (or a vendor). There might be a way to manually create the spans and push them through an exporter, but I haven’t tried. I’ve only used OTel (in other languages) in the “normal” way, where you record the span around the operation as it runs, not after the fact with all the data already available (which is what the callback logger is doing). https://opentelemetry.io/docs/languages/python/instrumentation/ https://opentelemetry.io/docs/specs/otlp/
p
The JSON/proto format doesn't look hugely different from eyeballing it; here's an example: https://github.com/open-telemetry/opentelemetry-proto/blob/ff457cecf46cf219602e587d86d66f3b8cb3efe6/examples/trace.json#L4 and the actual span proto definition: https://github.com/open-telemetry/opentelemetry-proto/blob/ff457cecf46cf219602e587d86d66f3b8cb3efe6/opentelemetry/proto/trace/v1/trace.proto#L86 (this is transitively included in ExportTraceServiceRequest). Mainly, it has `startTimeUnixNano` and `endTimeUnixNano` to push things after the fact.
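To make the after-the-fact push concrete, here is a minimal sketch that wraps one recorded workunit in an OTLP/JSON-shaped export payload (the `resourceSpans`/`scopeSpans` envelope, hex id strings, nanosecond timestamps as decimal strings) and POSTs it to a collector's OTLP/HTTP traces endpoint. The workunit field names are assumptions about the logged JSON, and the endpoint is the conventional local collector default:

```python
import json
import urllib.request

def workunit_to_otlp(workunit: dict, trace_id: str) -> dict:
    """Build an OTLP/JSON-style trace export payload for one workunit.

    Input field names (start_secs, duration_nanos, parent_ids, ...) are
    assumptions about the workunit logger's output, not a real schema.
    """
    start_ns = (workunit["start_secs"] * 1_000_000_000
                + workunit["start_nanos"])
    end_ns = (start_ns
              + workunit["duration_secs"] * 1_000_000_000
              + workunit["duration_nanos"])
    span = {
        # OTLP/JSON carries ids as hex strings and the 64-bit
        # nanosecond timestamps as decimal strings.
        "traceId": trace_id,
        "spanId": workunit["span_id"],
        "name": workunit["name"],
        "kind": 1,  # SPAN_KIND_INTERNAL
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
    }
    if workunit.get("parent_ids"):
        span["parentSpanId"] = workunit["parent_ids"][0]
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": "pants"}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "pants.workunit_logger"},
                "spans": [span],
            }],
        }]
    }

def export(payload: dict,
           endpoint: str = "http://localhost:4318/v1/traces") -> None:
    """POST the payload to an OTLP/HTTP collector endpoint."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```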