# development
w
@hundreds-breakfast-49010: A general thought on workunits: when in doubt, add more.
In particular, adding more workunits inside of process execution would get you more detail about which phase of the process you are in.
Creating the sandbox, running, capturing outputs, etc.
h
@witty-crayon-22786 adding workunits in more parts of the codebase is not a bad idea. but since workunits ultimately generate a user-facing report, we need some sort of way to make arbitrary workunits configurable by users
this is what named `@rule`s do, but we don't have that configurability for the other places in the pants codebase where we create a workunit
w
sure. but process execution is an intrinsic. and a particularly important one
so, let's say: you currently have a workunit for the process execution Node. that is outside the process execution lock.
if you had another inside the lock, you could differentiate "time to run a process, accounting for time waited due to contention" from "time running the process", and you could subtract those to know how long you waited to start
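as a sketch (not the real engine API; `execution_slot` here stands in for whatever bounds concurrent execution), the subtraction is just two timestamps:
```rust
use std::time::Instant;
use tokio::sync::Mutex;

// sketch only: one timestamp taken outside the lock, one inside.
// subtracting the two durations gives the time spent waiting.
async fn run_process(execution_slot: &Mutex<()>) {
    let requested = Instant::now();            // outer workunit would start here
    let _guard = execution_slot.lock().await;  // may block under contention
    let started = Instant::now();              // inner workunit would start here
    // ... create the sandbox, run the process, capture outputs ...
    let finished = Instant::now();
    println!(
        "waited {:?} for a slot, ran for {:?}",
        started - requested,
        finished - started,
    );
}
```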
h
and we do have this, since the presence of a `user_facing_name` is what controls whether a workunit is created, and all `MultiPlatformExecuteProcessRequest`s have one
w
if you then add more workunits below the lock, you would be tracking the time for each phase
@hundreds-breakfast-49010: understood. my point is that there is information beyond just "which Node is running" here. especially with processes, there is more fine-grained, useful information available.
(or... could be with a few more workunits added)
i don't know exactly how that maps to the CLI ui, but i think that it does.
h
workunits don't actually map directly to the CLI UI currently
workunits control what zipkin JSON gets generated. toolchain cares about this because we have a plugin that does some post-processing on that JSON
w
so: i think that if you get the workunits right, and there is enough information there, one path forward for the CLI UI might be to walk a graph of workunits, rather than the capital-G Graph of Nodes
h
the timings displayed in the v2 UI are separate
but yes I do think it makes sense for those to be the same system
we should be calculating the timing of nodes in one canonical place, and using those timings everywhere it's relevant to display them
that's part of what my PR from a few weeks ago was meant to do, and I'll probably return to that in the near future
w
the point above though is less about "where the timing is captured", and more about "where are timings stored"
if the CLI UI was rendering from the WorkunitStore (for example) rather than the Graph, then it would render the more fine-grained process information automatically, without any additional Nodes being created
h
sure. but those are actually being stored in two places too
ah, yeah, I think that makes sense
ultimately I don't think the `EntryState::Running { start_time, ... }` field should exist. everything we do with that currently should be done with a `WorkUnit`
w
i think that i agree.
because we had already discussed not rendering it except for Processes anyway.
h
right. I think it might also make sense to render `Snapshot`s, since disk IO is something that I've noticed taking a noticeably long time
w
yea.
h
I also wonder if it makes sense to always create `WorkUnit` data structures and put them into the store, and to do the filtering of what workunits to pass to python at a different level
w
unclear. possibly.
h
right now there's code in `nodes.rs` that will either create a workunit or not depending on the value of `user_facing_name` for a node
w
yep.
h
maybe instead that should set a `render: bool` field on `WorkUnit`
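something like this (a hypothetical shape, not the actual engine structs; field names are made up for illustration):
```rust
// hypothetical sketch: every node records a WorkUnit, and each consumer
// (the v2 UI, the zipkin exporter) filters on `render` itself, instead
// of nodes.rs deciding up front whether a workunit exists at all.
struct WorkUnit {
    name: String,     // the node's canonical name
    start_secs: f64,
    end_secs: f64,
    render: bool,     // should the v2 UI display this?
}

fn renderable(units: &[WorkUnit]) -> Vec<&WorkUnit> {
    units.iter().filter(|wu| wu.render).collect()
}
```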
w
well, that's good i think....?
i think it's good that not all Nodes end up rendered. some just won't be useful.
a variant on this though
h
the only reason to care is if we want to have different rules about what we display in the v2 UI and what we pass to zipkin/the toolchain plugin
but I think we probably do want that
w
maybe. unclear. i could imagine unification there... see what i was saying about "adding more workunits" and wanting them rendered.
so, another tack for a second
h
sure
w
if we continue to walk the Graph of Nodes to render the CLI ui:
it's possible that `user_facing_name` could get a little bit richer and become "current rendered line", and could 1) change names over time 2) include runtime
the implementation for a process would be to (for the same Node), change its name based on which phase it was in.
... that is maybe challenging.
... since it doesn't answer the storage question.
h
I don't think the `WorkUnit` or the `Node` is the right place for that information to canonically live. that's a v2 UI concern
`user_facing_name` is part of what you need to build up the v2 UI line, but not all of it
w
well, that's why i was suggesting that it could include runtime. but yea.
h
toolchain wants `user_facing_name`, at least for rules, to be a static, canonical name for a python `@rule`
a string that includes the runtime should be something other than `user_facing_name`
(maybe we should rename the property of a node that is `user_facing_name` to something else, like `canonical_name`?)
w
i also suggested a new name for the function, heh 😃
but anyway: that tangent is probably less interesting than having the CLI UI consume workunits.
h
yeah
w
because "nodes are always the most useful granularity to render" doesn't really play out.
h
although for me right now, the v2 UI is a secondary priority to making improvements to the workunit json
w
more fine-grained information about a process makes total sense, i think.
h
I guess I should say the "zipkin json"
I've been doing some testing on my own machine in the last day or so, and it actually seems like the workunit associated with the `MultiPlatformExecuteProcessRequest` is capturing more or less the amount of real time that EPRs are taking
certainly as you say adding more granularity couldn't hurt
that workunit is outside the async lock, and we could add them inside the lock
er, add additional workunits inside the lock
w
yep
there is some overhead to the sandbox itself, and i don't think that we know what it is.
but yea, good to hear.
h
I want to keep my eye on the big picture. the two high-level problems I need to solve right now are 1) make sure that the JSON workunit/zipkin output that the toolchain uses is useful and actionable and 2) make sure that the timestamps in the v2 console UI are displaying useful times and not capturing time when some process is waiting to run
it's definitely possible that there's sandbox overhead that didn't show up when I was testing yesterday with my specific workflows, but that is relevant in other workflows
w
afaict, this thread touches both of those. a workunit inside the lock would improve (1). and then additionally switching the CLI to consuming the workunit graph would fix (2)
h
yeah I think it would
I'm also doing other work pertaining to workunits right now, to fix specific problems toolchain has, and I want to make sure that any redesigns to where workunits get created fits in there
btw, yesterday @fast-nail-55400 informed me about this crate: https://docs.rs/tracing/0.1.13/tracing/index.html#using-the-macros
which has some functionality for explicitly tracing and profiling `Future`s
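e.g. something like this (a sketch assuming a recent 0.1.x release of `tracing`; the attribute macro attaches a span to the async fn's future across await points):
```rust
use tracing::{info, instrument};

// sketch: entering `fetch` opens a span that follows the returned
// future across await points, much like a workunit does today.
#[instrument]
async fn fetch(digest: String) {
    info!("reading from the store"); // event recorded inside the span
    // ... await disk or network IO here ...
}
```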
w
yes.
probably worth using at some point.
h
with the idea that we might be able to use this to replace the workunit store data structure
I spent a couple hours playing with it yesterday and found it hard to get a proof of concept working
I'm not sure what I think about that yet, I haven't spent enough time trying to get the library to work so far, but I was curious if you've thought about whether `tracing` annotations on various `Future`-returning functions could do the same thing as the `workunit_store` code?
w
yea. it doesn't have any particular secret sauce for recording futures, afaik. it's mostly just good that it is standardized.
tracing annotations would ... cause more workunits to be created, yes.
h
mostly what I'm concerned about is not creating more workunits or more workunit-like data structures without having some way to filter them that current clients of workunits (i.e. zipkin tracing and the toolchain plugin) can live with
w
yea, that would be one concern with the `tracing` crate, i think.
h
it's also a concern with adding more workunits under the lock
w
since it would be standardized, you'd have spans/workunits being created in a bunch more places
so you'd need to filter them.
@hundreds-breakfast-49010: well, it's under your control at least. if it's too noisy, can remove it again, heh
h
but yeah if we need to build an additional workunit filtering layer anyway, then creating tracing structures in a standard way is less of a problem
yeah we can write the toolchain plugin however we want 🙂
but I want to make sure that any changes I introduce to mainline pants are clean and sensible and don't cause other problems down the line
w
sorry, i meant that if recording the workunit in pants doesn't look useful, can stop doing it. i wasn't suggesting consumer-side filtering.
(maybe you need that anyway, but)
h
I think that any client of workunits, including toolchain's, will and can be expected to do some of its own filtering
w
k