we ve discussed having a result or errored state on workunit Pants #development

Join Slack

we’ve discussed having a result or “errored” state...

# development

witty-crayon-22786

11/16/2020, 9:06 PM

we’ve discussed having a result or “errored” state on workunits before, but we don’t currently have that

witty-crayon-22786

11/16/2020, 9:06 PM

it’s probably worth adding to help make traces more useful

witty-crayon-22786

11/16/2020, 9:07 PM

there are conventions in some of the distributed tracing frameworks for how to mark spans as having an error associated: these docs are about zipkin, where having an

error

property will turn the span “red” it looks like: https://docs.newrelic.com/docs/understand-dependencies/distributed-tracing/trace-api/trace-api-decorate-spans-attributes

witty-crayon-22786

11/16/2020, 9:07 PM

cc @hundreds-breakfast-49010, @polite-garden-50641

hundreds-breakfast-49010

11/16/2020, 9:08 PM

we currently have a distinction between in-progress and completed workunits, that we added in the context of implementing workunit streaming

witty-crayon-22786

11/16/2020, 9:08 PM

yep

hundreds-breakfast-49010

11/16/2020, 9:09 PM

and we have the ability to set a per-workunit debug level, which we use for error cases sometimes

hundreds-breakfast-49010

11/16/2020, 9:10 PM

I'm a little worried about adding additional state to workunits that does similar things to state on workunits that already exists

witty-crayon-22786

11/16/2020, 9:10 PM

oh, true. so maybe lifting the level to Error, and changing the description would be enough?

hundreds-breakfast-49010

11/16/2020, 9:10 PM

what's the difference in principle between a workunit with ERROR level and a workunit with some kind of error marking?

hundreds-breakfast-49010

11/16/2020, 9:10 PM

(that doesn't exist yet)

witty-crayon-22786

11/16/2020, 9:10 PM

there might not be one. it’s just different conventions in various systems

witty-crayon-22786

11/16/2020, 9:43 PM

so then i think maybe what might make sense would be to have

with_workunit

have a variant that expects a

Result

, and can add that information to the workunit automatically…?

witty-crayon-22786

11/16/2020, 9:45 PM

not sure exactly what the signature would be, but maybe:

Copy code

pub async fn with_result_workunit<F, M>(
  workunit_store: WorkunitStore,
  name: String,
  initial_metadata: WorkunitMetadata,
  f: F,
  final_metadata_fn: M,
) -> F::Output
where
  E: Display,
  F: Future<Output=Result<T, E>>,
  M: for<'a> FnOnce(&'a F::Output, WorkunitMetadata) -> WorkunitMetadata,
{

witty-crayon-22786

11/16/2020, 9:45 PM

(ie, the result of the future must be a Result with an Error type that is Display)

witty-crayon-22786

11/16/2020, 9:45 PM

or whatever.

hundreds-breakfast-49010

11/16/2020, 10:00 PM

what would we do with that output?

hundreds-breakfast-49010

11/16/2020, 10:00 PM

workunits already have the ability to write arbitrary information to the log at ERROR level

witty-crayon-22786

11/16/2020, 10:03 PM

We'd change the level of the workunit for a failed Result to Error

hundreds-breakfast-49010

11/16/2020, 10:06 PM

we can already do that with the functionality to update

WorkunitMetadata

hundreds-breakfast-49010

11/16/2020, 10:06 PM

we could have a wrapper

with_result_workunit

that uses the existing

with_workunit

to do this, but I'm not sure that gets us anything

witty-crayon-22786

11/16/2020, 10:06 PM

Sure. This would just be shorthand for something that is a good idea to do in general

hundreds-breakfast-49010

11/16/2020, 10:07 PM

if there's a lot of places inthe code where we use

final_metadata_fn

to ony update the level and make no other changes, that seems like a reasonable shorthand to add

witty-crayon-22786

11/16/2020, 10:07 PM

Basically: about 99% of work units can fail, but probably almost none of them actually set an error level. This would make it easier to do.

hundreds-breakfast-49010

11/16/2020, 10:07 PM

we're typically setting the error level now with the

level

workunit aware method in python

hundreds-breakfast-49010

11/16/2020, 10:08 PM

which is eventually using that metadata-updating mechanism on

with_workunit

in some way

witty-crayon-22786

11/16/2020, 10:37 PM

right. the case i was thinking about is infrastructure errors. Tom had been using remote caching, and the traces that end up recorded don’t contain the infra errors that happen there

witty-crayon-22786

11/16/2020, 10:38 PM

these are the equivalent of exceptions, which rule code can’t handle currently… so it would never end up in the EngineAware API

hundreds-breakfast-49010

11/16/2020, 10:47 PM

we could definitely modify the workunits that get construct in the remote caching code to set their ERROR state appropriately today. and we could certainly add a new helper method around

with_workunit

in the process

witty-crayon-22786

11/16/2020, 10:53 PM

yea

witty-crayon-22786

11/16/2020, 10:54 PM

@fast-nail-55400: ^ftr: re workunits going to an error state

Open in Slack

Previous Next