we’ve discussed having a result or “errored” state...
# development
w
we’ve discussed having a result or “errored” state on workunits before, but we don’t currently have that
it’s probably worth adding to help make traces more useful
there are conventions in some of the distributed tracing frameworks for how to mark spans as having an error associated: these docs are about zipkin, where having an
error
property will turn the span “red” it looks like: https://docs.newrelic.com/docs/understand-dependencies/distributed-tracing/trace-api/trace-api-decorate-spans-attributes
cc @hundreds-breakfast-49010, @polite-garden-50641
h
we currently have a distinction between in-progress and completed workunits, that we added in the context of implementing workunit streaming
w
yep
h
and we have the ability to set a per-workunit debug level, which we use for error cases sometimes
I'm a little worried about adding additional state to workunits that does similar things to state on workunits that already exists
w
oh, true. so maybe lifting the level to Error, and changing the description would be enough?
h
what's the difference in principle between a workunit with ERROR level and a workunit with some kind of error marking?
(that doesn't exist yet)
w
there might not be one. it’s just different conventions in various systems
so then i think maybe what might make sense would be to have
with_workunit
have a variant that expects a
Result
, and can add that information to the workunit automatically…?
not sure exactly what the signature would be, but maybe:
Copy code
pub async fn with_result_workunit<F, M>(
  workunit_store: WorkunitStore,
  name: String,
  initial_metadata: WorkunitMetadata,
  f: F,
  final_metadata_fn: M,
) -> F::Output
where
  E: Display,
  F: Future<Output=Result<T, E>>,
  M: for<'a> FnOnce(&'a F::Output, WorkunitMetadata) -> WorkunitMetadata,
{
(ie, the result of the future must be a Result with an Error type that is Display)
or whatever.
h
what would we do with that output?
workunits already have the ability to write arbitrary information to the log at ERROR level
w
We'd change the level of the workunit for a failed Result to Error
h
we can already do that with the functionality to update
WorkunitMetadata
we could have a wrapper
with_result_workunit
that uses the existing
with_workunit
to do this, but I'm not sure that gets us anything
w
Sure. This would just be shorthand for something that is a good idea to do in general
h
if there's a lot of places inthe code where we use
final_metadata_fn
to ony update the level and make no other changes, that seems like a reasonable shorthand to add
w
Basically: about 99% of work units can fail, but probably almost none of them actually set an error level. This would make it easier to do.
h
we're typically setting the error level now with the
level
workunit aware method in python
which is eventually using that metadata-updating mechanism on
with_workunit
in some way
w
right. the case i was thinking about is infrastructure errors. Tom had been using remote caching, and the traces that end up recorded don’t contain the infra errors that happen there
these are the equivalent of exceptions, which rule code can’t handle currently… so it would never end up in the EngineAware API
h
we could definitely modify the workunits that get construct in the remote caching code to set their ERROR state appropriately today. and we could certainly add a new helper method around
with_workunit
in the process
w
yea
@fast-nail-55400: ^ftr: re workunits going to an error state