Hi again! I'm planning to upstream today a little ...
# development
h
Hi again! I'm planning to upstream today a little backend toolchain made to aggregate all counters and histograms, then print them to the console at the end of the run. I think this should always be activated, and toggled with a bool option. What do you think for names?
Copy code
--stats-collector-enable
--stats-aggregator-enable
--debug-stats-enable
e
This used to roughly be
-X
for historical context.
đź‘Ť 1
h
Copy code
register('-x', '--time', type=bool,
             help='Output a timing report at the end of the run.')
h
--stats ? Something pithy would be good
h
@happy-kitchen-89482 as a global option? I'd prefer we do a subsystem I think, in part so we can add things in the future like a
--no-histograms
option (they're super chatty)
h
--stats-on
--stats-show
h
--stats-show
or
--exec-stats-show
? I get nervous claiming
[stats]
--exec-stats-log
?
h
I'd rather err on the side of clarity
what other stats will there be?
But why is this a plugin and not in core?
h
So far it's this:
Copy code
14:59:37.98 [INFO] Counters:
  local_cache_read_errors: 0
  local_cache_requests: 31
  local_cache_requests_cached: 8
  local_cache_requests_uncached: 23
  local_cache_write_errors: 0
  local_execution_requests: 23
  remote_cache_read_errors: 0
  remote_cache_requests: 0
  remote_cache_requests_cached: 0
  remote_cache_requests_uncached: 0
  remote_cache_speculation_local_completed_first: 0
  remote_cache_speculation_remote_completed_first: 0
  remote_cache_write_errors: 0
  remote_cache_write_finished: 0
  remote_cache_write_started: 0
  remote_execution_errors: 0
  remote_execution_requests: 0
  remote_execution_rpc_errors: 0
  remote_execution_rpc_execute: 0
  remote_execution_rpc_retries: 0
  remote_execution_rpc_wait_execution: 0
  remote_execution_success: 0
  remote_execution_timeouts: 0
It is in core, always activated! But it's a subsystem rather than global options:
I'd prefer we do a subsystem I think, in part so we can add things in the future like a --no-histograms option (they're super chatty)
h
Gotcha, my misunderstanding
Nonetheless, I'd prefer an easier scope than
exec-stats
, I think just
stats
is fine?
h
That precludes the global option
--stats
, which I'm fine with. But does that mess up something like
[black].stats
? I don't recall the rules. (I'll test if you don't remember)
@fast-nail-55400 is this what you're thinking for the histogram summary?
Copy code
19:18:51.74 [INFO] Histogram for `local_store_read_blob_size`:
  min: 10
  max: 3069951
  mean: 115912.684
  std dev: 515510.627
  p25: 79
  p50: 91
  p75: 166
  p90: 1644
  p95: 468479
  p99: 3069951
Anything else pertinent like "num buckets"? cc @witty-crayon-22786
f
is median available?
nm that’s just P50 technically
đź‘Ť 1
total number of samples would be useful
đź‘Ť 1
but otherwise lgtm
also “Histogram for” should probably be “Summary of” since this no longer outputs a histogram per se
đź‘Ť 1
h
Copy code
19:24:27.52 [INFO] Summary of `local_store_read_blob_size` observation histogram:
  min: 10
  max: 3069951
  mean: 115914.174
  std dev: 515510.321
  total observations: 190
  p25: 79
  p50: 89
  p75: 166
  p90: 1644
  p95: 468479
  p99: 3069951
w
Would drop std dev, but other than that looks good!
h
how come? hdrhistogram's summary included it. (I have no opinion)
w
And would continue to vote for just inlining into the counter values with a compound key:
Copy code
19:24:27.52 [INFO] local_store_read_blob_size:p50: 89
đź‘€ 1
Um... Just that I've never found it useful?
h
And would continue to vote for just inlining into the counter values with a compound key:
I think that gets too noisy. We have 28 counters, each histogram has 11 values associated with it. So adding 1 histogram means now 39 values in a flat list, ~30% corresponding to only a single metric