https://pantsbuild.org/ logo
#development
Title
# development
h

hundreds-father-404

02/24/2021, 9:18 PM
Hi again! I'm planning to upstream today a little backend toolchain made to aggregate all counters and histograms, then print them to the console at the end of the run. I think this should always be activated, and toggled with a bool option. What do you think for names?
Copy code
--stats-collector-enable
--stats-aggregator-enable
--debug-stats-enable
e

enough-analyst-54434

02/24/2021, 9:32 PM
This used to roughly be
-X
for historical context.
👍 1
h

hundreds-father-404

02/24/2021, 9:33 PM
Copy code
register('-x', '--time', type=bool,
             help='Output a timing report at the end of the run.')
h

happy-kitchen-89482

02/24/2021, 10:15 PM
--stats ? Something pithy would be good
h

hundreds-father-404

02/24/2021, 10:17 PM
@happy-kitchen-89482 as a global option? I'd prefer we do a subsystem I think, in part so we can add things in the future like a
--no-histograms
option (they're super chatty)
h

happy-kitchen-89482

02/24/2021, 10:17 PM
--stats-on
--stats-show
h

hundreds-father-404

02/24/2021, 10:25 PM
--stats-show
or
--exec-stats-show
? I get nervous claiming
[stats]
--exec-stats-log
?
h

happy-kitchen-89482

02/24/2021, 10:27 PM
I'd rather err on the side of clarity
what other stats will there be?
But why is this a plugin and not in core?
h

hundreds-father-404

02/24/2021, 10:28 PM
So far it's this:
Copy code
14:59:37.98 [INFO] Counters:
  local_cache_read_errors: 0
  local_cache_requests: 31
  local_cache_requests_cached: 8
  local_cache_requests_uncached: 23
  local_cache_write_errors: 0
  local_execution_requests: 23
  remote_cache_read_errors: 0
  remote_cache_requests: 0
  remote_cache_requests_cached: 0
  remote_cache_requests_uncached: 0
  remote_cache_speculation_local_completed_first: 0
  remote_cache_speculation_remote_completed_first: 0
  remote_cache_write_errors: 0
  remote_cache_write_finished: 0
  remote_cache_write_started: 0
  remote_execution_errors: 0
  remote_execution_requests: 0
  remote_execution_rpc_errors: 0
  remote_execution_rpc_execute: 0
  remote_execution_rpc_retries: 0
  remote_execution_rpc_wait_execution: 0
  remote_execution_success: 0
  remote_execution_timeouts: 0
It is in core, always activated! But it's a subsystem rather than global options:
I'd prefer we do a subsystem I think, in part so we can add things in the future like a --no-histograms option (they're super chatty)
h

happy-kitchen-89482

02/24/2021, 11:42 PM
Gotcha, my misunderstanding
Nonetheless, I'd prefer an easier scope than
exec-stats
, I think just
stats
is fine?
h

hundreds-father-404

02/24/2021, 11:57 PM
That precludes the global option
--stats
, which I'm fine with. But does that mess up something like
[black].stats
? I don't recall the rules. (I'll test if you don't remember)
@fast-nail-55400 is this what you're thinking for the histogram summary?
Copy code
19:18:51.74 [INFO] Histogram for `local_store_read_blob_size`:
  min: 10
  max: 3069951
  mean: 115912.684
  std dev: 515510.627
  p25: 79
  p50: 91
  p75: 166
  p90: 1644
  p95: 468479
  p99: 3069951
Anything else pertinent like "num buckets"? cc @witty-crayon-22786
f

fast-nail-55400

02/25/2021, 2:21 AM
is median available?
nm that’s just P50 technically
👍 1
total number of samples would be useful
👍 1
but otherwise lgtm
also “Histogram for” should probably be “Summary of” since this no longer outputs a histogram per se
👍 1
h

hundreds-father-404

02/25/2021, 2:24 AM
Copy code
19:24:27.52 [INFO] Summary of `local_store_read_blob_size` observation histogram:
  min: 10
  max: 3069951
  mean: 115914.174
  std dev: 515510.321
  total observations: 190
  p25: 79
  p50: 89
  p75: 166
  p90: 1644
  p95: 468479
  p99: 3069951
w

witty-crayon-22786

02/25/2021, 2:37 AM
Would drop std dev, but other than that looks good!
h

hundreds-father-404

02/25/2021, 2:38 AM
how come? hdrhistogram's summary included it. (I have no opinion)
w

witty-crayon-22786

02/25/2021, 2:38 AM
And would continue to vote for just inlining into the counter values with a compound key:
Copy code
19:24:27.52 [INFO] local_store_read_blob_size:p50: 89
👀 1
Um... Just that I've never found it useful?
h

hundreds-father-404

02/25/2021, 2:45 AM
And would continue to vote for just inlining into the counter values with a compound key:
I think that gets too noisy. We have 28 counters, each histogram has 11 values associated with it. So adding 1 histogram means now 39 values in a flat list, ~30% corresponding to only a single metric