< hundreds father 404> re `Starting` messages for test proce Pants #development

<@UB2J9BQA0>: re: `Starting` messages for test pro...

witty-crayon-22786

09/28/2020, 8:52 PM

@hundreds-father-404: re:

Starting

messages for test processes

witty-crayon-22786

09/28/2020, 8:52 PM

not critical, but they were useful to have in CI

witty-crayon-22786

09/28/2020, 8:55 PM

it sounds like maybe we mark the

ProcessResult

EngineAware

, and then silence the workunits created by the

Process

itself

witty-crayon-22786

09/28/2020, 8:57 PM

…mm. ok, it’s higher level than that:

Copy code

EnrichedTestResult

👍 1

hundreds-father-404

09/28/2020, 8:58 PM

I think that the

EngineAwareParameterType

might work. Set it on the

PythonTestFieldSet

witty-crayon-22786

09/28/2020, 8:58 PM

and we can’t rely on the

@rule

that produces that workunit to tell us when it has started, for two reasons: 1) it’s not under the process lock so it doesn’t know that it has actually started, 2) the

@rule

won’t know its own identity when it starts

hundreds-father-404

09/28/2020, 8:59 PM

the @rule won’t know its own identity when it starts

Oh, yeah, I think

EngineAwareParameter

only impacts some things like stacktraces, but not the “Starting” message

hundreds-father-404

09/28/2020, 9:00 PM

+1 also that what we really want is to say “starting” when it’s the subprocess. Not the rule.

witty-crayon-22786

09/28/2020, 9:00 PM

the Params probably should be usable in the workunit description via templating, yea. but still have to worry about (1)

hundreds-father-404

09/28/2020, 9:01 PM

This is probably a design smell, but might we want something like?

Copy code

starting_level=<http://LogLevel.INFO|LogLevel.INFO>,
ending_level=LogLevel.DEBUG,

The

Process

constructor can take

level

starting_level

and

ending_level

. If you only do

level

, then it will automatically set the other two.

witty-crayon-22786

09/28/2020, 9:02 PM

the bounded commander runner lock is basically in the optimal spot right now, (un)fortunately. if we moved it “higher” so that preparatory rules didn’t even start running until there were slots available, we’d potential block preparatory work from starting, and then only run it under the lock

👍 1

witty-crayon-22786

09/28/2020, 9:04 PM

https://pantsbuild.slack.com/archives/C0D7TNJHL/p1601326901051300?thread_ts=1601326361.048700&cid=C0D7TNJHL maybe… unclear. it’s really nice that explicit log messages are mostly not necessary because the workunit ends up containing this info. but this is a sticky problem, because only the `Process`’s workunit(s) know when it is actually running. cc @hundreds-breakfast-49010

hundreds-breakfast-49010

09/28/2020, 9:06 PM

@hundreds-father-404 we already have a mechanism for altering the level of a workunit while it is running

hundreds-breakfast-49010

09/28/2020, 9:06 PM

it's one of the engineaware methods

witty-crayon-22786

09/28/2020, 9:06 PM

@hundreds-breakfast-49010: yes, but see the other context in the thread

hundreds-breakfast-49010

09/28/2020, 9:07 PM

what's the problem we're trying to solve with

Starting

messages?

witty-crayon-22786

09/28/2020, 9:07 PM

https://pantsbuild.slack.com/archives/C0D7TNJHL/p1601326374048800?thread_ts=1601326361.048700&cid=C0D7TNJHL

hundreds-breakfast-49010

09/28/2020, 9:07 PM

did we remove starting messages somehow?

witty-crayon-22786

09/28/2020, 9:08 PM

that’s described at the top of the thread

witty-crayon-22786

09/28/2020, 9:08 PM

the

Process

is marked debug

hundreds-father-404

09/28/2020, 9:08 PM

but this is a sticky problem, because only the Process’s workunit(s) know when it is actually running

I think that keeping the log message on the

Process

is correct. The

Process

also has access to the information it needs: the file/target it’s running on Imo, it seems like this is about a need to have the

Process

be INFO at its start, but DEBUG at completion

witty-crayon-22786

09/28/2020, 9:08 PM

and the rules above it are also. they only lift their levels as they complete

👍 1

hundreds-breakfast-49010

09/28/2020, 9:09 PM

this sounds like we're trying to use the concept of a workunit's level for something it wasn't intended for

hundreds-breakfast-49010

09/28/2020, 9:09 PM

it sounds like a workunit isn't actually the level of granularity we want for logging information about it

hundreds-breakfast-49010

09/28/2020, 9:10 PM

because we're trying to tune specific workunits by setting their level to something else, and then realizing that this is silencing other useful logging messages

witty-crayon-22786

09/28/2020, 9:10 PM

@hundreds-father-404: i think that i agree. i wonder if pushing

EngineAwareReturnType

down far enough would allow us to actually configure completion info for the

Process

witty-crayon-22786

09/28/2020, 9:12 PM

@hundreds-breakfast-49010: i’m not sure that they’re the wrong tool yet, per-se… because at a fundamental level, it is the start and end of the process itself that we care about here

witty-crayon-22786

09/28/2020, 9:12 PM

but we can’t push the information/logic that we want low enough to get it attached to the Process’s workunit

hundreds-breakfast-49010

09/28/2020, 9:13 PM

what information/logic is that? I'm still not sure exactly what change happened

hundreds-father-404

09/28/2020, 9:13 PM

would allow us to actually configure completion info for the Process ?

I don’t think we want this. Right now, streaming of test results is done in

test.py

. When you’re adding a new test implementation, you’ll get it working correctly no matter what. That’s important

hundreds-breakfast-49010

09/28/2020, 9:13 PM

we made

Process

workunits DEBUG instead of INFO, and that got rid of the "Starting" message for

Process

workunits?

👍 1

witty-crayon-22786

09/28/2020, 9:13 PM

@hundreds-breakfast-49010: test completion is logged via https://github.com/pantsbuild/pants/blob/f90b9826f064af61556c92e4bbc573861ee0b9fe/src/python/pants/core/goals/test.py#L75 right now

witty-crayon-22786

09/28/2020, 9:14 PM

and the processes were moved to Debug, so you don’t get logs for them starting/finishing, but you do see them in the dynamic UI

hundreds-breakfast-49010

09/28/2020, 9:16 PM

why were they moved to Debug?

hundreds-father-404

09/28/2020, 9:16 PM

To clarify, the reason we moved the

Process

to Debug is that it’s really noisy for us to say

Completed process

and then immediately after stream the result via

test.py

, which uses

EngineAwareReturnType

hundreds-father-404

09/28/2020, 9:17 PM

But, in a perfect world, we would still say “Starting process”. Just not the “Completed” part

hundreds-breakfast-49010

09/28/2020, 9:17 PM

why's that? that seems like two conceptually-separate messages

hundreds-breakfast-49010

09/28/2020, 9:17 PM

"pants finished executing some process" and "here is the result of a test"

witty-crayon-22786

09/28/2020, 9:17 PM

@hundreds-father-404: but also, only if it failed, right?

witty-crayon-22786

09/28/2020, 9:18 PM

maybe the answer is to move test Processes back to info, and just have redundancy only for failed tests?

witty-crayon-22786

09/28/2020, 9:19 PM

i think that Benjy made the swap to “only dump output on failure”, and that has been a good default.

hundreds-father-404

09/28/2020, 9:19 PM

“pants finished executing some process” and “here is the result of a test”

From a user perspective, the implementation detail would be leaking imo. There isn’t any new informtation gained by saying “completed process” and then immediately after “here’s the process output” Taking a step back from the current mechanisms, the user cares about this when

--no-dynamic-ui

is set: * When the test run starts. * When the test run finishes, and what its status + output is, where output depends on the

--test-show-output

option * Not too noisy.

hundreds-father-404

09/28/2020, 9:20 PM

but also, only if it failed, right?

No, we still need to say “test succeeded” vs. “test failed”. The

Process

will only say “test finished running” but zero insight on what the result was

witty-crayon-22786

09/28/2020, 9:20 PM

* When the test run finishes, and what its status + output is, where output depends on the
--test-show-output
option

yes, but the default is “only on failure”, in which case you get a ton of output, and the extra line is a non-issue i think.

witty-crayon-22786

09/28/2020, 9:21 PM

ah. dang. yea.

hundreds-breakfast-49010

09/28/2020, 9:23 PM

I'm not sure the process bit of it is an implementation detail. users care about when the test run starts because they care about when the test process actually runs

hundreds-breakfast-49010

09/28/2020, 9:23 PM

which means they should care about when it stops running too, regardless of outcome

hundreds-breakfast-49010

09/28/2020, 9:23 PM

that's useful information about how long it takes to run one test process

hundreds-breakfast-49010

09/28/2020, 9:23 PM

if users didn't care about that, we could just make the entire

Process

workunit DEBUG and not worry about when a given test started

hundreds-father-404

09/28/2020, 9:25 PM

which means they should care about when it stops running too, regardless of outcome

Yes, they do care about this. Strong agreement. But they already get that via

142421.68 [INFO] Completed: Run tests - src/python/pants/util/strutil_test.py:tests succeeded.

It would be confusing to get that and a message saying “Completed: Run tests for src/python/pants/util/strutil_tests.py:tests`. Why are you getting both? They say basically the same thing, but one is less helpful.

hundreds-breakfast-49010

09/28/2020, 9:26 PM

the second one is the workunit for the

@rule

right?

witty-crayon-22786

09/28/2020, 9:26 PM

so: how do we push down richer information or logic to the Process

hundreds-father-404

09/28/2020, 9:27 PM

so: how do we push down richer information or logic to the Process

Which I don’t think we want from a Plugin API perspective https://pantsbuild.slack.com/archives/C0D7TNJHL/p1601327586055900?thread_ts=1601326361.048700&cid=C0D7TNJHL Perhaps there is a way to make it more automatic though

witty-crayon-22786

09/28/2020, 9:43 PM

@hundreds-father-404: so, ignoring the “tests” case for now (remains an open question… i’ll file a ticket), it looks like a lot of processes got moved to Debug at various times

witty-crayon-22786

09/28/2020, 9:43 PM

protobuf, code coverage, etc.

witty-crayon-22786

09/28/2020, 9:44 PM

i think that if we don’t have a particular

@rule

rendering for those (as in the test/lint/etc case), we should have those at info so that they get log output.

witty-crayon-22786

09/28/2020, 9:46 PM

i’m currently looking at the CI issues that Asher/Tom experienced, and we don’t have much to show how long things took. (i’m still thinking that logging “some of” what the UI renders would be a good idea, but i don’t think it’s a replacement for the Started/Completed messages in most case)

witty-crayon-22786

09/28/2020, 9:47 PM

we only log processes if they actually ran (ie, didn’t hit the process cache) so they’re not particularly noisy i don’t think.

hundreds-father-404

09/28/2020, 9:48 PM

I personally disagree. For example, Protobuf gets run dozens of times and I think it would be really noisy to get every single run. I do think though that the desired behavior changes between CI vs. locally. Locally, where you have a smaller terminal, I’m personally extra sensitive to noisy tools. You also already have the dynamic UI to show what’s going on. But in CI, it is more helpful to get a tasteful amount of logging

witty-crayon-22786

09/28/2020, 9:48 PM

For example, Protobuf gets run dozens of times and I think it would be really noisy to get every single run.

only if the files have changed

witty-crayon-22786

09/28/2020, 9:49 PM

i think that the variability here is automatically handled by the fact that you hit the cache for these processes… they’re only logged if they’re actually run, and that should only be your first run locally (per branch or w/e)

witty-crayon-22786

09/28/2020, 9:51 PM

on your second run, if you’ve edited a protobuf file, then yes: you’ll see logging for the compiler running. that’s probably good though.

hundreds-father-404

09/28/2020, 9:51 PM

and that should only be your first run locally

I do think that we need to consider your very first run for Pants, though. Are you going to get 50 lines of logs that you have to parse through?

witty-crayon-22786

09/28/2020, 9:54 PM

yes, and then (almost) never again.

witty-crayon-22786

09/28/2020, 9:54 PM

(…tool/branch changes, etc.)

witty-crayon-22786

09/28/2020, 9:56 PM

But I suppose that Bazel doesn't actually render a line per process either. Hm.

witty-crayon-22786

09/28/2020, 9:59 PM

alright. i’ll just look into seeing what can play the role of

--dynamic-ui

when it is disabled, and then we’ll see where we are afterwards.

👍 1

hundreds-father-404

09/28/2020, 10:10 PM

Yeah I think my takeaway from this convo is that

--no-dynamic-ui

is more important than I thought. We don’t only want it to toggle the “Starting” message. For many workunits, we also want it to toggle the “Completed” message

hundreds-father-404

09/28/2020, 10:11 PM

That’s part of the reason with things like Protoc I’m opposed to promoting to INFO. When you have the dynamic UI, you already see that Protoc finished; I don’t think it’s important enough to persist in your terminal that it finished 12 times But, if you didn’t have the dynamic UI to see that it finished, such as in CI, then it is more useful to know that.

witty-crayon-22786

09/28/2020, 10:28 PM

yea, i’m unclear on what the strategy for

--no-dynamic-ui

will be (tweaking level vs finding a useful periodic thing that we can dump), but i think that i agree.

hundreds-father-404

09/28/2020, 10:32 PM

The strawman design that’s been floating in my head is to add a new type that allows you to customize how things appear in each context:

{dynamic ui x no_dynamic_ui} x {starting_level x ending_level}

The constructor would be ergonomic, such that you can only set

level=LogLevel.DEBUG

for example and it will do that for everything. Or, you can get fancier if you want tight control, like we want with

test

. What’s holding me back is that logging is already a little tricky to reason about. And it’s really easy for someone to accidentally change something and break things without realizing. This adds lots of permutations But then on the other side of my internal debate, I think it’s crucial we have an excellent UX for every edge case. Even if our code has to be complex to support it, I think that it’s worth it so that

--dynamic-ui

isn’t too noisy, but also

--no-dynamic-ui

is informative enough.

witty-crayon-22786

09/28/2020, 10:36 PM

i’m hoping that there is something clearly useful that we can log in the

--no-dynamic-ui

case that aligns the on/off case so that individual nodes don’t have to worry about that.

witty-crayon-22786

09/28/2020, 10:36 PM

but yea, we’ll see.

Open in Slack

Previous Next