# development
w
mostly unrelated to the above. @hundreds-breakfast-49010: re https://github.com/gshuflin/pants/commit/a570414cbdbddfd995f50d2aa90df7ea6050d72a and exceptions:
i expect that one of the challenges will be that there are two cases: 1) if a Node has `waiters` while it is running, when it fails it's pretty easy to send the failure to them directly, 2) but for requests that arrive after the Node has failed, you cannot have "Complete"d the `Node`, otherwise it won't run again
so i think that what you were doing in https://github.com/gshuflin/pants/commit/87eea13575c1791495dc07d804b7926e30665c61 is closer to what you need
h
what exactly is the `waiters` list meant to represent?
w
basically: if a `Node` fails, it cannot "Complete": it must go back to `NotStarted`
your second edit tries to do everything in `get`, but for anyone trying to `get` the Node later, you need to start running again, and the right way to do that is probably just to transition right to `NotStarted` rather than Completing
@hundreds-breakfast-49010: look at the docstring on the struct (EDIT: sorry, nothing useful there. see below.)
(it's roughly: other Nodes/Futures that are waiting for this Node to complete)
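(For concreteness, a minimal, self-contained sketch of the two failure paths described above. This is not the actual pants graph code: `EntryState`, `Item`, and `Error` are placeholders chosen to mirror the conversation, and the real entry type is more involved.)

```rust
// A sketch of the two failure cases: notify waiters directly, but never
// memoize the failure -- transition back to NotStarted instead.
use std::sync::mpsc::{channel, Sender};

type Item = String;
type Error = String;

enum EntryState {
    NotStarted,
    // While running, other Nodes/Futures register here to be notified on completion.
    Running { waiters: Vec<Sender<Result<Item, Error>>> },
    // Only successful values are ever stored: "Nodes never store a failure".
    Completed(Item),
}

impl EntryState {
    // Case 1: the Node fails while it has waiters. Send the failure to them
    // directly, but do NOT store it: go back to NotStarted so that a later
    // `get` re-runs the Node instead of observing a memoized failure.
    fn fail(&mut self, err: Error) {
        if let EntryState::Running { waiters } = std::mem::replace(self, EntryState::NotStarted) {
            for waiter in waiters {
                let _ = waiter.send(Err(err.clone()));
            }
        }
    }

    // Case 2: a later request finds the Node NotStarted (because the failure
    // was not memoized) and must start it running again.
    fn get(&mut self) -> &'static str {
        match self {
            EntryState::NotStarted => "start running the Node again",
            EntryState::Running { .. } => "register as a waiter",
            EntryState::Completed(_) => "return the memoized value",
        }
    }
}

fn main() {
    let (tx, rx) = channel();
    let mut entry = EntryState::Running { waiters: vec![tx] };
    entry.fail("rule threw an exception".to_string());
    // The waiter sees the failure...
    assert_eq!(rx.recv().unwrap(), Err::<Item, Error>("rule threw an exception".to_string()));
    // ...but the entry holds no memoized failure, so a new request re-runs it.
    assert_eq!(entry.get(), "start running the Node again");
}
```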
h
that makes sense
the issue I ran into with that earlier commit
w
@hundreds-breakfast-49010: i would say that probably the clearest way to guide this refactoring would be to change the type stored in `Completed` and `EntryResult` from a `Result<Item, Error>` to just `Item`
because the goal is approximately: "Nodes never store a failure"
and so the typesystem can help here.
or maybe just changing the thing inside `Completed` to an `Item`...?
one of those.
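(A before/after sketch of the type change being suggested; the shapes below are illustrative, not the actual `EntryState`/`EntryResult` definitions in the pants source.)

```rust
type Item = String;
type Error = String;

// Before: a Completed entry can hold a failure, so the graph can hand back a
// memoized exception on later requests.
mod before {
    use super::{Error, Item};
    pub enum EntryState {
        NotStarted,
        Running,
        Completed(Result<Item, Error>),
    }
}

// After: Completed can only hold an Item. A failed run has nowhere to be
// stored, so the only legal transition on failure is back to NotStarted, and
// the type system enforces "Nodes never store a failure".
mod after {
    use super::Item;
    pub enum EntryState {
        NotStarted,
        Running,
        Completed(Item),
    }
}

fn main() {
    // With the "after" shape, storing a failure cannot even be expressed:
    // after::EntryState::Completed(Err("boom".to_string())); // does not compile
    let _ok = after::EntryState::Completed("a value".to_string());
    let _still_possible = before::EntryState::Completed(Err("boom".to_string()));
}
```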
h
was that if I just dropped down to that `NotStarted` case when a rule threw an exception, I would see the deliberate exception I put in `create_binary_rule` get thrown about eight times, which seemed excessive
but now that i think about it, I suspect that's what we want
since I think we expect that most exceptions that rules throw in practice will be nondeterministic
w
yea, unfortunately: that's what we're asking for.
h
so, if, during the graph traversal process, we get a node that throws, we set it to `NotComplete`
w
i continue to be uncomfortable doing this, to be clear. avoiding non-determinism would be a better goal. cc @happy-kitchen-89482
h
and then anything upstream of it that needs it will just request that node, since it's `NotComplete`
and it's only in this artificial case where I'm deliberately always throwing that I see the rule run and the exception thrown multiple times
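(A toy model of why a deliberately always-failing rule runs once per upstream request when failures are not memoized; this simplifies the engine down to a single entry and a run counter.)

```rust
// Not the engine's actual code: a simplified model of "failures are not
// memoized, so every dependent request re-runs the failing node".
enum Entry {
    NotStarted,
    Completed(String),
}

fn get(entry: &mut Entry, runs: &mut u32) -> Result<String, String> {
    match entry {
        Entry::Completed(item) => Ok(item.clone()),
        Entry::NotStarted => {
            // Run the underlying rule. Here it always throws.
            *runs += 1;
            let result: Result<String, String> = Err("deliberate exception".to_string());
            match result {
                Ok(item) => {
                    // Successes are memoized as usual.
                    *entry = Entry::Completed(item.clone());
                    Ok(item)
                }
                // On failure, stay NotStarted: nothing is memoized, so the
                // next dependent that asks for this node will run it again.
                Err(e) => Err(e),
            }
        }
    }
}

fn main() {
    let mut entry = Entry::NotStarted;
    let mut runs = 0;
    // Eight upstream requests, as in the create_binary_rule experiment:
    for _ in 0..8 {
        let _ = get(&mut entry, &mut runs);
    }
    // With a memoized failure this would be 1; without it, it is 8.
    assert_eq!(runs, 8);
    println!("rule ran {} times", runs);
}
```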
w
but if we're certain we need to, then that would be the way.
h
@witty-crayon-22786 benjy's argument is that we can't guarantee that users will be perfect about writing rules that don't throw, and we don't want to poison the cache with the failed results of these rules
anyway, if, hypothetically, we were able to enforce that every python `@rule` was completely deterministic, I think it would still be fine to change the type in the way you describe
w
we can by making it very hard to do the wrong thing
h
because in that hypothetical scenario we would be statically enforcing that a rule either always returns something useful, or always fails, and the latter would be useless
w
caching exceptions makes it hard to do the wrong thing
h
"caching exceptions makes it hard to do the wrong thing" <- I'm not sure I understand the logic behind this, and I think that's becuase I'm not clear on what does and does not constitute "the wrong thing"
w
the reason you care about caching exceptions at all is that you are trying to do something non-deterministic in a rule, correct?
if you weren't trying to do something non-deterministic, it would be a tree falling in a forest.
h
yeah, I think most of these cases are in having rules that make network requests, which we don't yet have an intrinsic for
w
for precisely this reason.
h
but even if we do implement that intrinsic, there will be other things we're not thinking about at the moment, that prove to be nondeterministic
or maybe things we think are deterministic but that, in rare cases, turn out not to be
w
the thing about network requests is that in most cases they are non-deterministic even when they succeed
h
yeah, we definitely do want an intrinsic for network requests, we just don't have one yet (I was starting to work on one and maybe should jump back to it in the very near future)
w
it's all tied up in the same question, so doing a bit of design for the entire question/problem ("we'd like to run a potentially non-deterministic resolve codepath") would be good
because i think that i've suggested breaking your resolver out into a pex/separate process, and i think that i would still recommend that.
because non-determinism of processes is a pattern we already expect to need to account for (due to flaky tests, resolver network access, etc)
h
it sounds like you're suggesting that anything that pants does that might be legitimately nondeterministic should be put into a separate process so it can be run with ExecuteProcessRequest, which already handles process nondeterminism?
w
that's roughly where we are right now, yes.
none of the built-in rules are non-deterministic: only those intrinsics (ExecuteProcessRequest and UrlToFetch)
this also connects to the retry question.
h
if I remember correctly we had a draft design for the retry question and then were able to work around it for tests so it got tabled
but yeah the broader question is definitely tied into the same notion of correctly handling nondeterminism
benjy's going to say that no matter what we do, we can't perfectly enforce that no user ever creates a non-deterministic `@rule`
(although as you say we can make it harder to do the wrong thing)
w
we can with the right sandboxing. certainly if we banned/whitelisted stdlib imports, it would go a long way.
(bazel went so far as to use starlark, which was likely "too much" sandboxing... or at least the wrong compatibility tradeoff)
h
right now there's nothing stopping a rule-writer from importing arbitrary python into their rules, right?
like, if someone writes a plugin that provides some rules, they can import their own library, right?
and use that library code in their rules?
if we allow that then there's no way we can in principle guarantee that no exception will be thrown in a rule
I think even if we strictly whitelist std library imports that will be really hard to get right
w
sure. but whitelisting certain stdlib imports is definitely a thing that we should discuss doing.
h
I understand your discomfort with this (and I will read this thread in detail for more context). I agree that the topline has to be "rules must be deterministic", and I'm confident we can make our rules adhere to that. I'm just concerned about balancing that out with a concession to human error, especially with in-repo custom rules at various orgs.
w
@happy-kitchen-89482: what about leaning in on import whitelisting instead as a way to solve the same problem?
the idea would be to guide folks toward using external processes to isolate non-determinism
h
we can build abstractions within pants to isolate non-determinism other than ExecuteProcessRequest
that's what the network request intrinsic would be (or any other similar intrinsics we wrote)
w
Sort of: it depends on whether the network request intrinsic would need to expose failure to @rules
h
hm, actually, right now requesting an EPR can fail, right?
and that failure is manifested as a python exception, right?
yeah we have that `ProcessExecutionFailure` exception
it's raised just like any other python exception in a rule, so a client of this rule wouldn't even be able to manually catch that exception; you just have to be aware of the fact that if you request an `ExecuteProcessResult` from a `FallibleExecuteProcessResult` it might fail
so maybe this is something we should remove from the pants codebase, but it's rule-nondeterminism that we already have and deliberately put there
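(Sketching the relationship described above, in Rust for consistency with the other sketches here; in pants these are Python-side types and the conversion happens in a @rule, so the field names and error payload below are illustrative only.)

```rust
// Requesting an ExecuteProcessResult from a FallibleExecuteProcessResult
// "fails" (raises, on the Python side) whenever the exit code is non-zero;
// a rule consuming ExecuteProcessResult never handles the failure directly.
struct FallibleExecuteProcessResult {
    stdout: Vec<u8>,
    stderr: Vec<u8>,
    exit_code: i32,
}

struct ExecuteProcessResult {
    stdout: Vec<u8>,
    stderr: Vec<u8>,
}

// Stand-in for the ProcessExecutionFailure exception.
#[derive(Debug)]
struct ProcessExecutionFailure {
    exit_code: i32,
}

fn upgrade(
    fallible: FallibleExecuteProcessResult,
) -> Result<ExecuteProcessResult, ProcessExecutionFailure> {
    if fallible.exit_code == 0 {
        Ok(ExecuteProcessResult {
            stdout: fallible.stdout,
            stderr: fallible.stderr,
        })
    } else {
        Err(ProcessExecutionFailure {
            exit_code: fallible.exit_code,
        })
    }
}

fn main() {
    let failed = FallibleExecuteProcessResult {
        stdout: Vec::new(),
        stderr: b"resolver error".to_vec(),
        exit_code: 1,
    };
    // This is the failure-shaped nondeterminism that rules already see today.
    assert!(upgrade(failed).is_err());
}
```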
w
yes. But the question is whether we want to encourage more of it and expose it further to users, or to isolate it to the process usecase and see if that makes things simpler
h
Just to clarify to myself: It should always be correct to not memoize something, right? It might impact performance but never correctness.
w
correct
h
@witty-crayon-22786 one thing @happy-kitchen-89482 suggested was that, maybe we want failure-memoization to be something toggleable on a per-rule basis
i.e. we would default to our current behavior, of caching rust `Err` values representing an exception-throwing `@rule`, but if we had a rule we knew to be flaky (so, assuming that most of the time most `@rule`s are deterministic), we could add an argument marking the rule as such
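(Purely hypothetical: a sketch of what the engine-side decision for such a per-rule toggle could look like. The `memoize_failures` flag and the types below are invented for illustration and do not exist in pants.)

```rust
// Hypothetical per-rule failure-memoization toggle, engine side.
type Item = String;
type Error = String;

struct Rule {
    // Would default to true (current behavior: cache the Err), and be set to
    // false for rules known to be flaky.
    memoize_failures: bool,
}

enum EntryState {
    NotStarted,
    Completed(Result<Item, Error>),
}

fn complete(rule: &Rule, entry: &mut EntryState, result: Result<Item, Error>) {
    match result {
        Ok(item) => *entry = EntryState::Completed(Ok(item)),
        Err(e) if rule.memoize_failures => *entry = EntryState::Completed(Err(e)),
        // Flaky rule: drop the failure and let the next request re-run it.
        Err(_) => *entry = EntryState::NotStarted,
    }
}

fn main() {
    let flaky = Rule { memoize_failures: false };
    let strict = Rule { memoize_failures: true };
    let mut a = EntryState::NotStarted;
    let mut b = EntryState::NotStarted;
    complete(&flaky, &mut a, Err("network blip".to_string()));
    complete(&strict, &mut b, Err("bad @rule".to_string()));
    assert!(matches!(a, EntryState::NotStarted));
    assert!(matches!(b, EntryState::Completed(Err(_))));
}
```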
h
I think this would really help when developing new rules in user repos, while keeping all core pants rules, and most others, stricter.
h
@witty-crayon-22786 re-pinging since the last time this thread was active was before thanksgiving - do you think an option on `@rule`s to control whether or not thrown exceptions get memoized is a reasonable way to handle flaky rules?
w
That's one potential UX for this case, but I don't think it makes the problem simpler.
Might make it more complex.
I think that I'd personally rather just remove memoization of exceptions entirely, and do something else to discourage non-determinism (some sort of basic symbol banning/inclusion/exclusion list, if we can think of a way to do that)