hundreds-father-404
09/14/2019, 6:20 PM
[...] ExecuteProcessRequest called max_number_attempts. When the ExecuteProcessResult has an exit code != 0, have the engine retry up to n times?
Would we surface any mechanism for retry predicates, i.e. only retry if condition x is met? Or could we keep it simple and retry whenever the return code != 0?
aloof-angle-91616
09/14/2019, 6:25 PM
[...] --loop since we just need pants to manage restarting it, and we're not using that as part of another rule
hundreds-father-404
09/14/2019, 6:26 PM
aloof-angle-91616
09/14/2019, 6:26 PM
[...] @console_rule invocation
aloof-angle-91616
09/14/2019, 6:27 PM
[...] FallibleExecuteProcessResult instead
aloof-angle-91616
09/14/2019, 6:28 PM
aloof-angle-91616
09/14/2019, 6:30 PM
[...] CommandRunner (cc @early-needle-54791), although i don't know if that's the right level of abstraction to do it at
aloof-angle-91616
09/14/2019, 6:30 PM
aloof-angle-91616
09/14/2019, 6:32 PM
aloof-angle-91616
09/14/2019, 6:33 PM
aloof-angle-91616
09/14/2019, 6:33 PM
hundreds-father-404
09/14/2019, 6:43 PM
[...] FallibleExecuteProcessResult still cache the failure? That’s the main issue we need to work around. Almost home and can check
aloof-angle-91616
09/14/2019, 6:43 PM
aloof-angle-91616
09/14/2019, 6:44 PM
hundreds-father-404
09/14/2019, 6:45 PM
[...] ExecuteProcessResult. Were we to stop doing this, then we’d be safe to have flakes because we could restart the CI shard and get different results. The issue now is that restarting the shard doesn’t change anything due to the cache
aloof-angle-91616
09/14/2019, 6:45 PM
aloof-angle-91616
09/14/2019, 6:45 PM
aloof-angle-91616
09/14/2019, 6:45 PM
aloof-angle-91616
09/14/2019, 6:45 PM
aloof-angle-91616
09/14/2019, 6:46 PM
aloof-angle-91616
09/14/2019, 6:46 PM
aloof-angle-91616
09/14/2019, 6:46 PM
aloof-angle-91616
09/14/2019, 6:47 PM
hundreds-father-404
09/14/2019, 6:47 PM
"additionally, in the persistent process execution cache, we persist failed executions"
Yes, this is the cause of the major problem breaking CI. The solutions are to either stop caching failures or to implement a formal retry mechanism as described in this thread’s first post.
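To make the difference between those two options concrete, here is a minimal toy model of a process cache, written as a sketch only. `ToyProcessCache`, `make_flaky` and the `cache_failures` flag are hypothetical names invented for illustration and are not part of the Pants API. It shows why rerunning a flaky process changes nothing while failures are persisted, and why a rerun can recover once they are not.

```python
from typing import Callable, Dict, Iterable, Mapping, Sequence, Tuple

# Hypothetical stand-ins for illustration only; this is not the Pants API.
Key = Tuple[Tuple[str, ...], Tuple[Tuple[str, str], ...]]


class ToyProcessCache:
    """Maps (argv, env) to an exit code, optionally skipping failures."""

    def __init__(self, cache_failures: bool) -> None:
        self.cache_failures = cache_failures
        self._store: Dict[Key, int] = {}

    def run(
        self,
        argv: Sequence[str],
        env: Mapping[str, str],
        execute: Callable[[], int],
    ) -> int:
        key = (tuple(argv), tuple(sorted(env.items())))
        if key in self._store:
            # Cache hit: the process is not executed again.
            return self._store[key]
        exit_code = execute()
        # Successes are always stored; failures only if configured to be.
        if exit_code == 0 or self.cache_failures:
            self._store[key] = exit_code
        return exit_code


def make_flaky(exit_codes: Iterable[int]) -> Callable[[], int]:
    """Return a callable that yields the given exit codes, one per call."""
    codes = iter(exit_codes)
    return lambda: next(codes)


# A process that fails on its first run and would pass on its second.
persisting = ToyProcessCache(cache_failures=True)
flaky_a = make_flaky([1, 0])
first = persisting.run(["pytest"], {}, flaky_a)
second = persisting.run(["pytest"], {}, flaky_a)  # served from the cache

skipping = ToyProcessCache(cache_failures=False)
flaky_b = make_flaky([1, 0])
third = skipping.run(["pytest"], {}, flaky_b)
fourth = skipping.run(["pytest"], {}, flaky_b)  # actually executed again
```

With failures persisted, the second call returns the stored failure without executing anything, which mirrors a restarted CI shard getting the same result. With failures skipped, the second call executes the process again and can pass.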
aloof-angle-91616
09/14/2019, 6:47 PM
aloof-angle-91616
09/14/2019, 6:50 PM
@rule(A, [...])
def f(...):
  def make_uncached_exe_req(index):
    # A distinct env value per attempt makes each request differ,
    # so a cached failure from an earlier attempt is not reused.
    return ExecuteProcessRequest(
      argv=(...),
      env={'_CACHE_KEY_ENV_VAR': str(index)},
      ...)
  for i in range(0, 10):
    res = yield Get(FallibleExecuteProcessResult, ExecuteProcessRequest, make_uncached_exe_req(i))
    # do something with `res`...
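The per-attempt cache key above could be combined with the max_number_attempts idea and the optional retry predicate from the first post. The following is a self-contained sketch under stated assumptions, not engine code. `Request`, `Result`, `with_attempt_key`, `run_with_retries`, `should_retry` and `fake_execute` are all hypothetical names, and `execute` stands in for whatever actually runs the process.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Request(NamedTuple):
    """Hypothetical stand-in for a process request (not the Pants type)."""
    argv: Tuple[str, ...]
    env: Tuple[Tuple[str, str], ...]


class Result(NamedTuple):
    """Hypothetical stand-in for a fallible process result."""
    exit_code: int
    stdout: str


def with_attempt_key(request: Request, attempt: int) -> Request:
    """Copy the request with the attempt index in its env, so that every
    attempt is a distinct request from a cache's point of view."""
    env = dict(request.env)
    env["_CACHE_KEY_ENV_VAR"] = str(attempt)
    return Request(request.argv, tuple(sorted(env.items())))


def run_with_retries(
    request: Request,
    execute: Callable[[Request], Result],
    max_number_attempts: int,
    should_retry: Optional[Callable[[Result], bool]] = None,
) -> Result:
    """Run up to max_number_attempts times and return the last result.

    By default any non-zero exit code triggers a retry; pass should_retry
    to retry only when some condition on the result holds.
    """
    if max_number_attempts < 1:
        raise ValueError("max_number_attempts must be at least 1")
    if should_retry is None:
        should_retry = lambda result: result.exit_code != 0
    for attempt in range(max_number_attempts):
        result = execute(with_attempt_key(request, attempt))
        if not should_retry(result):
            break
    return result


# Usage: a fake executor that fails twice and passes on its third call.
attempt_keys: List[str] = []


def fake_execute(request: Request) -> Result:
    attempt_keys.append(dict(request.env)["_CACHE_KEY_ENV_VAR"])
    return Result(exit_code=0 if len(attempt_keys) == 3 else 1, stdout="")


result = run_with_retries(
    Request(("pytest",), ()), fake_execute, max_number_attempts=5
)
```

Keeping the predicate optional matches the "keep it simple" variant by default, while still leaving room for conditions such as retrying only when the output matches a known flaky pattern.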
aloof-angle-91616
09/14/2019, 6:51 PM