
aloof-angle-91616

02/04/2019, 7:02 PM
dark horse here is also local process execution caching -- if we know a process execution is splittable we can cache it at a finer granularity

average-vr-56795

02/04/2019, 7:03 PM
Except it also makes caching harder to reliably do…
It’s non-obvious that `foo x y` can be cached as `foo x` and `foo y`, and vice versa
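A minimal sketch of the split being debated, assuming a tool that really is independent per input; the in-memory cache and helper names below are made up for illustration and are not Pants APIs:
```python
import subprocess
from typing import Dict, List, Sequence, Tuple

# Hypothetical local cache, keyed by the exact argv of each sub-invocation.
_local_cache: Dict[Tuple[str, ...], bytes] = {}


def _run_cached(argv: Tuple[str, ...]) -> bytes:
    """Run one sub-invocation, caching its stdout by its full argv."""
    if argv not in _local_cache:
        _local_cache[argv] = subprocess.run(argv, check=True, capture_output=True).stdout
    return _local_cache[argv]


def run_split(tool: str, inputs: Sequence[str]) -> List[bytes]:
    """Instead of caching `tool x y` as one unit, cache `tool x` and `tool y`.

    Only sound if the tool is truly independent per input, which is exactly
    the assumption being questioned in this thread.
    """
    return [_run_cached((tool, inp)) for inp in inputs]
```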

aloof-angle-91616

02/04/2019, 7:04 PM
hence formalizing "splittable"
but like that's exactly what i was thinking

average-vr-56795

02/04/2019, 7:04 PM
That’s putting an awful lot of trust in rule-authors to get their caching semantics correct…

witty-crayon-22786

02/04/2019, 7:04 PM
it can certainly be done manually... not sure it's worth making generic
but i think it might be orthogonal to local caching

aloof-angle-91616

02/04/2019, 7:05 PM
no, it is not trusting anyone: it is making an interface for executing processes which allows us to uphold these guarantees

average-vr-56795

02/04/2019, 7:05 PM
Unless done super carefully, it completely breaks caching…

aloof-angle-91616

02/04/2019, 7:06 PM
users would not use the `SplittablyCacheableExecuteProcessRequest` unless this is the logic that applies to their use case
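For concreteness, a sketch of what an interface like that might look like; the class name comes from the message above, but the fields and the `split()` method are guesses, and the stripped-down request type here is a stand-in rather than the real Pants `ExecuteProcessRequest`:
```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExecuteProcessRequest:
    """Stripped-down stand-in for the real Pants request type."""
    argv: Tuple[str, ...]
    input_files: Tuple[str, ...]


@dataclass(frozen=True)
class SplittablyCacheableExecuteProcessRequest:
    """Opt-in wrapper: the rule author asserts the process is independent
    per input file, so each split can be executed and cached separately."""
    base_argv: Tuple[str, ...]
    input_files: Tuple[str, ...]

    def split(self) -> Tuple[ExecuteProcessRequest, ...]:
        # One sub-request per input file, in sorted order so the same
        # request always yields the same sub-requests (and cache keys).
        return tuple(
            ExecuteProcessRequest(argv=self.base_argv + (f,), input_files=(f,))
            for f in sorted(self.input_files)
        )
```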

witty-crayon-22786

02/04/2019, 7:06 PM
@average-vr-56795: as long as you partition deterministically, it shouldn't break caching
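One reading of "partition deterministically", sketched with an assumed batch size; the point is just that the same input set must always produce the same batches, so the cache keys derived from them are stable across runs and machines:
```python
from typing import List, Sequence, Tuple


def partition(inputs: Sequence[str], batch_size: int = 8) -> List[Tuple[str, ...]]:
    """Deterministic partition: sort, then chunk into fixed-size batches."""
    ordered = sorted(inputs)
    return [tuple(ordered[i:i + batch_size]) for i in range(0, len(ordered), batch_size)]
```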

average-vr-56795

02/04/2019, 7:06 PM
What if `x` and `y` aren’t truly independent? Or what if you incorrectly decompose the output tree so that running `x` is actually necessary to produce the output of running `y`, and in testing they happen to run together (or in the right order, or whatever)?

aloof-angle-91616

02/04/2019, 7:06 PM
partitioning deterministically shouldn't be required

average-vr-56795

02/04/2019, 7:06 PM
@witty-crayon-22786 Sure, but @aloof-angle-91616 is talking about dynamically partitioning…

witty-crayon-22786

02/04/2019, 7:07 PM
but @aloof-angle-91616: splitting will primarily be beneficial when the overhead of each new task is low... so without nailgun/native-image, it would likely increase overhead

aloof-angle-91616

02/04/2019, 7:07 PM
well then it would break and the rule author would realize their mistake?

average-vr-56795

02/04/2019, 7:07 PM
It may only break a year later, when someone runs on a sufficiently fast or slow or loaded or whatever machine

aloof-angle-91616

02/04/2019, 7:07 PM
@witty-crayon-22786 hence the graal PR, which works and, as noted here, is a great first step towards doing this: https://pantsbuild.slack.com/archives/C0D7TNJHL/p1549269379018000

witty-crayon-22786

02/04/2019, 7:08 PM
@aloof-angle-91616: right. i'm just referring to priorities

aloof-angle-91616

02/04/2019, 7:08 PM
i think the priority should be to make graal native-image work with zinc but nobody listens to me
@average-vr-56795 i don't understand why we assume people would use this experimental, very explicit interface for splitting their process executions on process executions that don't work like that
"it may only break a year later" is scary and is true of a lot of things we work on

average-vr-56795

02/04/2019, 7:10 PM
Because people don’t fully understand the entire world, and I’m trying to guide people in ways that are hard to make mistakes with…

aloof-angle-91616

02/04/2019, 7:14 PM
i don't see a set of circumstances that would ever allow us to introduce something that would make scalafmt go fast, given the argument you are applying, and i would like to make scalafmt go fast (so perhaps we should recall that scalafmt is exactly the kind of thing that can be parallelized and cached per-file, recalling my message to @witty-crayon-22786 above). if your concern is that it is too general, then we don't have to rush to shove it into `NailgunTask`. this is already going to be a separate `self.runjava_split()` method that nobody is using yet, and i find it hard to see how it's easy to make mistakes here
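A sketch of how a per-file scalafmt split might look; `NailgunTask` and `self.runjava_split()` are the names from this thread, but the signatures and the base-class stub below are guesses for illustration, not the actual implementation:
```python
from typing import Iterable, List, Sequence


class NailgunTask:
    """Minimal stand-in for the real JVM task base class."""

    def runjava(self, classpath: Sequence[str], main: str, args: Sequence[str]) -> int:
        raise NotImplementedError


class ScalaFmt(NailgunTask):
    """Split one big scalafmt run into per-file runs, so each file's
    invocation can be cached (and skipped) independently."""

    MAIN = "org.scalafmt.cli.Cli"

    def runjava_split(
        self, classpath: Sequence[str], main: str, args_per_split: Iterable[Sequence[str]]
    ) -> List[int]:
        # Guessed signature: each element of args_per_split becomes its own
        # independently cacheable invocation of the same main class.
        return [self.runjava(classpath, main, list(args)) for args in args_per_split]

    def format_sources(self, classpath: Sequence[str], sources: Iterable[str]) -> None:
        self.runjava_split(classpath, self.MAIN, ([src] for src in sorted(sources)))
```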

average-vr-56795

02/04/2019, 7:16 PM
I'm certainly happy to play with it in the context of something we know works well, but I think we should do that special-cased in scalafmt, and think very, very carefully about how, if at all, we generalise it

aloof-angle-91616

02/04/2019, 7:20 PM
ok, this was my approach in the graal PR anyway. i do not have the experience to tell me why this care is necessary. as i said when i introduced this, i am thinking of this "dynamic scheduling" (not referring to racing) specifically as the way we can start mixing and matching execution strategies (and more easily benchmarking them) to maximize rsc compilation speed, because there's no clear winner right now. so thinking very carefully is fine, but i was interpreting that as saying that enabling this sort of execution at all is scary and bad, which was confusing
i have absolutely no desire to provide an API right now for anyone else to make #fast #fun and #freely cacheable/splittable tasks beyond myself writing those tasks (of which rsc compile is the specific use case)

average-vr-56795

02/04/2019, 7:23 PM
Awesome 🙂 Sounds like we’re well aligned!