# development
a
dark horse here is also local process execution caching -- if we know a process execution is splittable we can cache it at a finer granularity
a
Except it also makes caching harder to reliably do…
It’s non-obvious that `foo x y` can be cached as `foo x` and `foo y`, and vice versa
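A minimal sketch of what caching at that finer granularity could look like, assuming the tool really does treat each input independently. The names `run_split_cached`, `fake_foo`, and the cache layout are hypothetical and are not part of any Pants API.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical cache: (tool, input name, content digest) -> output for that input.
Cache = Dict[Tuple[str, str, str], str]


def digest(content: str) -> str:
    """Content hash used as part of the cache key."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def run_split_cached(
    tool: str,
    inputs: Dict[str, str],
    run_one: Callable[[str, str], str],
    cache: Cache,
) -> Dict[str, str]:
    """Run `tool` once per input, reusing cached per-input results.

    Only valid if the tool's output for one input never depends on the
    other inputs in the same invocation (the "splittable" assumption).
    """
    results = {}
    for name, content in sorted(inputs.items()):
        key = (tool, name, digest(content))
        if key not in cache:
            cache[key] = run_one(name, content)
        results[name] = cache[key]
    return results


# Stand-in for a real process execution; records which inputs actually ran.
calls = []


def fake_foo(name: str, content: str) -> str:
    calls.append(name)
    return content.upper()


cache: Cache = {}
first = run_split_cached("foo", {"x": "a", "y": "b"}, fake_foo, cache)
second = run_split_cached("foo", {"y": "b", "z": "c"}, fake_foo, cache)
```

In this sketch the second call only executes the tool for `z`, because the result for `y` is already cached from the first call. That reuse is exactly what is unsound if `foo x y` and `foo y` can produce different output for `y`.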
a
hence formalizing "splittable"
but like that's exactly what i was thinking
a
That’s putting an awful lot of trust in rule-authors to get their caching semantics correct…
w
it can certainly be done manually... not sure it's worth making generic
but i think it might be orthogonal to local caching
a
no it is not trusting anyone it is making an interface to execute processes which allows us to uphold these guarantees
a
Unless done super carefully, it completely breaks caching…
a
users would not use the `SplittablyCacheableExecuteProcessRequest` unless this is the logic that applies to their use case
w
@average-vr-56795: as long as you partition deterministically, it shouldn't break caching
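A rough sketch of deterministic partitioning, assuming the inputs are file names and a fixed chunk size. The `partition` helper is hypothetical, not an existing Pants function.

```python
from typing import List, Sequence


def partition(inputs: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split inputs into chunks that depend only on the input set, not its order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Sorting and de-duplicating makes the result independent of call order.
    ordered = sorted(set(inputs))
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


a = partition(["b.scala", "a.scala", "c.scala"], 2)
b = partition(["c.scala", "b.scala", "a.scala"], 2)
```

Both calls produce the same chunks, so cache keys derived from a chunk's contents stay stable across runs. One caveat with fixed-size chunks: adding or removing a file shifts every later chunk boundary, so those chunks would miss the cache even though their files are unchanged.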
a
What if `x` and `y` aren’t truly independent? Or what if you incorrectly decompose the output tree so that running `x` is actually necessary to produce the output of running `y`, and in testing they happen to run together (or in the right order, or whatever)?
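One way to probe for this failure mode in a test is to compare a combined run against the merged results of isolated runs. This is only a sketch: `split_matches_combined` and the two example tools are hypothetical, and a passing check shows nothing beyond the inputs actually tried.

```python
from typing import Callable, Dict, Sequence

# Hypothetical runner: executes a tool on some inputs, returns output per input.
Runner = Callable[[Sequence[str]], Dict[str, str]]


def split_matches_combined(run: Runner, inputs: Sequence[str]) -> bool:
    """Return True if running each input alone gives the same outputs as one combined run."""
    combined = run(list(inputs))
    merged: Dict[str, str] = {}
    for item in inputs:
        merged.update(run([item]))
    return merged == combined


def independent_tool(inputs: Sequence[str]) -> Dict[str, str]:
    # Output for each input depends only on that input.
    return {i: i.upper() for i in inputs}


def dependent_tool(inputs: Sequence[str]) -> Dict[str, str]:
    # Output for every input changes depending on whether "x" ran alongside it.
    return {i: f"{i}:{'x' in inputs}" for i in inputs}


independent_ok = split_matches_combined(independent_tool, ["x", "y"])
dependent_ok = split_matches_combined(dependent_tool, ["x", "y"])
```

Here `independent_ok` is true and `dependent_ok` is false, because `y` produces different output when `x` is absent from the invocation.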
a
deterministically shouldn't be required
a
@witty-crayon-22786 Sure, but @aloof-angle-91616 is talking about dynamically partitioning…
w
but @aloof-angle-91616: splitting will primarily be beneficial when the overhead of each new task is low... so without nailgun/native-image, it would likely increase overhead
a
well then it would break and the rule author would realize their mistake?
a
It may only break a year later, when someone runs on a sufficiently fast or slow or loaded or whatever machine
a
@witty-crayon-22786 hence the graal PR which works and as noted here has a great first step towards doing this https://pantsbuild.slack.com/archives/C0D7TNJHL/p1549269379018000
w
@aloof-angle-91616: right. i'm just referring to priorities
a
i think the priority should be to make graal native-image work with zinc but nobody listens to me
@average-vr-56795 i don't understand why we assume people are using this experimental very explicit interface for splitting their process executions for process executions that don't work like that
"it may only break a year later" is scary and is true of a lot of things we work on
a
Because people don’t fully understand the entire world, and I’m trying to guide people in ways that are hard to make mistakes with…
a
i don't see a set of circumstances that would allow us to ever introduce something that would make scalafmt go fast given the argument you are applying, and i would like to make scalafmt go fast (so perhaps we should recall that scalafmt is exactly the kind of thing that can be parallelized and cached per-file, recalling my message to @witty-crayon-22786 above). if your concern is that it is too general then we don't have to rush to shove it into `NailgunTask`. this is already going to be this separate `self.runjava_split()` method that nobody is using yet and i find it hard to see how it's easy to make mistakes here
a
I'm certainly happy to play with it in the context of something we know works well, but I think we should do that special cased in scalafmt, and think very very carefully about how, if at all, we generalise it
a
ok, this was my approach in the graal PR anyway. i do not have the experience to inform me of why this care is necessary. as i said when i introduced this, i am thinking of this "dynamic scheduling" (not referring to racing) as specifically the way we can start mixing and matching execution strategies (and more easily benchmarking them) to maximize rsc compilation speed, because there's not a clear winner right now, so thinking very carefully is fine, but i was interpreting that as saying enabling this sort of execution at all is scary and bad and that was confusing
i have absolutely no desire to provide an API right now for anyone else to make #fast #fun and #freely cacheable/splittable tasks beyond myself writing those tasks (of which rsc compile is the specific use case)
a
Awesome 🙂 Sounds like we’re well aligned!