dark horse here is also local process execution caching if w Pants #development

aloof-angle-91616

02/04/2019, 7:02 PM

dark horse here is also local process execution caching -- if we know a process execution is splittable we can cache it at a finer granularity

average-vr-56795

02/04/2019, 7:03 PM

Except it also makes caching harder to reliably do…

average-vr-56795

02/04/2019, 7:04 PM

It’s non-obvious that `foo x y` can be cached as `foo x` and `foo y`, and vice versa
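A minimal sketch of what caching `foo x y` as `foo x` and `foo y` could look like, assuming each input can be processed on its own. Everything here is hypothetical and not Pants API: `run_split`, the in-memory `cache` dict, and the stand-in tool `fake_foo` are illustration only.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical in-memory cache: (tool name, input digest) -> output.
Cache = Dict[Tuple[str, str], str]


def digest(content: str) -> str:
    """Content hash used as the per-input cache key."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def run_split(
    tool: str,
    run_one: Callable[[str], str],
    inputs: Iterable[str],
    cache: Cache,
) -> Dict[str, str]:
    """Run `tool` once per input, reusing cached results where possible.

    This is only correct if run_one(x) depends on nothing but x.
    """
    results: Dict[str, str] = {}
    for content in inputs:
        key = (tool, digest(content))
        if key not in cache:
            cache[key] = run_one(content)
        results[content] = cache[key]
    return results


# Stand-in for the real process; records every actual execution.
calls: List[str] = []


def fake_foo(content: str) -> str:
    calls.append(content)
    return content.upper()


cache: Cache = {}
first = run_split("foo", fake_foo, ["x", "y"], cache)
second = run_split("foo", fake_foo, ["y", "z"], cache)
# `y` is served from the cache on the second call, so only x, y, z ever ran.
```

The whole scheme rests on the independence assumption in the docstring. If `foo x` silently reads something that `foo y` writes, the cache will happily store a result that only holds when both ran together.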

aloof-angle-91616

02/04/2019, 7:04 PM

hence formalizing "splittable"

aloof-angle-91616

02/04/2019, 7:04 PM

but like that's exactly what i was thinking

average-vr-56795

02/04/2019, 7:04 PM

That’s putting an awful lot of trust in rule-authors to get their caching semantics correct…

witty-crayon-22786

02/04/2019, 7:04 PM

it can certainly be done manually... not sure it's worth making generic

witty-crayon-22786

02/04/2019, 7:05 PM

but i think it might be orthogonal to local caching

aloof-angle-91616

02/04/2019, 7:05 PM

no it is not trusting anyone it is making an interface to execute processes which allows us to uphold these guarantees

average-vr-56795

02/04/2019, 7:05 PM

Unless done super carefully, it completely breaks caching…

aloof-angle-91616

02/04/2019, 7:06 PM

users would not use the `SplittablyCacheableExecuteProcessRequest` unless this is the logic that applies to their use case

witty-crayon-22786

02/04/2019, 7:06 PM

@average-vr-56795: as long as you partition deterministically, it shouldn't break caching
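One way to read "partition deterministically", sketched under assumptions: if batches depend only on the set of inputs (not on arrival order, timing, or machine load), the same inputs always produce the same batches and therefore the same cache keys. The names `partition` and `batch_key` and the file names are hypothetical.

```python
import hashlib
from typing import List, Sequence


def partition(inputs: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split inputs into batches that depend only on the set of inputs."""
    ordered = sorted(set(inputs))  # order-independent and de-duplicated
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def batch_key(tool: str, batch: Sequence[str]) -> str:
    """Hypothetical cache key for one batch: hash of tool name and members."""
    h = hashlib.sha256(tool.encode("utf-8"))
    for item in batch:
        h.update(b"\0" + item.encode("utf-8"))
    return h.hexdigest()


# The same files in a different order yield identical batches and keys.
a = partition(["c.scala", "a.scala", "b.scala"], batch_size=2)
b = partition(["b.scala", "c.scala", "a.scala"], batch_size=2)
keys_a = [batch_key("scalafmt", batch) for batch in a]
keys_b = [batch_key("scalafmt", batch) for batch in b]
```

With sorted fixed-size chunks, adding one file can shift every later batch boundary and invalidate those keys; `batch_size=1` avoids that at the cost of one process per file.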

average-vr-56795

02/04/2019, 7:06 PM

What if `foo x` and `foo y` aren’t truly independent? Or what if you incorrectly decompose the output tree so that running `foo x` is actually necessary to produce the output of running `foo y`, and in testing they happen to run together (or in the right order, or whatever)?

aloof-angle-91616

02/04/2019, 7:06 PM

deterministically shouldn't be required

average-vr-56795

02/04/2019, 7:06 PM

@witty-crayon-22786 Sure, but @aloof-angle-91616 is talking about dynamically partitioning…

witty-crayon-22786

02/04/2019, 7:07 PM

but @aloof-angle-91616: splitting will primarily be beneficial when the overhead of each new task is low... so without nailgun/native-image, it would likely increase overhead

aloof-angle-91616

02/04/2019, 7:07 PM

well then it would break and the rule author would realize their mistake?

average-vr-56795

02/04/2019, 7:07 PM

It may only break a year later, when someone runs on a sufficiently fast or slow or loaded or whatever machine

aloof-angle-91616

02/04/2019, 7:07 PM

@witty-crayon-22786 hence the graal PR which works and as noted here has a great first step towards doing this https://pantsbuild.slack.com/archives/C0D7TNJHL/p1549269379018000

witty-crayon-22786

02/04/2019, 7:08 PM

@aloof-angle-91616: right. i'm just referring to priorities

aloof-angle-91616

02/04/2019, 7:08 PM

i think the priority should be to make graal native-image work with zinc but nobody listens to me

aloof-angle-91616

02/04/2019, 7:09 PM

@average-vr-56795 i don't understand why we assume people are using this experimental very explicit interface for splitting their process executions for process executions that don't work like that

aloof-angle-91616

02/04/2019, 7:10 PM

"it may only break a year later" is scary and is true of a lot of things we work on

average-vr-56795

02/04/2019, 7:10 PM

Because people don’t fully understand the entire world, and I’m trying to guide people in ways that are hard to make mistakes with…

aloof-angle-91616

02/04/2019, 7:14 PM

i don't see a set of circumstances that would allow us to ever introduce something that would make scalafmt go fast given the argument you are applying, and i would like to make scalafmt go fast (so perhaps we should recall that scalafmt is exactly the kind of thing that can be parallelized and cached per-file, recalling my message to @witty-crayon-22786 above). if your concern is that it is too general then we don't have to rush to shove it into `NailgunTask`. this is already going to be this separate `self.runjava_split()` method that nobody is using yet and i find it hard to see how it's easy to make mistakes here

average-vr-56795

02/04/2019, 7:16 PM

I'm certainly happy to play with it in the context of something we know works well, but I think we should do that special cased in scalafmt, and think very very carefully about how, if at all, we generalise it

aloof-angle-91616

02/04/2019, 7:20 PM

ok, this was my approach in the graal PR anyway. i do not have the experience to inform me of why this care is necessary. as i said when i introduced this, i am thinking of this "dynamic scheduling" (not referring to racing) as specifically the way we can start mixing and matching execution strategies (and more easily benchmarking them) to maximize rsc compilation speed, because there's not a clear winner right now, so thinking very carefully is fine, but i was interpreting that as saying enabling this sort of execution at all is scary and bad and that was confusing

aloof-angle-91616

02/04/2019, 7:22 PM

i have absolutely no desire to provide an API right now for anyone else to make #fast #fun and #freely cacheable/splittable tasks beyond myself writing those tasks (of which rsc compile is the specific use case)

average-vr-56795

02/04/2019, 7:23 PM

Awesome 🙂 Sounds like we’re well aligned!
