I opened a WIP draft PR for RFC on what I m calling Coalesce Pants #development

bitter-ability-32190

05/25/2022, 3:31 PM

I opened a WIP draft PR for an RFC on (what I'm calling) "Coalesced Process Batching": https://github.com/pantsbuild/pants/pull/15648 Looking for feedback on:
• Naming. Naming is hard 😭
• Overall approach, specifically which data belongs in `SandboxInfo` vs. `CoalescedProcessBatch`
• Feasibility. Right now I'm confident this could work well for formatters and linters, because this really only works on "successful" process runs where it's OK to throw away stdout/stderr.
  ◦ For it to work with, say, dependency inference, we'd have to not throw away stdout/stderr. Or maybe get clever with output files (output the result JSON to a file with a unique name, then collect them)?
• Performance: How badly will this hurt the vanilla code, which now makes several `MultiGet` requests for little SandboxInfos, only to be merged later?
• Thoughts on reducing boilerplate code in the engine between the new type and existing Process

😮 1
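A rough sketch of the "output the result JSON to a file with a unique name, then collect them" idea from the message above, in plain Python. Nothing here is Pants API: `result_name`, `run_batch`, and `collect` are hypothetical helpers, and the "tool" is a stand-in that just lists `import` lines, assuming the real tool could be told where to write per-file results.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def result_name(source_path: str) -> str:
    # Hypothetical naming scheme: a unique, deterministic output file per input path.
    return hashlib.sha256(source_path.encode()).hexdigest()[:16] + ".json"


def run_batch(sources: dict[str, str], out_dir: Path) -> None:
    # Stand-in for one batched process run over many files.
    # Instead of printing to a shared stdout, it writes one JSON file per input.
    for path, text in sources.items():
        imports = [
            line.split()[1]
            for line in text.splitlines()
            if line.startswith("import ")
        ]
        payload = {"path": path, "imports": imports}
        (out_dir / result_name(path)).write_text(json.dumps(payload))


def collect(sources: dict[str, str], out_dir: Path) -> dict[str, list[str]]:
    # After the batch finishes, read each per-file result back by its unique name.
    return {
        path: json.loads((out_dir / result_name(path)).read_text())["imports"]
        for path in sources
    }


sources = {"a.py": "import os\nimport sys\n", "b.py": "x = 1\n"}
with tempfile.TemporaryDirectory() as tmp:
    run_batch(sources, Path(tmp))
    results = collect(sources, Path(tmp))
```

Because each result lives in its own file, a per-file result could in principle be stored separately even though the work ran as one batch; whether that fits the engine is exactly the open question in the RFC.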

wide-midnight-78598

05/25/2022, 3:57 PM

The ultimate goal is to have the cache be populated per-file when running a process for maximum cacheability, but still run processes on batches of files for performance.

How does it work today re: caching? Are they batch cached?

bitter-ability-32190

05/25/2022, 3:58 PM

Right now caching is done 1:1 with the process we run. If we run a process using a digest with 200 files, the cache key is made up of those 200 files, and a single cache entry is inserted.

wide-midnight-78598

05/25/2022, 3:59 PM

Ah, okay, and then if any of those 200 files are invalidated or if dependency inference determines one of those 200 files has been invalidated - all 200 run?

bitter-ability-32190

05/25/2022, 3:59 PM

In a nutshell, yeah.
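To make the exchange above concrete, here is a toy model of batch-keyed vs. per-file-keyed caching. It is an illustration only and does not reflect how Pants actually computes cache keys; `batch_cache_key`, `per_file_cache_keys`, and the `argv` value are made up for the sketch.

```python
import hashlib


def batch_cache_key(argv: tuple[str, ...], files: dict[str, bytes]) -> str:
    # One key covering the command line plus every file in the batch.
    h = hashlib.sha256()
    h.update("\0".join(argv).encode())
    for path in sorted(files):
        h.update(path.encode())
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


def per_file_cache_keys(
    argv: tuple[str, ...], files: dict[str, bytes]
) -> dict[str, str]:
    # One key per file, as if each file had been run in its own process.
    return {
        path: batch_cache_key(argv, {path: content})
        for path, content in files.items()
    }


argv = ("some-linter", "--check")  # hypothetical command line
files = {f"src/f{i}.py": f"x = {i}\n".encode() for i in range(200)}

before_batch = batch_cache_key(argv, files)
before_per_file = per_file_cache_keys(argv, files)

# Edit a single file out of the 200.
edited = dict(files)
edited["src/f7.py"] = b"x = 'changed'\n"

after_batch = batch_cache_key(argv, edited)
after_per_file = per_file_cache_keys(argv, edited)

# Batch key: the single entry no longer matches, so all 200 files rerun.
batch_hit = before_batch == after_batch
# Per-file keys: only the edited file misses.
stale = [p for p in edited if before_per_file[p] != after_per_file[p]]
```

In this model `batch_hit` is false after the one-file edit, while `stale` contains only `src/f7.py`, which is the gap the RFC is trying to close.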

wide-midnight-78598

05/25/2022, 4:00 PM

Okay, that's what I thought - just wanted to confirm. Thanks!

🙌 1

✅ 1

wide-midnight-78598

05/25/2022, 4:01 PM

Would this also eventually support tests?

wide-midnight-78598

05/25/2022, 4:02 PM

And as an add-on to that, would this be affected by sharding?

bitter-ability-32190

05/25/2022, 4:09 PM

I can't ever see tests working with this behavior. The challenge is that when we run the batched process, it will have stdout/stderr for the batch. It's technically infeasible to split that into per-file output. We can live without stdout/stderr for formatters/linters, but for tests it's just too valuable (see `--output=all` on the test goal).

👍 1
