# development
b
@witty-crayon-22786 for batched inference. What's the best way to measure the real-world performance? I know there's concern around cache invalidation when one file gets touched. I'm willing to test it on our work repo.
h
Since we control the output of the inference process, it should be easy to split that output up and cache individually per-file. OTOH, if the hypothesis is that inference is so cheap that almost the entire time per file is process overhead, then over-invalidating may not matter?
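Roughly what I mean, as a sketch (the `run_batched_inference` callable, cache layout, and JSON format are made up, not our actual code):
```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".inference-cache")  # hypothetical on-disk cache location


def file_key(path: Path) -> str:
    # Key cache entries on file contents, so touching one file only
    # invalidates that file's entry.
    return hashlib.sha256(path.read_bytes()).hexdigest()


def cached_results(paths, run_batched_inference):
    """Return per-file inference results, running one batch for cache misses."""
    CACHE_DIR.mkdir(exist_ok=True)
    results, misses = {}, []

    for path in paths:
        entry = CACHE_DIR / file_key(path)
        if entry.exists():
            results[path] = json.loads(entry.read_text())
        else:
            misses.append(path)

    if misses:
        # One batched run for all misses, then split the output back out
        # and cache each file's piece individually.
        for path, result in run_batched_inference(misses).items():
            (CACHE_DIR / file_key(path)).write_text(json.dumps(result))
            results[path] = result

    return results
```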
b
> it should be easy to split that output up and cache individually per-file.
There are two halves to inference, single vs. batch:
• Rule memoization: if we batch it, it'll invalidate the memoization. No easy way around that, but likely very fast (TM)
• Process caching: that's where my other PR of splitting out the coalesced process would help
So, even with the coalesced batching, memoization would still be affected
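Toy illustration of the memoization half (stand-in functions, not our real rule code, just showing how the memo key changes):
```python
from functools import lru_cache


@lru_cache(maxsize=None)
def infer_one(file_contents: str) -> int:
    # Memo key is a single file: editing one file recomputes only that file.
    return len(file_contents)  # stand-in for real rule evaluation


@lru_cache(maxsize=None)
def infer_batch(batch: tuple[str, ...]) -> tuple[int, ...]:
    # Memo key is the whole batch: editing any one file changes the key,
    # so the entire batch is recomputed.
    return tuple(len(contents) for contents in batch)


infer_batch(("a v1", "b v1"))          # computed
infer_batch(("a v2", "b v1"))          # whole batch recomputed

infer_one("a v1"); infer_one("b v1")   # each computed once
infer_one("a v2")                      # only the edited file recomputed
```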
> that inference is so cheap that almost the entire time per file is process overhead, then over-invalidating may not matter
That's precisely my hypothesis, which I'm hungry to test in a real-world scenario so that I can push the PR forward with good findings
w
the most important use cases are probably:
1. disk caching with only a few files changed (the CI case when a cache is in use)
2. cold (the CI case when no cache is in use)
3. memoization with exactly one file changed (desktop iteration)
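A rough way to time cases 1 and 2 against the work repo (sketch only; the command, cache path, and touched file are placeholders, and case 3 needs in-process timing or watch mode rather than a wrapper script):
```python
import shutil
import subprocess
import time
from pathlib import Path

# Placeholders: substitute the real check command, cache dir, and a real source file.
CHECK_CMD = ["our-checker", "check", "."]
CACHE_DIR = Path(".inference-cache")
TOUCH_FILE = Path("src/some_module.py")


def timed_run(label: str) -> None:
    start = time.perf_counter()
    subprocess.run(CHECK_CMD, check=True, capture_output=True)
    print(f"{label}: {time.perf_counter() - start:.2f}s")


# 2. cold: no cache on disk
shutil.rmtree(CACHE_DIR, ignore_errors=True)
timed_run("cold")

# 1. disk cache populated, then a single file touched
timed_run("warm cache, nothing changed")
TOUCH_FILE.write_text(TOUCH_FILE.read_text() + "\n")
timed_run("warm cache, one file changed")
```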