# development
h
Results from memory investigation: https://gist.github.com/Eric-Arellano/defca7d864a9f3939448964beea618d4

tl;dr:
- Found results by inspecting `ExternContext._handles()` after all console rules are finished.
- `./pants list` scales so well because its memory usage is O(1).
- `./pants test` is O(t + t*e + b), where `t` is # of targets, `e` is # of `env` values used in the rule, and `b` is # of files in the transitive closure.
- We materialize the build files for each package, with each file about 1000-5000 bytes.
- Constructing env vars is O(e * t). Here, we construct `PATH` twice per target / rule invoke, resulting in 600 bytes per target.
- Test result stdout is O(t), about 500 bytes per target.
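A back-of-envelope sketch of how those terms add up. The function and its default values are illustrative only (built from the per-target figures above, not from actual Pants code):

```python
# Rough model of ./pants test memory growth, using the per-target
# figures quoted above. Hypothetical helper, not a Pants API.

def estimated_bytes(num_targets: int, files_in_closure: int,
                    avg_build_file_bytes: int = 3000) -> int:
    env_bytes_per_target = 600      # PATH serialized twice per target
    stdout_bytes_per_target = 500   # TestResult.stdout per target
    return (num_targets * (env_bytes_per_target + stdout_bytes_per_target)
            + files_in_closure * avg_build_file_bytes)

# Memory grows linearly in both targets and materialized files, so a
# large enough run will eventually exhaust RAM.
print(estimated_bytes(num_targets=1000, files_in_closure=5000))  # 16100000
```

Even with these small per-target constants, the linear terms dominate once the target count gets into CI-scale territory.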
a
it sounds like the materialization and process execution are parallelizable though?
h
Yes, but the issue is running out of memory. We're not worried about processor parallelism here. These are the results after all the console rules are run, meaning that running more targets -> more memory used, until eventually the OS runs out of memory when you run with too many targets like `ci.sh` does.
a
ah of course
can we bump the memory limit?
h
> can we bump the memory limit?

Maybe? But I think the far better solution is reducing the space complexity to get as close to constant as possible. If `./pants test` crashes Travis on Pants unit tests, imagine trying to run it on Twitter's entire test suite.
a
sure, but if it doesn't leak and pantsd is alive, then all that memory might actually be pretty useful?
h
> all that memory might actually be pretty useful?

I don't think so. Look at https://gist.github.com/Eric-Arellano/defca7d864a9f3939448964beea618d4. I don't think it's necessary for us to actually materialize the files to Python, nor is it ideal that we keep serializing the env var `PATH` twice for every target. Serializing `TestResult.stdout` is probably fine for now though.
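One way to cut the O(e * t) env-var term down toward O(e) is to construct each env value once and share it across all targets. A minimal sketch of that idea (hypothetical, not the actual Pants implementation):

```python
# Sketch: dedupe env-var construction so each distinct variable is
# serialized once, regardless of how many targets reference it.
# Illustrative only -- not Pants code.
from functools import lru_cache
import os

@lru_cache(maxsize=None)
def get_env_value(name: str) -> str:
    # Built once per distinct env var, then reused by every
    # target / rule invoke that asks for it.
    return os.environ.get(name, "")

# All 1000 "targets" share one string object, so env-var memory is
# O(e) instead of O(e * t).
paths = [get_env_value("PATH") for _ in range(1000)]
assert all(p is paths[0] for p in paths)
```

The same interning idea applies to any value that is identical across rule invokes; only the per-target pieces (like each test's stdout) genuinely need O(t) storage.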
a
oh ok
describing it in order notation kinda confused me. i suppose it's appropriate for understanding scale for a specific use case
forget what i'm saying it makes sense