# development
h
Results from memory investigation: https://gist.github.com/Eric-Arellano/defca7d864a9f3939448964beea618d4

tl;dr:
- Found results by inspecting `ExternContext._handles()` after all console rules are finished.
- `./pants list` scales so well because its memory usage is O(1).
- `./pants test` is O(t + t*e + b), where `t` is # of targets, `e` is # of `env` values used in the rule, and `b` is # of files in the transitive closure.
- We materialize the build files for each package, with each file about 1000-5000 bytes.
- Constructing env vars is O(e * t). Here, we construct `PATH` twice per target / rule invoke, resulting in 600 bytes per target.
- Test result stdout is O(t), about 500 bytes per target.
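A back-of-envelope sketch of how those terms add up. The function and its default values are illustrative only (built from the per-target figures above, not from actual Pants code):

```python
# Rough model of ./pants test memory growth, using the per-target
# figures quoted above. Hypothetical helper, not a Pants API.

def estimated_bytes(num_targets: int, files_in_closure: int,
                    avg_build_file_bytes: int = 3000) -> int:
    env_bytes_per_target = 600      # PATH serialized twice per target
    stdout_bytes_per_target = 500   # TestResult.stdout per target
    return (num_targets * (env_bytes_per_target + stdout_bytes_per_target)
            + files_in_closure * avg_build_file_bytes)

# Memory grows linearly in both targets and materialized files, so a
# large enough run will eventually exhaust RAM.
print(estimated_bytes(num_targets=1000, files_in_closure=5000))  # 16100000
```

Even with these small per-target constants, the linear terms dominate once the target count gets into CI-scale territory.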
a
it sounds like the materialization and process execution are parallelizable though?
h
Yes, but the issue is running out of memory. We're not worried about processor parallelism here. These are the results after all the console rules are run, meaning that running more targets -> more memory used, until eventually the OS runs out of memory when you run with too many targets like `ci.sh` does.
a
ah of course
can we bump the memory limit?
h
> can we bump the memory limit?

Maybe? But I think the far better solution is reducing the space complexity to get as close to constant as possible. If `./pants test` crashes Travis on Pants unit tests, imagine trying to run it on Twitter's entire test suite.
a
sure, but if it doesn't leak and pantsd is alive, then all that memory might actually be pretty useful?
h
> all that memory might actually be pretty useful?

I don't think so. Look at https://gist.github.com/Eric-Arellano/defca7d864a9f3939448964beea618d4. I don't think it's necessary for us to actually materialize the files to Python, nor is it ideal that we keep serializing the env var `PATH` twice for every target. Serializing `TestResult.stdout` is probably fine for now though.
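One way to cut the O(e * t) env-var term down toward O(e) is to construct each env value once and share it across all targets. A minimal sketch of that idea (hypothetical, not the actual Pants implementation):

```python
# Sketch: dedupe env-var construction so each distinct variable is
# serialized once, regardless of how many targets reference it.
# Illustrative only -- not Pants code.
from functools import lru_cache
import os

@lru_cache(maxsize=None)
def get_env_value(name: str) -> str:
    # Built once per distinct env var, then reused by every
    # target / rule invoke that asks for it.
    return os.environ.get(name, "")

# All 1000 "targets" share one string object, so env-var memory is
# O(e) instead of O(e * t).
paths = [get_env_value("PATH") for _ in range(1000)]
assert all(p is paths[0] for p in paths)
```

The same interning idea applies to any value that is identical across rule invokes; only the per-target pieces (like each test's stdout) genuinely need O(t) storage.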
a
oh ok
describing it in order notation kinda confused me. i suppose it's appropriate for understanding scale for a specific use case
forget what i'm saying it makes sense