salmon-barista-63163
12/16/2020, 8:19 PM
pants run commands from within a single pants run process without issue. With the v2 engine we are CPU and memory bound, and the native OS will start killing random processes (as expected) when we reach the ceiling.
A little background on how this server works:
application server: main application process running with ./pants run
child server: Receives a request to start a "task" or series of "tasks" = multiple ./pants run sub-processes executed in parallel. These subprocesses then exit gracefully after they are completed.
The way this works, we can receive sub-process commands to execute dozens of pants run “task” commands at once.
Some things I have tried that had no effect on performance:
1. pantsd on or off
2. Concurrency limits set in global pants options
I am looking for some guidance here, as we have tried everything that comes to mind short of changing how this application runs. (That is out of scope at this time, as the main concern here is getting to pants 2.0.0.) We are currently on pants 1.30.
witty-crayon-22786
12/16/2020, 8:22 PM
witty-crayon-22786
12/16/2020, 8:22 PM
salmon-barista-63163
12/16/2020, 8:23 PM
witty-crayon-22786
12/16/2020, 8:23 PM
witty-crayon-22786
12/16/2020, 8:24 PM
witty-crayon-22786
12/16/2020, 8:25 PM
./pants run should be roughly equivalent to ./pants package && dist/app.pex … but with the latter, Pants exits before the application runs, which would free resources sooner.
witty-crayon-22786
12/16/2020, 8:26 PM
2.2.x
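The package-then-run suggestion above can be sketched in Python as follows. This is only an illustration: the helper name build_then_run is hypothetical, and the two commands are passed in by the caller. The point it shows is that the build process has already exited by the time the application starts.

```python
import subprocess


def build_then_run(build_cmd, run_cmd):
    """Run the build step to completion, then start the application.

    Hypothetical helper: the build process (Pants, in this thread) has
    exited before the application starts, so it no longer holds CPU or
    memory while the application is running.
    """
    # check=True raises CalledProcessError if the build step fails,
    # so the application is never started from a failed build.
    subprocess.run(build_cmd, check=True)
    # Run the application and hand back its exit code.
    return subprocess.run(run_cmd).returncode
```

For the commands in this thread, a call might look like build_then_run(["./pants", "package", "::"], ["dist/app.pex"]); the target spec and the PEX path are assumptions that depend on the repository.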
salmon-barista-63163
12/16/2020, 8:32 PM
witty-crayon-22786
12/16/2020, 8:35 PM
"Is there a difference in the package goal vs binary other than the syntax?"
no, different name for the same thing.
"I am curious why we are seeing such a performance degradation with v2 engine vs v1 when executing this same flow."
v1 was not parallel at all… so adding parallelism around it was probably reasonable. v2 is parallel, so you should be able to run ./pants package :: (all the things) to build all binaries in parallel, but wrapping more parallelism around it is going to over-saturate.
witty-crayon-22786
12/16/2020, 8:36 PM
--process-execution-local-parallelism only controls the parallelism of processes that we fork: it doesn’t control the number of threads that we’ll use to run @rules
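Since the outer layer of parallelism is what over-saturates the machine, one way to experiment is to cap how many child commands the server launches at once. The sketch below is not from the thread: run_bounded and max_parallel are hypothetical names, and it only limits the number of concurrent child processes. It does not change how many threads Pants itself uses.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_bounded(commands, max_parallel):
    """Run each command as a subprocess, at most max_parallel at a time.

    Hypothetical helper. Returns the exit codes in the same order as
    the commands were given.
    """
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    # The worker threads only wait on child processes, so a thread pool
    # is enough to cap how many children exist at any moment.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))
```

A reasonable starting value for max_parallel is an assumption to tune per host, for example a small fraction of the core count when each child is itself parallel.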
witty-crayon-22786
12/16/2020, 8:38 PM
witty-crayon-22786
12/16/2020, 8:39 PM
salmon-barista-63163
12/16/2020, 8:41 PM
@rules may help here. CPU is our main concern at the moment, as it's the reason our host OS is shooting PIDs.
The way this server is architected, it acts like a giant gateway, per se. It waits for a “task server” to be kicked off and then launches a bunch of mini servers that mock out the flow of an AWS Step Function in our local and CI environments. We know we could optimize on our end, but at this time our goal is just to get to pants 2.
salmon-barista-63163
12/16/2020, 8:41 PM
witty-crayon-22786
12/16/2020, 8:41 PM
"Do you have a timeline on this so we could test it out?"
in the next month, most likely? if it’s your only blocking issue, we could look into bumping it up a bit.
salmon-barista-63163
12/16/2020, 8:42 PM
witty-crayon-22786
12/16/2020, 8:43 PM
witty-crayon-22786
12/16/2020, 8:44 PM
"CPU is our main concern at the moment as it's the reason our host OS is shooting PIDs."
interesting. are you sure…? i didn’t realize that that was a thing!
witty-crayon-22786
12/16/2020, 8:46 PM
witty-crayon-22786
12/16/2020, 8:46 PM
enough-analyst-54434
12/16/2020, 8:47 PM
salmon-barista-63163
12/16/2020, 8:49 PM
top on our Linux boxes watching this application run while our tests ran. We observed the pants processes taking CPU to the ceiling, while memory was pretty contained. I could verify again to be 1000% sure.
salmon-barista-63163
12/16/2020, 8:49 PM
witty-crayon-22786
12/16/2020, 8:58 PM
salmon-barista-63163
12/16/2020, 9:00 PM
witty-crayon-22786
12/16/2020, 10:42 PM
salmon-barista-63163
12/17/2020, 12:02 AM
witty-crayon-22786
12/17/2020, 12:07 AM
witty-crayon-22786
12/17/2020, 12:08 AM
salmon-barista-63163
12/17/2020, 12:12 AM
salmon-barista-63163
12/17/2020, 12:12 AM
witty-crayon-22786
12/17/2020, 12:12 AM
salmon-barista-63163
12/17/2020, 12:12 AM
witty-crayon-22786
12/17/2020, 12:12 AM
salmon-barista-63163
12/17/2020, 12:13 AM
witty-crayon-22786
12/17/2020, 12:13 AM
witty-crayon-22786
12/17/2020, 12:37 AM
salmon-barista-63163
12/17/2020, 12:49 AM
witty-crayon-22786
12/17/2020, 12:50 AM
witty-crayon-22786
12/17/2020, 12:55 AM
pants script to try out the pre-release build: https://github.com/pantsbuild/setup/blob/69351495867bd555a76b4a523f816b66acb8506f/pants#L15-L18
witty-crayon-22786
12/17/2020, 1:07 AM
PANTS_SHA=372942e0f30763fff8b11a8a344fb6c36a6b23a0 (it’s branched from 2.0.1rc4)
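For a server that launches Pants from Python, the pre-release SHA could be passed through the child environment. This is a minimal sketch, assuming the pants bootstrap script reads PANTS_SHA from the environment as the linked lines suggest; env_with_pants_sha and run_pants are hypothetical helper names.

```python
import os
import subprocess

# SHA quoted in the message above.
PRE_RELEASE_SHA = "372942e0f30763fff8b11a8a344fb6c36a6b23a0"


def env_with_pants_sha(sha, base_env=None):
    """Return a copy of the environment with PANTS_SHA set.

    The caller's environment (or os.environ) is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PANTS_SHA"] = sha
    return env


def run_pants(args, sha=PRE_RELEASE_SHA):
    """Invoke ./pants with PANTS_SHA set and return its exit code.

    Assumes the current directory contains the ./pants script.
    """
    cmd = ["./pants", *args]
    return subprocess.run(cmd, env=env_with_pants_sha(sha)).returncode
```

Calling run_pants(["--version"]) from the repository root would be a quick way to check which build was picked up.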
witty-crayon-22786
12/17/2020, 1:08 AM
witty-crayon-22786
12/17/2020, 3:12 AM
salmon-barista-63163
12/17/2020, 3:58 PM
salmon-barista-63163
12/17/2020, 3:59 PM
enough-analyst-54434
12/17/2020, 4:21 PM
witty-crayon-22786
12/17/2020, 4:22 PM
witty-crayon-22786
12/17/2020, 6:14 PM
"It looks like if you set it low enough you can deadlock... so would probably not go below 2."
i’m adding a defense against this to the final patch.
salmon-barista-63163
12/17/2020, 8:37 PM
salmon-barista-63163
12/17/2020, 8:38 PM
enough-analyst-54434
12/17/2020, 9:05 PM
salmon-barista-63163
12/17/2020, 9:26 PM
witty-crayon-22786
12/17/2020, 9:35 PM
salmon-barista-63163
12/17/2020, 10:38 PM
enough-analyst-54434
12/17/2020, 10:42 PM
"it was CPU bound. OOM killer was not killing the processes."
Huh, ok - thanks. At some point I'm sure we'll need to learn what subsystem or configuration implements this CPUKiller - we're bound to see it again.
witty-crayon-22786
12/17/2020, 11:23 PM
salmon-barista-63163
12/17/2020, 11:23 PM
salmon-barista-63163
12/17/2020, 11:24 PM