# general
e
OK, just some jolly benchmarks from moving from "serially building all docker containers with a build script" to "parallel builds using pants": we went from 28.8 minutes (1730 sec) to 611.88 seconds, roughly a 2.8x improvement (and there's one spectacularly long-building container that's responsible for the long tail; most were built much faster, so if it weren't for that laggard this would look even better). And that's not even counting the impressive dependency checking, etc., that comes with a proper build system. Very pleased with this; going to be working on moving it to production. Thanks for the fantastic support! (Rough sketch of the Pants setup below.)
💯 4
❤️ 5
🙌 4
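For anyone curious about the mechanics, here's a minimal sketch of the kind of Pants setup being described, assuming the Docker backend (`pants.backend.docker`) is enabled in `pants.toml`; the target name and layout are illustrative, not the actual repo.

```python
# BUILD file in a service directory -- a minimal sketch; the target name is
# hypothetical. Assumes `pants.backend.docker` is listed under
# backend_packages in pants.toml.
docker_image(
    name="service-image",
    source="Dockerfile",
)

# Packaging everything at once lets Pants build the images in parallel and
# skip any image whose inputs haven't changed:
#
#   ./pants package ::
```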
h
Amazing! Thanks for sharing and all the bug reports / feedback along the way! 😄
1
r
Would you mind telling us a bit more about what your codebase looks like? Why are you building so many Docker images? Is it some microservice architecture?
e
It is, although some of them are more "macroservices" because Python dependencies can get a little heavyweight with data science and math packages (numpy and scipy are bad enough, but if a developer tries to include torch you're looking at gigabyte-plus packages). So we have an API gateway that drops tasks into a distributed task queue (celery), a "dispatcher" that handles some serialization and hands tasks off to the right workers, and then some heavyweight workers to actually handle the tasks. Then there is a scheduler that just injects periodic cleanup tasks into the system, a "janitor" that handles some of the cleanup tasks when they're requested, a web service front-end, an admin front-end, a little websocket dingus that handles small bits of plumbing... and by far most of the changes are in the 2-3 heavyweight workers, so our deployment system has been working too hard...
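To make the shape of that concrete, here's a minimal Celery sketch of the kind of layout described above; the broker URL, queue names, and task names are illustrative assumptions, not the actual code.

```python
# tasks.py -- a minimal sketch of the layout described above; broker URL,
# queue names, and task names are hypothetical.
from celery import Celery

app = Celery("pipeline", broker="redis://localhost:6379/0")

# Route heavyweight work to its own queue so only the big workers (the ones
# with numpy/scipy/torch baked into their images) consume it.
app.conf.task_routes = {"tasks.heavy_compute": {"queue": "heavy"}}

# Periodic cleanup injected by the scheduler (celery beat), once a day here.
app.conf.beat_schedule = {
    "daily-cleanup": {"task": "tasks.cleanup", "schedule": 24 * 60 * 60},
}

@app.task
def heavy_compute(payload):
    """Handled by the heavyweight workers."""
    ...

@app.task
def cleanup():
    """Handled by the 'janitor' workers."""
    ...
```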