When we started with separate pexes per project, it was OK at first, but we quickly started DDoS-ing our Docker registry with similar-but-not-quite-identical Docker images, because each of them had a different pex with a slightly different set of 3rdparty dependencies. So now we just build one base image with all the 3rdparty dependencies for most of the projects, so they can reuse the Docker cache, even though some of those dependencies aren't used everywhere.
We also have pyspark, which is huge, so we fetch it separately, install it from a tar.gz, and exclude it from the pex build (which is very slow btw). So yeah, all our binaries are huge 😢
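A rough sketch of why the shared base image helps with the registry load. This is a toy model, not our actual build: the project names and pinned versions are made up, and `layer_digest` is a hypothetical stand-in for how Docker identifies a layer by its content.

```python
import hashlib


def layer_digest(deps):
    """Hypothetical stand-in for a Docker layer digest: hash of the sorted dependency set."""
    joined = "\n".join(sorted(deps))
    return hashlib.sha256(joined.encode()).hexdigest()


# Made-up projects and requirements, for illustration only.
projects = {
    "ingest": {"requests==2.31.0", "boto3==1.34.0"},
    "api": {"requests==2.31.0", "flask==3.0.0"},
    "reports": {"requests==2.31.0", "boto3==1.34.0", "jinja2==3.1.3"},
}

# One dependency layer per project: every slightly different set is a new layer to push.
per_project_layers = {layer_digest(deps) for deps in projects.values()}

# One shared base layer holding the union of all 3rdparty dependencies.
all_deps = set().union(*projects.values())
shared_layers = {layer_digest(all_deps)}

print(len(per_project_layers), "distinct layers with per-project deps")
print(len(shared_layers), "distinct layer with a shared base image")
```

The tradeoff is the one mentioned above: the shared layer is bigger than any single project needs, but it is pushed and pulled once instead of once per project.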