How do people manage dependencies not ballooning o...
# general
p
How do people keep dependencies from ballooning out of control and making all binaries huge?
c
In a large repository there usually needs to be some amount of gardening, or boring but beneficial "how are we going to organize files in directories" discussions. You can enforce strict conventions with visibility rules: https://www.pantsbuild.org/blog/2023/04/25/visibility-feature-in-pants-2-16
w
YES! Pants is almost a detriment in some ways here (not the large binaries, but the lack of organization) because it'll just look anywhere, find stuff, and auto-magically grab it. With all of my clients, I've had to remind them that just because Pants CAN do it doesn't mean you SHOULD do it.
b
you could add some safeguards in the form of a unit test or CI check on binary size, similar to code coverage checks. Should help guard against cases where you inadvertently pick up a dependency with a huge transitive impact.
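A minimal sketch of what such a size check could look like, assuming the build drops its artifacts at known paths. The `dist/app.pex` path and the 200 MiB budget are made up for illustration, not anything Pants produces or enforces by default:

```python
from pathlib import Path

# Hypothetical budgets: artifact path -> maximum allowed size in bytes.
BUDGETS = {
    "dist/app.pex": 200 * 1024 * 1024,
}


def check_sizes(budgets):
    """Return one message per artifact that is missing or over its budget."""
    failures = []
    for path, limit in budgets.items():
        artifact = Path(path)
        if not artifact.is_file():
            failures.append(f"{path}: not found")
            continue
        size = artifact.stat().st_size
        if size > limit:
            failures.append(f"{path}: {size} bytes exceeds budget of {limit} bytes")
    return failures


def main(budgets=BUDGETS):
    """Print any failures and return a process exit code (0 = all within budget)."""
    problems = check_sizes(budgets)
    for line in problems:
        print(line)
    return 1 if problems else 0
```

A CI step could run this after the package step and fail the job when `main()` returns non-zero, e.g. by ending the script with `sys.exit(main())`. Bumping a budget then becomes an explicit, reviewable change.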
s
When we started with separate pexes per project, it was ok at first, but then we quickly started ddos-ing our docker registry with similar-but-not-quite docker images, because all of them had a different pex with a slightly different set of 3rdparty dependencies. So we now just build one base image with all 3rdparty dependencies for most of the projects to reuse the docker cache, even though some of the 3rdparty dependencies are not used everywhere. We also have pyspark, which is huge, so we fetch it separately, install it via tar.gz, and exclude it from the pex build (which is very slow btw). So yeah, we have all binaries huge 😢