What do all your CI pipelines look like with pants...
# general
e
What do all your CI pipelines look like with pants? I've currently got it set up to use things like gitlab's parallel matrix to run per-module unit tests in separate jobs, build each docker container in a separate job, etc. This mimics the behavior we had before adopting pants, though it seems a bit silly to incur job overhead and run setup code so many times, given that pants makes it so easy to just have one "unit test job" for everything and one "package job" (`pants package ::`) for everything. My main concern with switching to something like this is that it would dump too much log output together, and it would be difficult to look at e.g. a list of failing jobs and figure out where to investigate. Or that `package ::` would fail on the first error, and a developer would have to fix errors one by one, waiting on CI each time to see the next error. Curious what others think, and would appreciate some insights from practical experience.
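One way to keep a single job while still seeing every failing step in one CI run is a small wrapper that runs each goal, continues after a failure, and prints a summary at the end. This is only a sketch: `run_step` and `summarize` are hypothetical helper names, and the goals in the commented wiring are just the ones mentioned above. It does not change how a single `pants package ::` invocation behaves internally.

```shell
#!/bin/sh
# Run each CI step, keep going after a failure, and report which steps failed.
failed=""

run_step() {
  # $1 is a label for the summary; the remaining arguments are the command.
  label="$1"
  shift
  echo "=== ${label} ==="
  if ! "$@"; then
    failed="${failed} ${label}"
  fi
}

summarize() {
  # Returns non-zero if any step failed, so the CI job is still marked failed.
  if [ -n "${failed}" ]; then
    echo "Failed steps:${failed}"
    return 1
  fi
  echo "All steps passed"
}

# Example wiring (assumed; adjust goals and targets to your repo):
# run_step lint pants lint ::
# run_step test pants test ::
# run_step package pants package ::
# summarize
```

The per-step banner lines also give you something to search for in one large job log.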
s
Are you using `--changed-since`? We have a large number of docker images, but most changes do not touch more than one, and since pants dependencies are so granular, even changes in shared libraries don't tend to build a large number. Our gitlab CI job looks like:
`pants lint --changed-since=$CI_COMMIT_BEFORE_SHA`
`pants test publish --changed-since=$CI_COMMIT_BEFORE_SHA --changed-dependents=transitive`
I.e. lint all files that changed, and test and publish anything transitively affected by the change.
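A caveat with this setup, sketched under assumptions: GitLab reports `CI_COMMIT_BEFORE_SHA` as all zeros when there is no previous commit to compare against (for example, the first pipeline on a new branch), and that value is not a usable ref for `--changed-since`. The helper below, `pick_base_ref`, is a hypothetical name, and `origin/main` is an assumed fallback; neither comes from the job described above.

```shell
#!/bin/sh
# Pick the ref to pass to --changed-since.
# Build the 40-character all-zero SHA GitLab uses as a placeholder.
ZERO_SHA="$(printf '%040d' 0)"

pick_base_ref() {
  before_sha="$1"   # typically $CI_COMMIT_BEFORE_SHA
  fallback="$2"     # assumption: a long-lived branch such as origin/main
  if [ -z "${before_sha}" ] || [ "${before_sha}" = "${ZERO_SHA}" ]; then
    echo "${fallback}"
  else
    echo "${before_sha}"
  fi
}

# Example wiring (assumed):
# BASE="$(pick_base_ref "${CI_COMMIT_BEFORE_SHA:-}" origin/main)"
# pants lint --changed-since="${BASE}"
# pants test publish --changed-since="${BASE}" --changed-dependents=transitive
```

The fallback ref has to exist in the CI checkout, so a shallow clone may need an explicit fetch first.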
e
We haven't been using `--changed-since` (based on the warning in https://www.pantsbuild.org/2.21/docs/using-pants/using-pants-in-ci#approach-1-only-run-over-changed-files and on prior history). Our monorepo isn't THAT big, so linting everything in one go isn't too prohibitive. Some questions I have about `--changed-since` though: how would this affect coverage reporting on tests? It seems you wouldn't be able to get an accurate coverage report this way.
g
This is quite a common question, so I'd search around a bit as well, especially if you have a specific provider (GHA, BK, etc.) in mind. https://github.com/EmbarkStudios/emote/blob/main/.buildkite/pipeline.yml is a copy of what we use internally, with some minor changes: for example, the trigger at the end there is normally a publish step. We run persistent CI runners on GKE, with a local cache per CI runner plus a remote shared cache (~20 CI runners total that read/write). The only place where we use `--changed-since` is when we publish containers. We unconditionally package them, though. You can see I do some manual sharding based on resolve for package, since we parametrize our whole codebase on resolves. For hot caches it doesn't matter much, but on full cache invalidations it does shave a few minutes. Looking at some timings, our CI is <10 minutes E2E including publishing container images (one being 10G ML bullshit), 10-15 minutes with cold local caches, and 15-20 minutes for a full cache invalidation. This is a fairly typical view when I look at CI timings for our main Pants repo.