# general
e
Hello all! My company is exploring using `pants` for our monorepo (we've heard many tales of companies that spent years migrating to Bazel and then years migrating off of it...). Our repo isn't huge, but I'm exploring what a migration would look like and hoping there might be some prior art here. We currently use pip-compile-multi (pip-tools under the hood) to generate requirements.txt files for ~5-10 requirements.in files and enforce one version per third-party library across our repo. Has anyone written up migration guides, or does anyone have tips and tricks for mixing and matching during the migration so we don't need to merge everything in one massive PR?
I actually made a lot of progress, but hit a few snags for incremental migration:
1. pants doesn't seem to work with `-r ../foo/requirements.txt` in requirements files. Supporting this would make migrating from pip-tools easier for sure. For now I can work around it by doing `cat reqs1.txt reqs2.txt > requirements.txt`.
2. Migrating test targets piecemeal. I really want to use `--changed-since` to avoid running every single test target, but I can't go through and get every single test target in the whole repo to pass in one PR. I wanted to do `--changed-since=HEAD subdirectory/with/passing/tests::` but that fails because you can't combine globs with `--changed-since`. I haven't found a workaround here yet. Is there a way perhaps to pipe the output of `pants --changed-since=HEAD list | ...` to apply a filter? I also imagine this is needed functionality when running pants on a large monorepo where you don't want to run all affected tests in the whole repo in every CI/CD flow.
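A minimal sketch of an alternative to the `cat` workaround from item 1, assuming the only include form in use is a line starting with `-r ` or `--requirement ` followed by a path. The function name `flatten_requirements` is hypothetical and not part of pants or pip-tools; it just inlines included files so the result contains no `-r` lines.

```python
from pathlib import Path


def flatten_requirements(path, _seen=None):
    """Return the lines of a requirements file with `-r` includes inlined."""
    seen = _seen if _seen is not None else set()
    path = Path(path).resolve()
    if path in seen:
        # Already inlined once; skipping also protects against include cycles.
        return []
    seen.add(path)

    lines = []
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if line.startswith("-r ") or line.startswith("--requirement "):
            # pip resolves an included path relative to the including file.
            included = line.split(None, 1)[1].strip()
            lines.extend(flatten_requirements(path.parent / included, seen))
        else:
            # Keep every other line (pins, comments, hashes) untouched.
            lines.append(raw)
    return lines
```

Writing `"\n".join(flatten_requirements("requirements.txt"))` to a generated file would give a flat requirements file; other pip include spellings (for example `-rfile.txt` without a space) are not handled here.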
I think `pants --changed-since=origin/dev list | grep '^backend' | xargs pants test` works, but it definitely feels a bit janky. I'd love to be able to use glob syntax.
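The same pipeline can be sketched as a small script, which makes the empty-result case explicit (depending on the `xargs` implementation, the one-liner may still invoke `pants test` when nothing matches). The helper names `changed_addresses`, `select` and `build_test_command` are hypothetical; the only pants invocations assumed are the two already used in the one-liner.

```python
import subprocess


def changed_addresses(since="origin/dev"):
    """Run `pants --changed-since=<since> list` and return one address per line."""
    result = subprocess.run(
        ["pants", f"--changed-since={since}", "list"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


def select(addresses, prefix):
    """Keep only addresses under the given directory prefix (like grep '^prefix')."""
    return [a for a in addresses if a.strip() and a.startswith(prefix)]


def build_test_command(addresses):
    """Return the argv for `pants test`, or None when there is nothing to run."""
    if not addresses:
        return None
    return ["pants", "test", *addresses]
```

A caller would do something like `cmd = build_test_command(select(changed_addresses(), "backend"))` and pass `cmd` to `subprocess.run` only when it is not `None`.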
Alright, my last big remaining snag is figuring out how to get `pants test` to work within a docker container with `--network=none`. We run our tests from within a Dockerfile so we can set up some extra dependencies (postgres, playwright), and `--network=none` guarantees that the tests themselves are hermetic and don't depend on the network. I'm thinking we can probably get the target list during the build stage, but I couldn't find a command to build all the test targets in advance, so that `pants test` doesn't need any network access because the `.pex` targets already exist. `pants package` doesn't seem to work on python_test targets. Any ideas?
b
I think you've solved this part acceptably, but one option to consider for tests might be using the `--filter-...` options (https://www.pantsbuild.org/prerelease/docs/using-pants/advanced-target-selection), e.g.
• apply either positive or negative tags to the relevant tests and select with `--filter-tag-regex=migrated` or deselect them with `--filter-tag-regex=-not_yet_migrated`
• use `--filter-address-regex=subdirectory/with/passing/tests/.*` (or similar) for the dir-based globbing
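For the tag-based option, a minimal BUILD sketch, assuming the standard `tags` field on `python_tests`; the directory and the tag name `migrated` are just the placeholders used above, not anything pants defines.

```python
# subdirectory/with/passing/tests/BUILD (placeholder path from this thread)
python_tests(
    name="tests",
    # Assumption: "migrated" is a tag we add by hand as each directory
    # starts passing under pants; pants attaches no meaning to it.
    tags=["migrated"],
)
```

With that in place, something like `pants --changed-since=origin/dev --filter-tag-regex=migrated test` should restrict the run to changed targets carrying the tag, if the filter options compose with `--changed-since` the way the linked page suggests.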
> I'm thinking we can probably get the target list during the build stage, but I couldn't find a command to build all the test targets in advance, so that `pants test` doesn't need any network access because the `.pex` targets already exist. `pants package` doesn't seem to work on python_test targets. Any ideas?
I think there are two broad options:
• docker container -> pants -> python tests (i.e. run pants inside the container)
• pants -> docker container -> python tests (i.e. have pants run "bare metal" and coordinate a docker container)
The second one is possible with "environments": https://www.pantsbuild.org/prerelease/docs/using-pants/environments ... but it might not expose the required options to disable networking 🤔 so that'd be a feature request/improvement to pants. But, if you're willing to tolerate that hermeticity hole, that might be a path forward for now?
Re building test targets in advance... I'm not sure. I can think of some vague hacks like running `pants test :: -- --some-flag-that-makes-the-actual-execution-really-fast`, so the tests are built and "executed", but that doesn't feel great.
e
Thanks for the response Huon! I'll take a look into environments, but is there a good way to combine --changed-since with building the docker container?
That is, if we were to pursue option #2 to have pants build the docker container where the tests can run hermetically (pending the feature request), how would we build a docker container with only the tests we want to run?
b
(Maybe the flag that'd skip any actual test execution could be pytest's `--collect-only`)
Ah, if you use different images for different tests, one can configure multiple `docker_environment` targets and then have each test carefully specify the appropriate environment. Then, pants would spin up container(s) using the appropriate images when running the tests in question.
(As in, if test A requires image X, and test B requires image Y, a Pants run that needs to execute A would start a container from X, and a Pants run that needs to execute A & B would start two containers, one from X and one from Y.)
I'm not sure if that's answering your question?
e
Okay, I think I just didn't understand environments. I was assuming this had to do with the python/docker integration, but it's quite a bit lower level than that. That looks like it would work, with the downside that we'd seemingly have to define the environment on a test-by-test basis (which functionally means basically all of our tests). I wonder if there's a way to set a default environment, or override it for all targets on a particular `pants test` run?
b
Yeah, you can set a default one and then override for specific tests: https://www.pantsbuild.org/prerelease/docs/using-pants/environments#setting-the-environment-on-many-targets-at-once
There'd be ways to do adhoc/global overrides and/or make the docker environments conditional too, if that's required.
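A rough BUILD sketch of what "set a default and override" might look like, written from memory of the linked page rather than checked against a specific pants version. The environment names, the image, and the file paths are all placeholders, and it assumes the names are also registered in `pants.toml` (under `[environments-preview.names]`, as far as I recall).

```python
# BUILD at the repo root: declare the environments (placeholder names/image).
local_environment(name="local")

docker_environment(
    name="ci_docker",
    image="python:3.11-slim",  # placeholder image, not from this thread
)

# some/project/BUILD: make every python_tests target in this directory
# and below default to the docker environment.
# Assumption: "ci_docker" is the name registered in pants.toml.
__defaults__({python_tests: dict(environment="ci_docker")})

python_tests(name="tests")
```

Individual targets could then still set `environment=` themselves to opt out of the default.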
e
Thank you! Is it possible to accommodate this flow? During local development, our developers run everything in a container that already has all the needed dependencies, so it's faster/easier to default to local execution of tests, but on CI/CD we'd want everything to run on the docker container. Basically, the definition of environment wouldn't be at the target level, but at the execution level
b
Some ideas:
• tee up something using "environment matching": https://www.pantsbuild.org/prerelease/docs/using-pants/environments#environment-matching
• make the environment definition for a given name conditional on being in CI or not, e.g.
if env("CI", "0") == "1":
    docker_environment(name="testing", ...)
else:
    local_environment(name="testing", ...)
(not sure either of them are good or applicable ideas, though 🤔 )
e
Hahaha thanks for the ideas! I think for the immediate future I'm going to just sacrifice the network hermeticity and do docker -> pants -> tests, but I'll play around with these to try to get it back. If we end up going pants -> docker, it might not be the worst to standardize the execution environment of the tests. In theory it should only have to build once and then be cached.
BTW, really good experience for me onboarding onto pants today. I got a sizable portion of our repo over in one day. My biggest challenges were python_requirements not supporting `-r` syntax and the following:
If a single python source needs a dynamic dependency declared explicitly because there's no import to infer it from, you either need to add the dependency to the whole python_sources, or declare a new python_source. No problem, except that python_sources doesn't automatically exclude files already owned by another target. No sweat, just add `sources=["!my_source.py"]` to the python_sources, but then it doesn't include anything, because an explicit `sources` replaces the defaults instead of merging with them. So in the end you need to add `python_sources(sources=["!my_source.py", "*.py", "!test_*.py", "!*_test.py"])`.
But all in all, pretty happy with what I could accomplish in a day (coming from a background in Blaze)
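To illustrate why the lone `"!my_source.py"` matched nothing, here is a simplified model of include/exclude globs. It is not pants's actual implementation, just a sketch assuming that `!` patterns only subtract from whatever the non-`!` patterns matched; `resolve_sources` is a hypothetical name.

```python
from fnmatch import fnmatch


def resolve_sources(files, patterns):
    """Files matching at least one include pattern and no `!` exclude pattern."""
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    return [
        f
        for f in files
        if any(fnmatch(f, p) for p in includes)
        and not any(fnmatch(f, p) for p in excludes)
    ]


files = ["app.py", "my_source.py", "test_app.py", "util_test.py"]

# Only an exclude: there is no include pattern, so nothing can match.
print(resolve_sources(files, ["!my_source.py"]))  # → []

# Includes restated by hand, then the excludes applied on top.
print(resolve_sources(files, ["!my_source.py", "*.py", "!test_*.py", "!*_test.py"]))  # → ['app.py']
```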
b
awesome! glad to hear it. Agree about that `sources` papercut being a pain. You may be interested in `overrides`: `python_sources(..., overrides={"my_source.py": dict(dependencies=[":something_extra"], other_field_too=True, ...)})` will set those fields specifically for `my_source.py` without needing the separate target (https://www.pantsbuild.org/stable/docs/using-pants/key-concepts/targets-and-build-files#target-generation briefly mentions it, and https://www.pantsbuild.org/dev/reference/targets/python_sources#overrides has some more discussion too)