# general
d
Hi everyone, I’m evaluating migrating a monorepo to Pants. So far I’ve managed to get everything building and it’s looking great. A lot of services were written under the assumption that a single DB connection would be kept/shared amongst all the tests for that service (I’m sure this is not uncommon in Python webdev), i.e.
service/a/test/test_a.py
service/a/test/test_b.py 
service/a/test/test_c.py
will re-use the same DB. Since Pants runs each pytest in a separate process, I had to create a unique DB name for each and run any initial migrations N times. This overhead unfortunately makes the tests slower than they were before, even with parallel execution. Is there any mechanism to force pytest to run tests in a single process? Passing `--debug` still appears to run the DB setup/teardown once per test file, whereas ideally I’d like it to run once for the whole suite.
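(For context on why the setup repeats: pytest’s `session` fixture scope is per *process*, so under a one-process-per-file model every file pays the full setup cost. A minimal sketch of the pattern being described — the `create_database`/migration calls are placeholders, not from this thread:)

```python
import uuid
import pytest

def unique_db_name(prefix: str = "test") -> str:
    # Each pytest process gets its own database name, so concurrent
    # processes never collide -- at the cost of re-running migrations
    # once per process.
    return f"{prefix}_{uuid.uuid4().hex[:8]}"

@pytest.fixture(scope="session")
def database():
    # "session" scope means once per pytest *process*. With Pants'
    # default of one process per test file, this runs once per file.
    name = unique_db_name()
    # create_database(name); run_migrations(name)   # placeholders
    yield name
    # drop_database(name)                           # placeholder
```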
w
d
I think this is exactly what I’m looking for ☝️
w
🙂
s
@dazzling-elephant-33766 that scenario is the exact use-case that motivated the “pytest batching” feature added in Pants 2.15: https://www.pantsbuild.org/v2.15/docs/python-test-goal#batching-tests
2.15 is still in RC, but my company has been using it for months without issues, FWIW. There were a lot of plugin API changes that might make upgrading painful, though.
h
PS You can have the best of both worlds by running multiple concurrent batches, and using `execution_slot_var` (https://www.pantsbuild.org/docs/reference-pytest#section-execution-slot-var) to name a database that is unique to the concurrency slot.
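(To make that concrete: per the linked docs, `execution_slot_var` under `[pytest]` names an environment variable that Pants populates with the concurrency slot number. The variable name `PANTS_EXECUTION_SLOT`, the `myapp_test` base name, and the helper below are illustrative, not from this thread — a conftest could derive the DB name like so:)

```python
import os

def db_name_for_slot(base: str, slot: str) -> str:
    # One database per Pants concurrency slot: batches running in
    # slot 0 and slot 1 simultaneously hit different databases, and
    # a later batch reusing slot 0 reuses the already-migrated DB.
    return f"{base}_{slot}"

# Assumes pants.toml sets:  [pytest] execution_slot_var = "PANTS_EXECUTION_SLOT"
slot = os.environ.get("PANTS_EXECUTION_SLOT", "0")
DB_NAME = db_name_for_slot("myapp_test", slot)
```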
p
i upgraded to 2.15 and started using a unique `batch_compatibility_tag` for tests under a directory, but I still have test failures for DB table conflicts when running `./pants test ::`, whereas each test passes when I run it individually like `./pants test file/to/tests/foo.py`. Haven’t tried `execution_slot_var` yet.
I guess it’s not clear to me whether tests running in the same pytest process is good or not. Initially this field is not set for any tests, so according to the docs “they are run in a dedicated `pytest` process”. Does that mean that when I run `./pants test ::`, since they run in dedicated processes, they will be parallelized and result in DB conflicts? So basically the best-case scenario is that I set the same `batch_compatibility_tag` for all my tests, but even then it isn’t guaranteed they run in the same pytest process.
d
A combination of batching + `execution_slot_var` is almost certainly what I’m looking for here, so I can run the DB init/migration setup once per service, but test each service against a different DB. For now I’d just like to get the batching working, and worry about execution slots + different DBs later. I’m on `2.15.0rc1` and I’ve added the following to the `BUILD` file in the same dir as my `tests/conftest.py`, as detailed in https://www.pantsbuild.org/v2.15/docs/python-test-goal#batching-tests
python_test_utils(
    name="test_utils",
)

__defaults__({(python_test, python_tests): dict(batch_compatibility_tag="your-tag-here"),})
Are there additional steps required here? Pants still appears to be running tests in multiple processes, because half the suite is failing (as the db already exists)
psycopg2.errors.DuplicateDatabase: database "blah" already exists
I’m a bit wary of this snippet:
Compatible tests may not end up in the same `pytest` batch if:
• There are “too many” tests with the same `batch_compatibility_tag`, as determined by the `[test].batch_size` setting.
• Compatible tests have some incompatibility in Pants metadata (i.e. different `resolve` or `extra_env_vars`).
Perhaps I’m hitting some sort of batch limit, but ideally I’d like this to be unbounded.
h
@plain-night-51324 that is exactly what execution_slot_var is designed to solve...
By default Pants runs one `pytest` process per test file, and those processes will run concurrently. If you use the test batching feature then each `pytest` process will run multiple test files, but they will still run concurrently.
And usually this is what you want, for performance. `execution_slot_var` is what lets you do that without the tests colliding on database access.
I think you can also set the batch size to be huge enough that in practice you get a single process, but that will harm your performance. You can claw some of it back by using `pytest-xdist` as the concurrency mechanism, but you’d still get no caching.
@dazzling-elephant-33766 So you can set the batch_size arbitrarily high I guess, and get literally a single pytest process for your entire repo? With the caveats above about performance
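(As an aside on the xdist point: if I remember right, 2.15 also added a `[pytest].xdist_enabled` option so Pants passes the parallelism flags for you. Something like the following in `pants.toml` — treat this as a sketch and check the option names against your version’s docs:)

```toml
[pytest]
# Let pytest-xdist parallelize *within* a batch, clawing back some of
# the performance lost to large batches.
xdist_enabled = true

[test]
# Allow more files to share one pytest process.
batch_size = 128
```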
d
Adding this in my `pants.toml` in the root of my repo hasn’t changed the behaviour. I’m still seeing what appears to be separate pytest processes for each test file. I’m invoking the tests with `./pants test services/a/tests:: -- -s` and observing pytest trying to create the same test DB over and over.
[test]
batch_size = 1000
s
I’m traveling most of today but I can help debug tomorrow if you’re still hitting problems. At first glance I don’t see an obvious issue
d
Many thanks for the offer Dan, I’ll continue playing around, no time pressure at all.
So it looks like I needed to put the tag in `service/a/BUILD` instead of `service/a/test/BUILD` (which is where I have my `conftest.py` + `test_abc.py` files). The documentation confused me a tad. My test suite for that service is all passing now.
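(For future readers, the likely explanation: `__defaults__` applies to targets in the declaring `BUILD` file’s directory and everything below it, so declaring it one level up covers the test subdirectory. A BUILD-file sketch — the tag value is illustrative:)

```python
# service/a/BUILD
# Defaults declared here apply to this directory and all subdirectories,
# including the python_tests targets under service/a/test/.
__defaults__(
    {(python_test, python_tests): dict(batch_compatibility_tag="service-a")},
)
```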