Hi all! What determines how many `requirements.pex...
# general
b
Hi all! What determines how many `requirements.pex` and `pytest_runner.pex` files get built during `./pants test ::`? I have defined 9 `python_tests` targets in total in our repository, yet I’m seeing over 26 `pytest_runner.pex` and over 31 `requirements.pex` built (these numbers also change depending on how I structure my requirements.txt files).
h
Hi @brave-furniture-86963! Pants runs each test file in its own process, with its own requirements. So if a `python_tests` target has multiple sources, each one will run separately.
This allows each test file to run with just its actual requirements, and not be invalidated when unrelated requirements change.
But (assuming there are no conflicting versions involved) Pants does one pip resolve, and then composes the per-file requirements as a subset of that resolve.
So you shouldn't be seeing 31 slow pip runs: you should see 1 slow pip run and then 31 fast subset operations.
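If you want a rough sanity check on those numbers, count test files rather than targets. This is just a sketch, and it assumes your tests follow the default `test_*.py` / `*_test.py` naming:
```shell
# Each test file gets its own pytest process and its own subset PEX, so this
# count (not the number of python_tests targets) is what the number of
# pytest_runner.pex builds should roughly track.
find . -name 'test_*.py' -o -name '*_test.py' | wc -l
```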
b
Thanks Benjy. Trying to understand this through an example. If:
• the whole application uses 20 dependencies overall, and
• a test file is testing 1 function of that application, and
• that function only needs 3 dependencies,
then the test file will be run in a process that has only those 3 dependencies?
h
Correct! (plus the transitive dependencies of those 3 direct dependencies)
b
I see. So the granularity is defined at the test file level, regardless of how the test targets were set up?
h
Exactly
pytest is run separately on each file
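If you're curious what a given file's subset will contain, the `dependencies` goal shows it. The path below is just a placeholder, and the `--transitive` flag name is per the 2.7-era docs, so double-check `./pants help dependencies` on your version:
```shell
# Direct dependencies Pants computed for one test file (placeholder path):
./pants dependencies path/to/test_example.py

# The full transitive closure that gets subset into that file's requirements.pex:
./pants dependencies --transitive path/to/test_example.py
```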
b
I read that this behavior can be disabled so that the repository.pex is used for all test files. Can you please point me to how to disable it? Also, do you envision making the granularity configurable by the user? Maybe make it the test target level instead of the test file level? That way the user has control over the level of granularity.
How are data dependencies treated? Do they get copied over for each test process separately?
h
Re changing the execution granularity, we have no plans to do that at the moment. What would be the advantage of supporting that, in your case?
And re data dependencies, do you mean like `files()` and `resources()` targets?
b
Yes. Our test runtime went up 4x after switching to Pants. We have large data dependencies and we think they’re causing Pants to take too long to run the tests, since it might be copying them over to each test process. If the granularity were user-defined through test targets, then we could have one test target for all the tests using these heavy data dependencies.
h
Got it
@witty-crayon-22786 Any thoughts on this?
w
we’ve discussed having a setting to control the test granularity by batching multiple test files together into a run, and that might be a possibility here.
but first: @brave-furniture-86963: do you have a sense of how much time is spent resolving dependencies (building requirements.pex and pytest_runner) vs actually running tests?
and are you using a `constraints.txt` file as described here: https://www.pantsbuild.org/docs/python-third-party-dependencies#user-lockfile ?
b
We have pinned dependencies in our `requirements.txt` files already. Do we still need `constraints.txt`?
w
yes, unfortunately: as it stands, having a full transitive lockfile as described on that page is a significant performance boost. as mentioned there, we’re planning to fully automate the process in the next few weeks.
@brave-furniture-86963: but my first question is relevant in this case: if you’re spending a lot of time resolving, that would be the first thing to fix most likely. and using a lockfile helps with that.
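roughly, the recipe from that docs page looks like the sketch below. it assumes a single top-level requirements.txt and that the venv uses the same interpreter Pants will select, so adjust to your layout:
```shell
# build a full transitive pin set from the existing requirements.txt
python3 -m venv .lockenv
.lockenv/bin/pip install -r requirements.txt
.lockenv/bin/pip freeze --all > constraints.txt

# then point Pants at it in pants.toml (option name per the 2.7-era docs):
#   [python-setup]
#   requirement_constraints = "constraints.txt"
```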
b
Good to know. I’ll give `constraints.txt` a try and report back. Thanks!
Re “do you have a sense of how much time is spent resolving dependencies (building requirements.pex and pytest_runner) vs actually running tests?”:
With no cache, building requirements.pex and pytest_runner.pex takes 36 minutes. Running tests takes 57 minutes. Using `constraints.txt` helped cut the time of building pytest_runner.pex in half (18 min, down from 36 min). But when using `constraints.txt`, running tests fails (on a different test in both tries I did) with something like this:
```
Error expanding output globs: Failed to scan directory "/private/var/folders/ky/6q3_4s852_51v_3nm1f633b40000gq/T/process-executionfoFmYt/": No such file or directory (os error 2)
```
w
hm… which version of Pants is this?
b
2.7.0+git0f39c178
Note that the same branch passed in CI. The above failure happened on my laptop only.
w
interesting: the tag indicates that you’ve built Pants from source? that shouldn’t be relevant here, but.
did you change any other settings at the same time that you changed the constraints file? it shouldn’t really impact our capturing of the sandbox…
b
w
also, some more context on which process failed would be good: can you pass `--print-stacktrace` on your laptop, and include a bit more from above the error?
ah. got it, sorry.
b
```
Engine traceback:
  in select
  in pants.core.goals.test.run_tests
  in pants.core.goals.test.enrich_test_result (hcm/optimize/test/test_partition_optimizer.py:../../tests)
  in pants.backend.python.goals.pytest_runner.run_python_test (hcm/optimize/test/test_partition_optimizer.py:../../tests)
  in pants.engine.process.remove_platform_information
  in multi_platform_process
Traceback (no traceback):
  <pants native internals>
Exception: Failed to execute: Process {
    argv: [
        "./pytest_runner.pex_pex_shim.sh",
        "--cov-report=",
        "--cov-config=.coveragerc",
        "--cov=.",
        "hcm/optimize/test/test_partition_optimizer.py",
    ],
    env: {
        "PEX_EXTRA_SYS_PATH": ".",
        "PYTEST_ADDOPTS": "--color=yes",
    },
    working_directory: None,
    input_files: Digest {
        hash: Fingerprint<a3b4161d915c63fccbb4e68f12c4974a6bf3419e18b116a07c563dcf2f5dd491>,
        size_bytes: 1155,
    },
    output_files: {
        RelativePath(
            ".coverage",
        ),
    },
    output_directories: {
        RelativePath(
            "extra-output",
        ),
    },
    timeout: None,
    execution_slot_variable: None,
    description: "Run Pytest for hcm/optimize/test/test_partition_optimizer.py:../../tests",
    level: Debug,
    append_only_caches: {
        CacheName(
            "pex_root",
        ): CacheDest(
            ".cache/pex_root",
        ),
    },
    jdk_home: None,
    platform_constraint: None,
    is_nailgunnable: false,
    cache_scope: Successful,
}

Error expanding output globs: Failed to scan directory "/private/var/folders/ky/6q3_4s852_51v_3nm1f633b40000gq/T/process-executiony9aBUA/": No such file or directory (os error 2)
```
w
that is very strange indeed… and this happens consistently with a variety of tests? i’m wondering whether the test did something to delete the directory it was running in (unlikely, but…).
another flag that might be useful would be `--no-process-execution-local-cleanup`… would allow us to inspect that sandbox to see whether there was anything peculiar left inside of it.
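something like this, using the test from your traceback above:
```shell
# keep sandbox directories on disk after the run so they can be inspected
./pants --no-process-execution-local-cleanup test hcm/optimize/test/test_partition_optimizer.py
```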
b
What would I look for in the sandbox?
```
cd /private/var/folders/ky/6q3_4s852_51v_3nm1f633b40000gq/T/process-executionC9iwWM/
➜  process-executionC9iwWM ll | wc -l
      14
```
```
➜  process-executionC9iwWM ./pytest_runner.pex_pex_shim.sh my_test
.....
.....
=================================== 13 passed, 11 warnings in 2.36s =================================================================
```
w
Very strange. I can't think of any reason why it wouldn't be able to scan that. You might try temporarily setting
```
[GLOBAL]
local_execution_root_dir
```
... to a different temporary directory, and see whether tests will pass in that location?
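for a one-off check you can pass the same option on the command line. the directory value below is just a hypothetical example; any writable path outside /private/var/folders should work:
```shell
# global options like local_execution_root_dir can also be set as flags
./pants --local-execution-root-dir=/tmp/pants-sandbox test hcm/optimize/test/test_partition_optimizer.py
```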