# general
a
Hi! @happy-kitchen-89482 I got rid of all the warnings that complain `constraints.txt` does not include all the requirements, and now we have a master set of constraints. But why do we still see so many pytest runner pexes being built? (see inside thread)
15:08:41.57 [INFO] Completed: Extracting 10 requirements to build requirements.pex from repository.pex: apache-airflow-providers-amazon==2.1.0, apache-airflow==2.1.2, cachetools==4.1.1, future==0.18.2, pyyaml==5.4.1, setuptools<56... (99 characters truncated)
15:08:41.59 [INFO] Completed: Extracting 8 requirements to build requirements.pex from repository.pex: apache-airflow==2.1.2, future==0.18.2, pyyaml==5.4.1, setuptools<56.0,>=50.3.0, snowflake-sqlalchemy==1.2.3, sqlalchemy==1.3.24... (39 characters truncated)
15:08:41.61 [INFO] Completed: Extracting 11 requirements to build requirements.pex from repository.pex: apache-airflow==2.1.2, future==0.18.2, gspread==4.0.0, oauth2client==4.1.3, pandas<1.3.0,>=1.2.3, pyyaml==5.4.1, setuptools<56... (99 characters truncated)
15:08:41.76 [INFO] Completed: Extracting 17 requirements to build requirements.pex from repository.pex: apache-airflow-providers-amazon==2.1.0, apache-airflow==2.1.2, cachetools==4.1.1, future==0.18.2, pyodbc==4.0.30, pyyaml==5.4.... (244 characters truncated)
15:08:41.84 [INFO] Completed: Extracting 18 requirements to build requirements.pex from repository.pex: apache-airflow-providers-amazon==2.1.0, apache-airflow==2.1.2, cachetools==4.1.1, confluent-kafka==1.4.2, future==0.18.2, pyod... (268 characters truncated)
15:08:42.23 [INFO] Completed: Extracting 20 requirements to build requirements.pex from repository.pex: apache-airflow-providers-amazon==2.1.0, apache-airflow==2.1.2, cachetools==4.1.1, future==0.18.2, gspread==4.0.0, oauth2client... (303 characters truncated)
15:08:42.28 [INFO] Completed: Extracting 18 requirements to build requirements.pex from repository.pex: apache-airflow-providers-amazon==2.1.0, apache-airflow-providers-postgres==2.1.0, apache-airflow==2.1.2, cachetools==4.1.1, fu... (286 characters truncated)
15:08:42.28 [INFO] Completed: Extracting 12 requirements to build requirements.pex from repository.pex: apache-airflow==2.1.2, future==0.18.2, gspread==4.0.0, oauth2client==4.1.3, pandas<1.3.0,>=1.2.3, psycopg2-binary==2.9.1, pyya... (123 characters truncated)
15:08:43.77 [INFO] Completed: Extracting 21 requirements to build requirements.pex from repository.pex: apache-airflow-providers-amazon==2.1.0, apache-airflow==2.1.2, cachetools==4.1.1, confluent-kafka==1.4.2, distributed==2021.7.... (337 characters truncated)
15:08:53.73 [INFO] Completed: Extracting 13 requirements to build requirements.pex from repository.pex: dagit==0.12.8, dagster-aws==0.12.8, dagster-pyspark==0.12.8, dagster==0.12.8, grpcio<2.0,>=1.39.0, pandas<1.3.0,>=1.2.3, pyspa... (133 characters truncated)
15:08:53.78 [INFO] Completed: Extracting 13 requirements to build requirements.pex from repository.pex: dagit==0.12.8, dagster-aws==0.12.8, dagster-pyspark==0.12.8, dagster==0.12.8, grpcio<2.0,>=1.39.0, pandas<1.3.0,>=1.2.3, pyspa... (136 characters truncated)
Extracting 13 requirements to build requirements.pex from repository.pex
Would it be faster if repository.pex is just used for all these tasks?
h
Hey Horus, good question! We could do a better job explaining this in the docs, I think. This blog post talks about it under https://blog.pantsbuild.org/introducing-pants-2-5/ "Fine grained resolves without a performance hit". tl;dr: extracting the requirements should be pretty fast, and it gets cached. The benefit is that your cache key is much smaller, so changing an unrelated dependency won't invalidate all your tests that don't use it. And you can be confident your tests, and especially binaries/packages, only contain their "true" dependencies
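To make the cache-key point concrete, here is a toy sketch. It is not how Pants actually computes cache keys; `cache_key`, the pin set, and the `dagster` version bump are all made up for illustration. It only shows why fingerprinting the subset a test uses keeps the key stable when an unrelated requirement changes.

```python
from __future__ import annotations

import hashlib


def cache_key(pins: dict[str, str], used: set[str] | None = None) -> str:
    """Fingerprint the pinned requirements a consumer sees.

    Hypothetical helper: `used=None` models running against the whole
    repository.pex; a set of names models the extracted subset.
    """
    names = sorted(pins if used is None else used)
    payload = "\n".join(f"{name}=={pins[name]}" for name in names)
    return hashlib.sha256(payload.encode()).hexdigest()


# A small, made-up slice of a constraints file.
constraints = {
    "apache-airflow": "2.1.2",
    "pyyaml": "5.4.1",
    "gspread": "4.0.0",
    "dagster": "0.12.8",
}

# Requirements one (hypothetical) test target actually depends on.
airflow_test_deps = {"apache-airflow", "pyyaml"}

# Bump a requirement the test does not use (illustrative version number).
bumped = {**constraints, "dagster": "0.12.9"}

# Subset key: unaffected by the unrelated bump.
subset_unchanged = (
    cache_key(constraints, airflow_test_deps) == cache_key(bumped, airflow_test_deps)
)
# Whole-repository key: changes whenever any pin changes.
whole_unchanged = cache_key(constraints) == cache_key(bumped)

print(subset_unchanged, whole_unchanged)
```

With the subset, the test's key survives the unrelated bump; with the whole repository, every consumer's key changes.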
h
Basically those `Extracting 13 requirements to build requirements.pex from repository.pex` should be really fast, because they're not running any pip resolves, they're just subsetting the full `repository.pex`
And the resulting subsets should be cached
although whether that helps you in CI depends on whether you're conserving the cache across CI runs
If you're not seeing that those are fast, then we should look into that
but also, we may want to offer an option to run directly against `repository.pex` as you suggest, for cases where that yields better performance.
I've suggested this recently
@witty-crayon-22786 ^
It may be that, especially in CI cases, or cases where large numbers of tests tend to invalidate a lot for other reasons anyway, subsetting is not worth it
and we should give users that option
w
2.7.x makes the situation significantly better: see https://github.com/pantsbuild/pants/pull/12675
as discussed there, directly using the `repository.pex` would result in inconsistencies between test time and binary build time. so if there were an option to use it, it would need to be disabled by default
šŸ‘ 1
ā˜ļø 1
An alternative to this change would have been to remove the “subsetting” step entirely, and to directly use the entire `repository.pex` in consumers (even if they only used a small fraction of it). But that would have the disadvantage that any change to any requirement would invalidate all consumers in a repository, regardless of whether the requirement was relevant to them, and would additionally violate hermeticity by allowing, for example, `test`s to observe resources that they don’t have dependencies on.
one of our goals is for `./pants test ::` to build only changed code, both locally and in CI, without any manual steps to calculate which targets to run on
➕ 1
it makes your CI setup simpler, and easier to introspect
but in order to get good cache hit rates, we need to avoid invalidating things unnecessarily
h
That would definitely be a performance/correctness tradeoff, but it's one we might want to let users make.
That said, let's see if 2.7.x improves things
w
@ambitious-student-81104: `2.7.0rc1` is out, and is likely stable enough for experimenting with. it makes the subsetting significantly faster/more-space-efficient
🙌 1
✅ 1