We're suddenly seeing `./pants package ::` take a ...
# general
s
We're suddenly seeing `./pants package ::` take a lot longer than it used to in CI. We have a case where it took 14 hours and crashed with
35d72d/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 223, in _main
    status = self.run(options, args)
  File "/root/.cache/pants/named_caches/pex_root/venvs/18c001f81959497d649cc48967f7f8493f48767e/cba9b10390762edd539c75c9f4b40c564335d72d/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 180, in wrapper
    return func(self, options, args)
  File "/root/.cache/pants/named_caches/pex_root/venvs/18c001f81959497d649cc48967f7f8493f48767e/cba9b10390762edd539c75c9f4b40c564335d72d/lib/python3.10/site-packages/pip/_internal/commands/download.py", line 130, in run
    requirement_set = resolver.resolve(
  File "/root/.cache/pants/named_caches/pex_root/venvs/18c001f81959497d649cc48967f7f8493f48767e/cba9b10390762edd539c75c9f4b40c564335d72d/lib/python3.10/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 121, in resolve
    self._result = resolver.resolve(
  File "/root/.cache/pants/named_caches/pex_root/venvs/18c001f81959497d649cc48967f7f8493f48767e/cba9b10390762edd539c75c9f4b40c564335d72d/lib/python3.10/site-packages/pip/_vendor/resolvelib/resolvers.py", line 453, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/root/.cache/pants/named_caches/pex_root/venvs/18c001f81959497d649cc48967f7f8493f48767e/cba9b10390762edd539c75c9f4b40c564335d72d/lib/python3.10/site-packages/pip/_vendor/resolvelib/resolvers.py", line 364, in resolve
    raise ResolutionTooDeep(max_rounds)
pip._vendor.resolvelib.resolvers.ResolutionTooDeep: 2000000
I think this is due to us changing dependencies from latest to `google-api-core~=2.8.2`, `google-cloud-bigquery-storage~=2.16.0`, `google-cloud-bigquery~=3.3.3`, `google-cloud-pubsub~=2.13.7`, `google-cloud-secret-manager~=2.12.4`, `google-cloud-storage~=2.7.0`, and `google-cloud-dataflow-client~=0.5.5` so that it works with `apache-beam==2.44.0`.
We did this because the Google APIs wanted `protobuf>=3.19.5` but Apache Beam wants `protobuf<3.19.5`
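(Editor's sketch, not from the thread.) The conflict described here can be shown with a few lines of Python. This is a minimal illustration that only understands plain dotted versions and the two operators quoted above; the helper names and the candidate list are hypothetical, and pip's real version handling is considerably more involved.

```python
def parse_version(text):
    """Turn a plain dotted release such as '3.19.5' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def satisfies(candidate, clauses):
    """Return True if `candidate` meets every (operator, version) clause."""
    checks = {
        ">=": lambda a, b: a >= b,
        "<": lambda a, b: a < b,
    }
    cand = parse_version(candidate)
    return all(checks[op](cand, parse_version(bound)) for op, bound in clauses)


# The two protobuf requirements from the message, combined the way a
# resolver has to combine them when both packages are requested together.
combined = [(">=", "3.19.5"), ("<", "3.19.5")]

# Hypothetical candidate list, for illustration only (not an index listing).
candidates = ["3.19.4", "3.19.5", "3.19.6", "3.20.0"]

compatible = [v for v in candidates if satisfies(v, combined)]
print(compatible)  # empty list: no version is both >= 3.19.5 and < 3.19.5
```

Because the two ranges do not overlap, no amount of searching can succeed, which is consistent with the resolver eventually giving up.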
w
are you using a lockfile?
s
We are not currently using a lockfile
We made those changes so that `./pants export ::` would work again. We didn't realize we broke it initially when we added apache-beam
I don't know if this helps but the pex failing to package has
yjabri@remote-yjabri-default:~/archipelago$ ./pants dependencies --transitive archipelago/:cli
//:reqs#SQLAlchemy
//:reqs#db-dtypes
//:reqs#duckdb
//:reqs#duckdb-engine
//:reqs#google-api-core
//:reqs#google-cloud-bigquery
//:reqs#google-cloud-pubsub
//:reqs#google-cloud-secret-manager
//:reqs#requests
//:reqs#sqlalchemy-bigquery
//:reqs#typer
//:reqs-dev#setuptools
//:reqs-dev#types-requests
w
We are not currently using a lockfile
i would definitely recommend using a lockfile. that would mean that you only pay the cost of resolving once when you build the lockfile, rather than every time you package
h
Well, but it shouldn't take 14 hours to generate the lockfile
s
I want to use a lockfile. We haven't adopted one yet because it introduced problems for folks with x86 Macs around tensorflow / tensorflow-macos dependencies (funny how the tides have turned, M1s were fine)
Is there anything I can pass to the package command to get more insight into where it's maybe getting stuck? Passing `-ldebug` like `./pants package -ldebug archipelago/:cli` doesn't provide too much more information
If I revert* the dependencies back to
google-api-core~=2.11.0
google-cloud-bigquery-storage~=2.16.2
google-cloud-bigquery~=3.3.5
google-cloud-pubsub~=2.13.10
google-cloud-secret-manager~=2.12.6
google-cloud-storage~=2.5.0
google-cloud-dataflow-client~=0.8.2
it packages in about 40 seconds
h
hmm a lockfile should be able to support multiple platforms normally, but tensorflow is known-to-be-crazy
s
The way I found the updated versions was by leaving the version numbers off and then exporting and doing a `./dist/export/.../bin/pip freeze | grep google-`
h
Oh wow, so something is really pathological there, 40 seconds to 14 hours...
s
So if I use `==` instead of `~=` it works too
I couldn't articulate why, but that feels like a bad idea
I believe this repo https://github.com/yjabri/slow-pants reproduces the issue
By no means is that example minimal. I'll try to reduce the dependencies / add a description of what's going wrong in the readme and post that when I have a chance
h
That `pip._vendor.resolvelib.resolvers.ResolutionTooDeep` says that pip itself is failing on this resolve
That implies that the resolve you want is impossible
I would suggest cutting Pants out entirely for debugging purposes and trying this resolve with pure pip
But this seems like not a pants issue per se?
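(Editor's sketch, not from the thread.) One way to try the resolve with pure pip, as suggested here, is to call `pip download` directly, since the traceback above shows that command is where the resolve was running. The sketch below wraps it in a timeout so a pathological resolve does not run for hours. The function names, the `requirements.txt` path, and the 10-minute default are assumptions for illustration.

```python
import subprocess
import sys
import tempfile


def build_pip_download_command(requirements_file, dest_dir):
    """Build a `pip download` invocation for the current interpreter."""
    return [
        sys.executable, "-m", "pip", "download",
        "--requirement", requirements_file,
        "--dest", dest_dir,
    ]


def resolve_with_pip(requirements_file, timeout_seconds=600):
    """Run the resolve outside Pants.

    Returns pip's exit code, or None if the timeout expired first.
    """
    with tempfile.TemporaryDirectory() as dest_dir:
        command = build_pip_download_command(requirements_file, dest_dir)
        try:
            completed = subprocess.run(
                command,
                capture_output=True,
                text=True,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            return None
        return completed.returncode
```

Calling `resolve_with_pip("requirements.txt")` and getting `None` back would suggest the slowness comes from pip's resolver itself rather than from Pants.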
s
Sorry for the delay. To clarify, the original set of versions we requested was impossible, that's fair. What's particular about pants in this scenario is that after updating the requirements to be compatible, pants takes a really long time to package. We solved this by specifying `==` in the requirements.txt file instead of `~=` for the packages listed above. If I had to throw out a wild guess (and it is a wild guess, so it's probably dumb), package resolution is a really hard problem, with `~=` leading to an exponential number of options, and maybe whatever branch-cutting algo is in place doesn't work well with this set of requirements. The surprising thing to me as a user of pants is how sensitive build times are to (what I'd consider small) variations of versions
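(Editor's sketch, not from the thread.) The guess about `~=` widening the search can be made concrete with a toy count. The code below implements only the simple numeric case of the compatible-release operator (`~=X.Y.Z` meaning `>=X.Y.Z` within the `X.Y.*` series) and counts how many version combinations a resolver might have to consider in the worst case. Package names and version lists are made up, and pip's resolver prunes far more cleverly than a brute-force product.

```python
from math import prod


def parse_version(text):
    """Turn a plain dotted release such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def matches(spec, candidate):
    """Check a candidate against '==X.Y.Z' or '~=X.Y.Z' (numeric versions only)."""
    operator, bound = spec[:2], parse_version(spec[2:])
    cand = parse_version(candidate)
    if operator == "==":
        return cand == bound
    if operator == "~=":
        # Same release series (all but the last component), and not older.
        return cand >= bound and cand[: len(bound) - 1] == bound[:-1]
    raise ValueError(f"unsupported specifier: {spec}")


def combination_count(requirements, available):
    """Worst-case number of version combinations across all requirements."""
    return prod(
        sum(matches(spec, version) for version in available[name])
        for name, spec in requirements.items()
    )


# Hypothetical packages and releases, for illustration only.
available = {
    "pkg-a": ["1.4.0", "1.4.1", "1.4.2", "1.4.3", "1.5.0"],
    "pkg-b": ["2.0.0", "2.0.1", "2.0.2", "2.1.0"],
    "pkg-c": ["0.9.0", "0.9.1", "0.9.2", "0.9.3"],
}
compatible_release = {"pkg-a": "~=1.4.1", "pkg-b": "~=2.0.0", "pkg-c": "~=0.9.1"}
pinned = {"pkg-a": "==1.4.1", "pkg-b": "==2.0.0", "pkg-c": "==0.9.1"}

print(combination_count(compatible_release, available))  # 27
print(combination_count(pinned, available))              # 1
```

With only three direct requirements the gap is 27 to 1; each additional `~=` requirement, and every transitive dependency it pulls in, multiplies the space further, which may be part of why exact pins resolved so much faster here.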
d
Interesting, we're running into this too, specifically with `google-cloud-bigquery-storage`
if i remove that dependency, package times are around 2 mins, if I add it back, it never finishes
h
And what if you pin that dep in requirements.txt to a specific single version?
d
it was always pinned but it was pinned to an incompatible version! I pinned it to something compatible with apache-beam and also pinned apache-beam to a more specific version. Those 2 together fixed it
I'm surprised pip doesn't error earlier letting you know it's having issues resolving your requested deps... I wonder if there's a pip setting to adjust how long it tries to resolve for before failing
h
Newer pip might be better at this. In 2.16 you can force a higher version via https://www.pantsbuild.org/v2.16/docs/reference-python#pip_version
d
ah thx for the link, will file it away for when we start build system integration 🙂 Got some cleanup to do of our monorepo before we can jump ship
Seems pip 23.1 probably fixes resolution taking forever: https://discuss.python.org/t/announcement-pip-23-1-release/25844