# general
b
So I'm running into the following issue when running the pants generate-lockfiles goal. The following error occurs:
Expected sha256 hash of 2c040ffa5d5308c0e953c6f5b1dad0aa640665e6684d14404424b7d2fb9d9f24 when downloading pyyaml but hashed to 6516bdba655fe2470973f7182220cb89414bdbc8b8ed7f19083412cc33efe9df.
This particular package was built from the sdist using a gcc7/python38 runtime in a separate process, because the system architecture we are using does not have a prebuilt binary for it. This package is then uploaded to our artifactory instance. As a part of our resolves we have it added into a requirements.txt. A pip download of the package completes without error, but the generate-lockfiles goal does not. This appears to be similar to the following issue https://github.com/pantsbuild/pants/issues/17626, but not exactly the same. Since these are built directly from the sdists rather than the github source I don't have a whole lot of control over how the build occurs. Details: pants 2.18.rc4, scie-pants 0.10.4. Runtime used to compile the pyyaml uses gcc7, python 3.8.16. Current environment for pants uses the runtime defined in the ptext. The python version for the `python3` command is in a virtualenv and uses python 3.8.16.
Also note, this did work in the 2.17 release.
b
Some background questions: 1. Which version of pyyaml? 2. Do you customise the version of PEX in your pants.toml? 3. Does your repo include either of those hashes as text anywhere?
b
Update on this: We had a set of two repos which contained the same PyYAML artifact. Since the hash is computed using the location (URL) plus the content, the sha of the downloaded artifact did not match the pip hash computation. So it was a race to see which repo would download the artifact, put it into the pip cache, and see if the checksums worked. We do need to use both repos, so if there is a way to work around this, it would be good to know. I'm not sure how the system will work in the case of primary and mirror indexes if what we are seeing is the case.
So the version didn't matter, but rather which repo it happened to get the wheel from. But like I said, this means that mirrors may run into an issue if the calculation includes remote location.
@broad-processor-92400 The pyyaml version was 3.13 and was built from the .zip. We do no customizations on the PEX in the pants.toml. And the repos don't have those hashes as text files. It is a metadata property on the artifact, but looking at pip source code I don't see where that is computed.
b
Hm, I was under the impression the hash is just of the file, and the source doesn’t matter. Can you manually download both artifacts and confirm that they have the same hash?
b
lemme try that really quick.
jmello@jmello-718:~/Downloads$ pip hash one/PyYAML-3.13-cp38-cp38-linux_x86_64.whl
one/PyYAML-3.13-cp38-cp38-linux_x86_64.whl:
--hash=sha256:727bb5cd2a51d2a8cb1f2b096f05ceb998144d70d238639056a2eb62e90a944a
jmello@jmello-718:~/Downloads$ pip hash two/PyYAML-3.13-cp38-cp38-linux_x86_64.whl
two/PyYAML-3.13-cp38-cp38-linux_x86_64.whl:
--hash=sha256:2c040ffa5d5308c0e953c6f5b1dad0aa640665e6684d14404424b7d2fb9d9f24
exact same artifact published to both repos at the same time
I won't discount some goofiness with artifactory either. But at this point it's weird
b
Hm, can you hash the file outside of pip just to confirm that the files are exactly identical?
b
sure.
b
Actually, never mind: `pip hash`'s implementation is here: https://github.com/pypa/pip/blob/fd77ebfc742de4d76ff976de22e86d116e0faad3/src/pip/_internal/commands/hash.py#L40-L59 It is just hashing the file contents, so given the output is different for those two files, that strongly suggests the wheels in the two index repos are different.
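For anyone who wants to double-check outside of pip, here is a minimal sketch of a content-only SHA-256 over a file, using only the standard library. It is not pip's code; the function name `sha256_of_file` and the chunk size are made up for illustration. If the hash depends only on the bytes, this should print the same hex digest that `pip hash` reports for each wheel.

```python
import hashlib
import sys


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of the file's bytes (hypothetical helper, not pip's API)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large wheels do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage: python hash_files.py one/PyYAML-...whl two/PyYAML-...whl
    for path in sys.argv[1:]:
        print(f"{path}: sha256:{sha256_of_file(path)}")
```

Nothing about the download URL goes into the digest here, so two different outputs mean the bytes on disk differ.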
b
roger on that. just verified that this is the case and they are not the same. I'm really stumped and am going to chalk it up to artifactory. These were uploaded by twine at the same time from the same artifact. But it would still be interesting to see what happens when a primary and mirror are used for the generate-lockfiles target though.
b
If you wanted to dig into the differences (you may already be aware of this), `whl` files are zips, so one can unzip and compare the contents directly. (Although it could be other issues like different compression settings or something, if artifactory recompresses them.)
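A rough sketch of that comparison, assuming both wheels are downloaded locally. The helper names `member_digests` and `compare_wheels` are hypothetical; only the standard `zipfile` and `hashlib` modules are used. It separates "the uncompressed file contents differ" from "the contents match but the zip entry metadata (compression type, compressed size, timestamp) differs", which is what a re-zip would look like.

```python
import hashlib
import zipfile


def member_digests(wheel_path):
    """Map member name -> (sha256 of uncompressed bytes, zip entry metadata)."""
    result = {}
    with zipfile.ZipFile(wheel_path) as zf:
        for info in zf.infolist():
            content_hash = hashlib.sha256(zf.read(info)).hexdigest()
            metadata = (info.compress_type, info.compress_size, info.date_time)
            result[info.filename] = (content_hash, metadata)
    return result


def compare_wheels(path_a, path_b):
    """Report which members differ in content and which differ only in zip metadata."""
    a, b = member_digests(path_a), member_digests(path_b)
    report = {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "content_differs": [],
        "metadata_differs": [],
    }
    for name in sorted(set(a) & set(b)):
        if a[name][0] != b[name][0]:
            report["content_differs"].append(name)
        elif a[name][1] != b[name][1]:
            report["metadata_differs"].append(name)
    return report


if __name__ == "__main__":
    import sys

    # Usage: python compare_wheels.py one/PyYAML-...whl two/PyYAML-...whl
    for key, names in compare_wheels(sys.argv[1], sys.argv[2]).items():
        print(key, names)
```

If everything lands under `metadata_differs` and nothing under `content_differs`, that would be consistent with the archive having been extracted and recompressed somewhere along the way, though it would not prove which tool did it.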
b
yeah, I would, but I'm really pressed for time. I do think it may be due to artifactory. It does a scan against the wheels uploaded and it may be doing an extract and re-zip
👍 1