# development
h
Ugh, this again, in CI:
psutil/_psutil_common.c:9:10: fatal error: Python.h: No such file or directory
Why is it building psutil from source anyway? There are wheels: https://pypi.org/project/psutil/5.9.8/#files
and the “build wheels” step in this job does use the appropriate psutil wheel (psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl)
Hmm, this may be because this job runs in a container with the image `quay.io/pypa/manylinux_2_28_x86_64:latest`, so we are at the mercy of whatever is in that `:latest`, which was updated 4 days ago…
But weirdly I can’t reproduce this in the same image it fails on in prod
😢 1
b
Hm, potential hints:
• a bump-version build on `2.26.x` passed: https://github.com/pantsbuild/pants/pull/22226
• a similar bump-version on main (2.27) failed: https://github.com/pantsbuild/pants/pull/22227
h
My debugging so far shows that `yum install -y python3-devel` “fixes” this, but I’m not really satisfied with that. I want to get to the bottom of this, and particularly why I can’t reproduce it in the same image on an x86 machine.
So I’m doing my favorite thing: ssh’ing into a running container in CI…
😭 1
b
thanks for digging
h
Both runs you linked to above ran on the same image version (2e6c7d1d84)
And the same runner image
And the same docker version
Uuuuuuuuuuugh
Running the same command a second time after the failure passes…
It’s one of those bugs…
So now my suspicion falls on the awesomely-numbered https://github.com/pantsbuild/pants/pull/22222
😄 1
which is present in main but not on 2.26.x
From my digging so far, this seems to be because in some specific confluence of events we’re using `/usr/bin/python3.11` (an old 3.11.11) instead of `/opt/python/cp311-cp311/bin/python`, which is a newer 3.11.12, and the one we actually want (and that has that header)
But I’d like to dig still deeper and see what’s up with this interpreter selection
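A minimal sketch for checking which interpreter actually ships `Python.h`, assuming both interpreters can run a one-line `sysconfig` query. The helper names (`include_dir_for`, `has_python_header`, `report`) are made up for illustration and are not part of Pants or pex.

```python
import os
import subprocess


def include_dir_for(interpreter: str) -> str:
    """Ask the given interpreter where it expects its C headers to live."""
    result = subprocess.run(
        [interpreter, "-c", "import sysconfig; print(sysconfig.get_paths()['include'])"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def has_python_header(include_dir: str) -> bool:
    """True if Python.h is really present in that include directory."""
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


def report(interpreters: list[str]) -> None:
    """Print one line per interpreter: its include dir and whether Python.h exists."""
    for interpreter in interpreters:
        include_dir = include_dir_for(interpreter)
        status = "present" if has_python_header(include_dir) else "MISSING"
        print(f"{interpreter}: {include_dir} (Python.h {status})")
```

Inside the container this could be called as `report(["/usr/bin/python3.11", "/opt/python/cp311-cp311/bin/python"])`. An interpreter reported as missing the header is one that cannot compile the psutil sdist without something like `python3-devel` installed.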
b
Ah good question re 22160. I've opened https://github.com/pantsbuild/pants/pull/22228 that is that PR + a change to force wheel building.
h
Well that passed!
And can confirm that reverting https://github.com/pantsbuild/pants/pull/22222 fixes this
b
Good find. I can merge 22160 to paper over the potential pex bug? Or shall we revert 22222 for now?
h
Papering over is fine, I have what I need to investigate this
I can now reproduce it in a local docker container, so at least that level of sanity has been restored
😅 1
oooooohhh
That PR removed the linux wheels from the lockfile for some reason
👍 1
So this likely isn’t due to a change in pex, but due to our introducing a need to build the sdist in the first place
And we need to figure out why that happened
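A rough sketch of the check that would flag this: given the artifact URLs locked for one package, does any wheel cover Linux, or is the sdist the only option there? This assumes the artifact URLs have already been pulled out of the lockfile (the extraction step is not shown), and it only looks at the platform tag, not at Python or ABI compatibility. The function names and the example URLs are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def artifact_filename(url: str) -> str:
    """Return the file name portion of an artifact URL."""
    return PurePosixPath(urlparse(url).path).name


def is_linux_usable_wheel(filename: str) -> bool:
    """True for wheels whose platform tag covers Linux (or is platform-independent)."""
    if not filename.endswith(".whl"):
        return False
    # Wheel names end in ...-{python tag}-{abi tag}-{platform tag}.whl, and the
    # platform tag may be a dot-separated set of several tags.
    platform_tags = filename[: -len(".whl")].split("-")[-1].split(".")
    return any(tag == "any" or "linux" in tag for tag in platform_tags)


def needs_sdist_build_on_linux(artifact_urls: list[str]) -> bool:
    """True if no locked artifact is a wheel that could be installed on Linux."""
    return not any(is_linux_usable_wheel(artifact_filename(u)) for u in artifact_urls)


# Illustrative inputs only: the host is a placeholder, not a real index URL.
with_linux_wheel = [
    "https://example.invalid/psutil-5.9.8.tar.gz",
    "https://example.invalid/psutil-5.9.8-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]
without_linux_wheel = [
    "https://example.invalid/psutil-5.9.8.tar.gz",
    "https://example.invalid/psutil-5.9.8-cp38-abi3-macosx_11_0_arm64.whl",
]
```

With these inputs, `needs_sdist_build_on_linux(with_linux_wheel)` is `False` and `needs_sdist_build_on_linux(without_linux_wheel)` is `True`. The second case is the situation described above, where the build has to fall back to compiling the sdist.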
It is still slightly mysterious why this fails with that gcc error only the first time you run it on a clean cache, and succeeds if you rerun it. But as far as I can tell it’s because pex is building the wheel for two 3.11 interpreters: one succeeds, the other fails, but on a subsequent run the successful wheel is cached and then used
So I am going to consider this figured out
Well, apart from figuring out how we messed up the lockfile
b
nice debugging, thanks for doing that
h
@curved-manchester-66006 see my question here: I’m trying to figure out why that PR removed those wheels from the lockfile. Interestingly, syncing to before that PR and regenerating does not restore them
c
(I think I figured this out; but will take a while before I can write it all up tonight)
🙏 2