Using pants 2 here... if I wanted to ensure that a...
# general
a
Using pants 2 here... if I wanted to ensure that a 3rd party dependency is installed before another 3rd party dependency, would this be the correct syntax in the `PANTS_2` file:
python_requirement(
  name='gdal',
  requirements=['gdal==3.6.4'],
  dependencies=[":numpy"],
)
???
r
Yeah it should work. I assume you have defined `numpy` similarly somewhere
a
Yes:
python_requirement(
  name='numpy',
  requirements=['numpy==1.23.5'],
)
However, I can tell that it's not honoring the order because the `gdal_array` functions are not getting installed (they typically do not if `numpy` hasn't been installed at the time that `gdal` is being installed).
r
And numpy isn't a transitive dependency for gdal?
a
Not sure what you mean by this question...
r
Normally a python package has its own dependencies (called transitive dependencies), as defined in its setup.py or pyproject.toml. In this case, most probably `gdal_array` doesn't declare its dependency on `numpy` explicitly, so you are forcing it to depend on `numpy` this way. I don't know much about `gdal_array`, hence the question.
a
Ah. Well, the `gdal_array` package gets built / compiled if `numpy` is available when `gdal` is installed. `numpy` is not a dependency inside of `gdal` (which is frustrating)... but if `numpy` happens to be installed, then the `array` functions (via `gdal_array`) are built. Hence trying to force `numpy` to get installed before `gdal` is installed....
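One quick way to confirm whether a given environment ended up with array support is to try the import directly. This is only a minimal sketch: it assumes the bindings expose array support as `osgeo.gdal_array`, and the helper name `can_import` is made up for illustration.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if module_name imports cleanly in the current interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and a missing compiled extension.
        return False
    return True


if __name__ == "__main__":
    # Assumption: GDAL's Python bindings expose array support as osgeo.gdal_array.
    for name in ("numpy", "osgeo.gdal", "osgeo.gdal_array"):
        print(f"{name}: {'ok' if can_import(name) else 'missing'}")
```

Run with the interpreter of the environment in question, this shows at a glance whether `gdal` was built with or without `numpy` present.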
r
Does it say that it can't find `numpy`? I tried to do a pip install directly, and it shows a warning and then fails with an error about some missing config:
× python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [86 lines of output]
      WARNING: numpy not available!  Array support will not be enabled
....
      raise gdal_config_error(e)
      gdal_config_error: [Errno 2] No such file or directory: 'gdal-config'

      Could not find gdal-config. Make sure you have installed the GDAL native library and development headers.
      [end of output]
a
How did you get this ☝️ output?
r
This is without pants. I just did `pip install gdal`
a
ah
So, GDAL is a beast... hah... w/pants, it appears to be a 🐔 & 🥚 scenario. I pull GDAL and build it locally (it's a C++ library) and it installs correctly into the system python (3.11). NOTE that this python already has `numpy` installed before GDAL is built. Then, when I use pants to build, the system GDAL and libraries are already installed, but because it's a new virtual environment, GDAL has to be reinstalled [the environment variables, including the LD_LIBRARY ones, ensure that GDAL is locatable]. The issue is that the virtual environment that pants builds doesn't install numpy before gdal... 😐
r
It looks like it needs more stuff than just numpy? You will have to feed that info to pants too, although I am a bit unsure how you would do that.
Maybe you need to set up those `LD_LIBRARY` envs as described here: https://www.pantsbuild.org/docs/reference-subprocess-environment#advanced-options. Inside `pants.toml`:
[subprocess-environment]
env_vars.add = ["<insert_LD_LIB_var>"]
a
Is there a way to get verbose outputs during this phase:
... ⠋ 79.86s Building requirements.pex with 44 requirements
a
Tried many options, and not seeing additional detail as I would expect... 😞
Is there a way to say... install `numpy` before doing anything else?
b
Dependencies on a requirement are a pants-level thing, at run-time, saying that any time your first-party code uses `gdal`, pants will also make sure that `numpy` is available. I don't know of a way to carefully control the environment in which pex (which pants uses for python requirements) runs pip to build the package into wheels (these wheels are then used as the actual runtime packages). I believe one option for finicky package-building problems is to pre-build wheels/bdists, publish them to an internal index, and have pants/pex consume those wheels rather than the PyPI sdists.
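The pre-built wheel route could look roughly like this. It is a sketch, not a tested recipe: it assumes it runs under an interpreter that already has `numpy`, `setuptools` and `wheel` installed, plus the GDAL native library and headers, and the helper names are hypothetical. `--no-build-isolation` tells pip to build against the current environment rather than a fresh isolated one, which is what lets the build see `numpy`.

```python
import subprocess
import sys
from pathlib import Path


def pip_wheel_command(requirement: str, wheel_dir: Path) -> list[str]:
    """Build the pip invocation that compiles one requirement into a wheel."""
    return [
        sys.executable, "-m", "pip", "wheel",
        requirement,
        "--no-build-isolation",  # build in the current env, where numpy already is
        "--no-deps",             # only build this package, not its dependencies
        "--wheel-dir", str(wheel_dir),
    ]


def build_wheel(requirement: str, wheel_dir: Path) -> None:
    """Run pip to build the wheel; raises CalledProcessError if the build fails."""
    wheel_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(pip_wheel_command(requirement, wheel_dir), check=True)
```

Calling `build_wheel("gdal==3.6.4", Path("wheelhouse"))` (the directory name is a placeholder) should leave a wheel in that directory, which can then be published to whatever internal index pants is pointed at.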
e
Confirming at a Pex level you definitely cannot say install x before y. Pex installs every distribution in isolation and only later combines them to form a venv or PEX file.
With PEP-518 (https://peps.python.org/pep-0518/) being firmly established now in the Python ecosystem, there is no excuse for a project that needs something like numpy installed in advance to not use it, i.e. to throw down a minimal pyproject.toml declaring the build requirements. It automatically just works with modern Pex, Pip and any other modern Python build tool, and has no other impact on even the crustiest of setup.py. If the project(s) aren't open to doing this, you can always fork (if the projects are open, of course), do it yourself, and switch to a VCS requirement until they accept your patch.
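For the fork route, the file in question can be tiny. Below is a sketch that writes one into a checkout of the fork, next to its `setup.py`. The `requires` list is an assumption (`setuptools` and `wheel` as the usual setuptools build requirements, plus `numpy` so it is present when `setup.py` runs), and the helper names are made up for illustration.

```python
from collections.abc import Sequence
from pathlib import Path

# Assumed build requirements; adjust (and pin) to what the project actually needs.
BUILD_REQUIRES = ("setuptools", "wheel", "numpy")


def render_pyproject(requires: Sequence[str]) -> str:
    """Render a minimal PEP 518 [build-system] table as TOML text."""
    quoted = ", ".join(f'"{req}"' for req in requires)
    return f"[build-system]\nrequires = [{quoted}]\n"


def write_pyproject(project_dir: Path, requires: Sequence[str] = BUILD_REQUIRES) -> Path:
    """Write pyproject.toml into project_dir and return its path."""
    target = project_dir / "pyproject.toml"
    target.write_text(render_pyproject(requires), encoding="utf-8")
    return target
```

With that file committed to the fork, the build frontend installs everything listed in `requires` into the build environment before running `setup.py`, so no install ordering is needed on the pants side.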