<@U04S45AHA>: can `pex-tools … extract` be convinc...
# development
w
@enough-analyst-54434: can
pex-tools … extract
be convinced to extract for multiple interpreters at once?
ok, nevermind: had another realization, and won’t need to use
extract
e
Ok. The answer, though, was no with the current code. You'd have to add a flag to the tool and do some code.
w
for posterity, my realization was that the
repository.pex
is not the issue (which should have been obvious as of Monday, but it took some time to sink in): rather, the fact that we computed so many overlapping subsets of it.
in TC, the
repository.pex
is ~50MB, but 94 overlapping subsets of that added up to 2.8GB =P
so will skip the extract-to-wheels step, and only decompose the
requirements.pex
subsets
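For a sense of scale, here is a back-of-the-envelope check of the numbers above. It is only a rough sketch: the sizes are the approximate figures quoted in the thread, and treating 1GB as 1000MB is an assumption.

```python
# Approximate figures quoted in the thread; nothing here is measured.
REPOSITORY_PEX_MB = 50          # repository.pex in TC is ~50MB
SUBSET_COUNT = 94               # number of overlapping subsets
SUBSETS_TOTAL_MB = 2.8 * 1000   # 2.8GB, assuming 1GB = 1000MB

# Average size of one subset, and how many times over the repository
# contents are effectively stored once all subsets are materialized.
avg_subset_mb = SUBSETS_TOTAL_MB / SUBSET_COUNT
duplication_factor = SUBSETS_TOTAL_MB / REPOSITORY_PEX_MB

print(f"average subset: ~{avg_subset_mb:.0f}MB")
print(f"subsets total ~{duplication_factor:.0f}x the size of repository.pex")
```

So each subset averages more than half of the full repository, which is why the overlap dominates rather than the size of `repository.pex` itself.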
@enough-analyst-54434: is
pex-tools … graph
impacted by the interpreter used to extract? i.e., might different interpreters result in different graphs?
e
Yes, ~all pex tools build a PEX and, instead of running it, ask it to resolve(). That means those tools all act on the interpreter you're running the tool with.
I'm missing how any cache-friendly scheme can get around extracting wheels. Are you just manually extracting subsets of the zip by knowing about
.deps/<whl name>/
directory structure in the PEX file?
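The manual approach asked about here could look roughly like the sketch below. It assumes the PEX is a plain zip file using the `.deps/<whl name>/` layout mentioned above; `list_embedded_dists` is a hypothetical helper, not part of Pex, and it does nothing about PEX-INFO.

```python
import zipfile

DEPS_PREFIX = ".deps/"  # layout from the thread: .deps/<whl name>/


def list_embedded_dists(pex_path):
    """Return the sorted <whl name> directory names found under .deps/ in a PEX zip."""
    names = set()
    with zipfile.ZipFile(pex_path) as zf:
        for entry in zf.namelist():
            if entry.startswith(DEPS_PREFIX):
                # First path component after ".deps/" is the wheel directory name.
                whl_name = entry[len(DEPS_PREFIX):].split("/", 1)[0]
                if whl_name:
                    names.add(whl_name)
    return sorted(names)
```

Usage would be something like `list_embedded_dists("requirements.pex")`; extracting a subset would then mean copying only the entries under the chosen wheel directories.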
w
--pex-repository=.. --no-transitive $requirement
to build single-entry PEXes for all requirements in the closure of your root requirements is what i had been doing. have it mostly working, but am trying to see whether i need to special case
local_only
.
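A rough sketch of that per-requirement loop, for anyone following along. It only assembles the command lines and does not run them. The `--pex-repository` and `--no-transitive` flags are the ones quoted above and `-o` is Pex's output-file flag; the helper name and the output naming scheme are made up for illustration.

```python
import re
import shlex


def single_entry_pex_commands(repository_pex, requirements):
    """Build one `pex` argv per requirement, each resolving from repository_pex."""
    commands = []
    for req in requirements:
        # Hypothetical naming scheme: replace unsafe characters with "_".
        output = re.sub(r"[^A-Za-z0-9_.-]", "_", req) + ".pex"
        commands.append(
            ["pex", f"--pex-repository={repository_pex}", "--no-transitive", req, "-o", output]
        )
    return commands


for argv in single_entry_pex_commands("repository.pex", ["foo==1.0", "bar==2.0"]):
    print(shlex.join(argv))
```

The resulting single-entry PEXes would then be combined at runtime (the PEX_PATH approach mentioned later in the thread).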
but yea:
graph
being interpreter-specific makes sense at this juncture… because creating an interpreter/platform-oblivious graph is essentially what you have needed to patch
pip
to do, right?
Are you just manually extracting subsets of the zip by knowing about 
.deps/<whl name>/
 directory structure in the PEX file?
i thought about doing this, but i figured that I’d also need to manipulate the PEX-INFO.
e
Oh right, forgot about the PEX_PATH hack. Yeah - that's a good way to experiment on this.
w
@enough-analyst-54434: if a particular requirement has no distribution, does that also mean it will never have dependencies rendered via
graph
? for example, for the
typing
module
(i expect that that is the case, because afaik, the only way that dependencies could be introduced would be via the distribution)
e
but yea: 
graph
 being interpreter-specific makes sense at this juncture… because creating an interpreter/platform-oblivious graph is essentially what you have needed to patch 
pip
 to do, right?
No, these are at different levels. Here a PEX file contains a concrete set of dists. Those may satisfy one or more interpreters: if any of the flags used to build the PEX admitted multiple Pythons (
--python X --python Y
,
--interpreter-constraint ...
,
--platform Z --platform W
, or any combination of those), then it can contain more than one graph. So the graph tool could be expanded to walk the root requirements using all the embedded dists and ignoring tags and env markers. There is no Pip involved in this process today as-is, nor in this feature expansion. Fundamentally, all pex tools are runtime tools. They work with a concrete PEX file. The pip thing you're referring to is a buildtime feature that requires pip and does not download all the dists; it just downloads 1 dist for each dist family (i.e.: 1 pytest dist, 1 requests dist, etc.), but it ignores tags and env markers at resolve time, which is different and can include more things than are in a PEX. You don't care about those things though for your work. Your work operates on realized PEX files and only wants to deal with the real dists contained therein. Concretely, given
pex --interpreter-constraint ">=3.7,<3.9" foo
and foo's dependencies are: "bar==1; python_version == '3.7'", "bar==2; python_version == '3.8'", "baz; sys_platform == 'darwin'". The PEX is built on Linux, and that Linux machine has both a Python 3.7 and a Python 3.8 visible on the PATH. The PEX file will contain one foo dist and two bar dists. The graph tool is run against the PEX file, also on Linux. The current graph tool will output "foo X -> bar 1" when run with Python 3.7 and "foo X -> bar 2" when run with Python 3.8. The graph tool could be expanded to output ~ "foo X -> ["bar 1; python_version == '3.7'", "bar 2; python_version == '3.8'"]". The pip feature will also include baz and its transitive dependencies.
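The example above can be modelled in a few lines. This is a toy sketch, not Pex's API: env markers are reduced to a dict of required environment values, and all function names are hypothetical.

```python
# foo's dependencies from the example, as (dist, marker) pairs.
FOO_DEPS = [
    ("bar 1", {"python_version": "3.7"}),
    ("bar 2", {"python_version": "3.8"}),
    ("baz", {"sys_platform": "darwin"}),
]
# Dists actually embedded in the PEX: it was built on Linux, so there is no baz.
EMBEDDED_DISTS = {"foo X", "bar 1", "bar 2"}


def marker_matches(marker, env):
    """True when every key the marker constrains has the required value in env."""
    return all(env.get(key) == value for key, value in marker.items())


def current_graph_edges(env):
    """Roughly what the graph tool does today: evaluate markers for one interpreter."""
    return [d for d, m in FOO_DEPS if d in EMBEDDED_DISTS and marker_matches(m, env)]


def expanded_graph_edges():
    """The possible expansion: all embedded dists, with markers kept as labels."""
    return [(d, m) for d, m in FOO_DEPS if d in EMBEDDED_DISTS]


def ignore_markers_edges():
    """Closer to the pip feature: ignore markers and embedded dists entirely."""
    return [d for d, _ in FOO_DEPS]


linux_py37 = {"python_version": "3.7", "sys_platform": "linux"}
linux_py38 = {"python_version": "3.8", "sys_platform": "linux"}
print(current_graph_edges(linux_py37))  # ['bar 1']
print(current_graph_edges(linux_py38))  # ['bar 2']
print(expanded_graph_edges())
print(ignore_markers_edges())           # ['bar 1', 'bar 2', 'baz']
```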
w
got it: this all makes sense: thanks!
So the graph tool could be expanded to walk the root requirements using all the embedded dists and ignoring tags and env markers.
this seems like it would be equivalent to walking the graph of https://github.com/pantsbuild/pants/blob/3319779e702f5ed616ec7f54d73159932f18fc4f/src/python/pants/backend/python/util_rules/pex.py#L909-L917 …?
… except that as you said, all
pex-tools
deal with only one interpreter, and so that would mean that the
PexResolveInfo
will only end up rendering info about distributions relevant to that interpreter? (i’ll try this out to confirm)
e
Correct. It will use env markers so it will prune to those.
You'll be missing subgraphs for some cases.
In my example just change the baz env marker to select only python 3.8. Now run your rule stuff against python 3.7
You'll miss all baz transitive deps.
w
yep, got it.
e
You'll know baz exists, since you ignore env markers; you just miss its dep graph.
If you're just trying to prove this out though, this all works well enough to get numbers.
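The missing-subgraph case can be sketched the same way. Again this is a toy model rather than Pex or Pants code; `qux` is a hypothetical transitive dependency of baz, added only so the pruned subgraph is visible, and bar's version split is left out for brevity.

```python
# Names known when env markers are ignored (baz is listed even on 3.7).
KNOWN_REQUIREMENTS = {"foo", "bar", "baz"}

# dist -> list of (dependency, python_version required by its marker, or None)
DEPS = {
    "foo": [("bar", None), ("baz", "3.8")],  # baz marker changed to python 3.8 only
    "bar": [],
    "baz": [("qux", None)],                  # qux is hypothetical
    "qux": [],
}


def walk(root, python_version):
    """Collect dists reachable from root, pruning edges whose marker does not match."""
    seen = []
    stack = [root]
    while stack:
        dist = stack.pop()
        if dist in seen:
            continue
        seen.append(dist)
        for dep, required in DEPS[dist]:
            if required is None or required == python_version:
                stack.append(dep)
    return sorted(seen)


print(walk("foo", "3.7"))  # ['bar', 'foo']
print(walk("foo", "3.8"))  # ['bar', 'baz', 'foo', 'qux']
# Known by name but never walked under 3.7, so its deps are missed:
print(sorted(KNOWN_REQUIREMENTS - set(walk("foo", "3.7"))))  # ['baz']
```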
w
yep. and
local_only
is only for a single interpreter anyway, and that’s where we need this most.