< enough analyst 54434> can `pex tools extract` be convinced Pants #development

witty-crayon-22786

08/25/2021, 9:32 PM

@enough-analyst-54434: can `pex-tools … extract` be convinced to extract for multiple interpreters at once?

witty-crayon-22786

08/26/2021, 2:58 AM

ok, nevermind: had another realization, and won’t need to use `extract`

enough-analyst-54434

08/26/2021, 4:34 PM

Ok. The answer, though, was no with the current code. You'd have to add a flag to the tool and do some code.

witty-crayon-22786

08/26/2021, 5:20 PM

for posterity, my realization was that the `repository.pex` is not the issue (which should have been obvious as of Monday, but it took some time to sink in): rather, the issue is that we computed so many overlapping subsets of it.

witty-crayon-22786

08/26/2021, 5:21 PM

in TC, the `repository.pex` is ~50MB, but 94 overlapping subsets of that added up to 2.8GB =P
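A quick back-of-the-envelope check of those figures, as a sketch only. It assumes decimal units (2.8GB = 2800MB), and the variable names are illustrative:

```python
# Figures quoted in the message above (decimal units assumed).
repository_mb = 50        # approximate size of repository.pex
subset_count = 94         # number of overlapping subsets
subsets_total_mb = 2800   # 2.8GB total across all subsets

# Average size of one subset, and how much larger the subsets are in total
# than the single repository they were all carved out of.
average_subset_mb = subsets_total_mb / subset_count
blowup = subsets_total_mb / repository_mb

print(f"average subset: ~{average_subset_mb:.0f}MB")
print(f"subsets total {blowup:.0f}x the repository size")
```

So each subset averages more than half of the whole repository, which is why the overlap dominates the storage cost.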

witty-crayon-22786

08/26/2021, 5:22 PM

so will skip the extract-to-wheels step, and only decompose the `requirements.pex` subsets

witty-crayon-22786

08/26/2021, 5:27 PM

@enough-analyst-54434: is `pex-tools … graph` impacted by the interpreter used to extract? i.e., might different interpreters result in different graphs?

enough-analyst-54434

08/26/2021, 6:01 PM

Yes, ~all pex tools build a PEX and, instead of running it, ask it to resolve(). That means those tools all act on the interpreter you're running the tool with.

enough-analyst-54434

08/26/2021, 6:01 PM

https://github.com/pantsbuild/pex/blob/main/pex/tools/main.py#L90-L100

enough-analyst-54434

08/26/2021, 6:05 PM

I'm missing how any cache-friendly scheme can get around extracting wheels. Are you just manually extracting subsets of the zip by knowing about the `.deps/<whl name>/` directory structure in the PEX file?
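For illustration, a minimal sketch of what reading that layout could look like with the standard `zipfile` module. It relies only on the `.deps/<whl name>/` structure mentioned above. The helper name `list_embedded_wheels` and the stand-in archive are hypothetical, not part of Pex:

```python
import io
import zipfile


def list_embedded_wheels(pex_file):
    """Return the sorted wheel directory names found under .deps/ in a PEX zip."""
    names = set()
    with zipfile.ZipFile(pex_file) as zf:
        for entry in zf.namelist():
            parts = entry.split("/")
            # Expect entries shaped like .deps/<whl name>/<path inside the wheel>.
            if len(parts) >= 3 and parts[0] == ".deps" and parts[1]:
                names.add(parts[1])
    return sorted(names)


# Build a tiny stand-in archive so the sketch runs without a real PEX file.
fake_pex = io.BytesIO()
with zipfile.ZipFile(fake_pex, "w") as zf:
    zf.writestr("PEX-INFO", "{}")
    zf.writestr(".deps/foo-1.0-py3-none-any.whl/foo/__init__.py", "")
    zf.writestr(".deps/bar-2.0-py3-none-any.whl/bar/__init__.py", "")

fake_pex.seek(0)
wheels = list_embedded_wheels(fake_pex)
print(wheels)
```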

witty-crayon-22786

08/26/2021, 6:08 PM

`--pex-repository=.. --no-transitive $requirement` to build single-entry PEXes for all requirements in the closure of your root requirements is what i had been doing. have it mostly working, but am trying to see whether i need to special case `local_only`
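A rough sketch of that approach: it only builds the command lines, one per requirement, and does not run them. The `--pex-repository` and `--no-transitive` flags are the ones quoted above. The repository path, the output naming, and the helper `single_entry_pex_commands` are assumptions for illustration, and the caller is assumed to already have the full requirement closure:

```python
import re


def single_entry_pex_commands(repository_pex, requirements, out_dir="dist"):
    """Build one `pex` argv per requirement, each producing a single-entry PEX."""
    commands = []
    for req in requirements:
        # Use the project name (text before any version specifier) for the output file.
        name = re.split(r"[<>=!~;\[ ]", req, maxsplit=1)[0]
        commands.append(
            [
                "pex",
                f"--pex-repository={repository_pex}",
                "--no-transitive",
                req,
                "-o",
                f"{out_dir}/{name}.pex",
            ]
        )
    return commands


# Hypothetical closure of the root requirements.
closure = ["foo==1.0", "bar==2.0"]
commands = single_entry_pex_commands("repository.pex", closure)
for argv in commands:
    print(" ".join(argv))
```

Each argv could then be handed to `subprocess.run` or to a build system process rule.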

witty-crayon-22786

08/26/2021, 6:14 PM

but yea: `graph` being interpreter-specific makes sense at this juncture… because creating an interpreter/platform-oblivious graph is essentially what you have needed to patch `pip` to do, right?

witty-crayon-22786

08/26/2021, 6:16 PM

> Are you just manually extracting subsets of the zip by knowing about the `.deps/<whl name>/` directory structure in the PEX file?

i thought about doing this, but i figured that I’d also need to manipulate the PEX-INFO.
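To make the PEX-INFO concern concrete, here is a hedged sketch of the kind of rewrite that would be involved. It assumes PEX-INFO is a JSON document with a `distributions` mapping keyed by wheel name. That key, the fingerprint values, and the helper `prune_pex_info` are assumptions for illustration, and other fields would likely need matching updates:

```python
import json


def prune_pex_info(pex_info_json, keep_wheels):
    """Return PEX-INFO JSON keeping only the listed wheels under 'distributions'."""
    info = json.loads(pex_info_json)
    distributions = info.get("distributions", {})
    info["distributions"] = {
        whl: fingerprint
        for whl, fingerprint in distributions.items()
        if whl in keep_wheels
    }
    return json.dumps(info, sort_keys=True)


# Stand-in PEX-INFO content; the fingerprints are placeholders.
original = json.dumps(
    {
        "distributions": {
            "foo-1.0-py3-none-any.whl": "aaa",
            "bar-2.0-py3-none-any.whl": "bbb",
        }
    }
)
pruned = json.loads(prune_pex_info(original, {"foo-1.0-py3-none-any.whl"}))
print(pruned["distributions"])
```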

enough-analyst-54434

08/26/2021, 6:46 PM

Oh right, forgot about the PEX_PATH hack. Yeah - that's a good way to experiment on this.

👍 1

witty-crayon-22786

08/26/2021, 6:57 PM

@enough-analyst-54434: if a particular requirement has no distribution, does that also mean it will never have dependencies rendered via `graph`? for example, for the `typing` module

witty-crayon-22786

08/26/2021, 7:03 PM

(i expect that that is the case, because afaik, the only way that dependencies could be introduced would be via the distribution)

enough-analyst-54434

08/26/2021, 7:10 PM

> but yea: `graph` being interpreter-specific makes sense at this juncture… because creating an interpreter/platform-oblivious graph is essentially what you have needed to patch `pip` to do, right?

No, these are at different levels. Here a PEX file contains a concrete set of dists. Those may satisfy one or more interpreters: if any of the flags used to build the PEX admitted multiple Pythons (`--python X --python Y`, `--interpreter-constraint ...`, `--platform Z --platform W`, or any combination of those), then it can contain more than one graph. So the graph tool could be expanded to walk the root requirements using all the embedded dists and ignoring tags and env markers. There is no Pip involved in this process today as-is, nor in this feature expansion.

Fundamentally, all pex tools are runtime tools. They work with a concrete PEX file. The pip thing you're referring to is a buildtime feature that requires pip and does not download all the dists; it just downloads 1 dist for each dist family (e.g., 1 pytest dist, 1 requests dist, etc.), but it ignores tags and env markers at resolve time, which is different and can include more things than are in a PEX. You don't care about those things though for your work. Your work works on realized PEX files and only wants to deal with the real dists contained therein.

Concretely, given `pex --interpreter-constraint ">=3.7,<3.9" foo`, where foo's dependencies are "bar==1; python_version == '3.7'", "bar==2; python_version == '3.8'" and "baz; sys_platform == 'darwin'": the PEX is built on Linux, and that Linux machine has both a Python 3.7 and a Python 3.8 on the PATH visible to it. The PEX file will contain one foo dist and two bar dists. The graph tool is run against the PEX file, also on Linux. The current graph tool will output "foo X -> bar 1" when run with Python 3.7 and "foo X -> bar 2" when run with Python 3.8. The graph tool could be expanded to output ~ "foo X -> ["bar 1; python_version == '3.7'", "bar 2; python_version == '3.8'"]". The pip feature will also include baz and its transitive dependencies.
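The foo/bar/baz example can be modeled in a few lines. This is a toy sketch, not Pex code: markers are reduced to a single key/value equality check, and the function names are made up for illustration:

```python
# Dependencies of foo from the example: (requirement, marker key, marker value).
FOO_DEPS = [
    ("bar==1", "python_version", "3.7"),
    ("bar==2", "python_version", "3.8"),
    ("baz", "sys_platform", "darwin"),
]

# Dists actually embedded in the PEX built on Linux: no baz.
EMBEDDED = {"bar==1", "bar==2"}


def current_graph(env):
    """Edges from foo as seen by one interpreter: markers are evaluated."""
    return [
        req
        for req, key, value in FOO_DEPS
        if env.get(key) == value and req in EMBEDDED
    ]


def expanded_graph():
    """Edges from foo over all embedded dists, keeping markers as labels."""
    return [
        f"{req}; {key} == '{value}'"
        for req, key, value in FOO_DEPS
        if req in EMBEDDED
    ]


def marker_oblivious_resolve():
    """What a resolve that ignores markers would pull in: baz is included."""
    return [req for req, _, _ in FOO_DEPS]


py37 = {"python_version": "3.7", "sys_platform": "linux"}
py38 = {"python_version": "3.8", "sys_platform": "linux"}

print(current_graph(py37))
print(current_graph(py38))
print(expanded_graph())
print(marker_oblivious_resolve())
```

The first two calls differ by interpreter, the third is the same no matter which interpreter runs it, and only the last includes baz.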

witty-crayon-22786

08/26/2021, 7:12 PM

got it: this all makes sense: thanks!

witty-crayon-22786

08/26/2021, 7:13 PM

> So the graph tool could be expanded to walk the root requirements using all the embedded dists and ignoring tags and env markers.

this seems like it would be equivalent to walking the graph of https://github.com/pantsbuild/pants/blob/3319779e702f5ed616ec7f54d73159932f18fc4f/src/python/pants/backend/python/util_rules/pex.py#L909-L917 …?

witty-crayon-22786

08/26/2021, 7:15 PM

… except that as you said, all `pex-tools` deal with only one interpreter, and so that would mean that the `PexResolveInfo` will only end up rendering info about distributions relevant to that interpreter? (i’ll try this out to confirm)

enough-analyst-54434

08/26/2021, 7:16 PM

Correct. It will use env markers so it will prune to those.

enough-analyst-54434

08/26/2021, 7:17 PM

You'll be missing subgraphs for some cases.

enough-analyst-54434

08/26/2021, 7:18 PM

In my example just change the baz env marker to select only python 3.8. Now run your rule stuff against python 3.7

enough-analyst-54434

08/26/2021, 7:18 PM

You'll miss all baz transitive deps.

witty-crayon-22786

08/26/2021, 7:18 PM

yep, got it.

enough-analyst-54434

08/26/2021, 7:19 PM

You'll know baz exists, since you ignore env markers, you just miss its dep graph.
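A toy model of that missing subgraph, under stated assumptions: baz's marker selects only Python 3.8, baz depends on a hypothetical dist `qux`, and the tool run under Python 3.7 can only read metadata for the dists it resolved. None of the names below are Pex APIs:

```python
# dist -> list of (dependency name, python_version the marker requires, or None).
REQUIRES = {
    "foo": [("bar", None), ("baz", "3.8")],
    "bar": [],
    "baz": [("qux", None)],  # qux is hypothetical
    "qux": [],
}


def resolved_dists(python_version):
    """Dists visible to a tool run under one interpreter (markers applied)."""
    seen, stack = set(), ["foo"]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for dep, only_on in REQUIRES[name]:
            if only_on is None or only_on == python_version:
                stack.append(dep)
    return seen


def known_names(python_version):
    """Names found by reading requirements of visible dists, ignoring markers."""
    visible = resolved_dists(python_version)
    names = set(visible)
    for dist in visible:
        names.update(dep for dep, _ in REQUIRES[dist])
    return names


visible_37 = resolved_dists("3.7")
known_37 = known_names("3.7")
missed_37 = resolved_dists("3.8") - known_37

print(sorted(visible_37))
print(sorted(known_37))
print(sorted(missed_37))
```

Under 3.7, baz is known by name because foo's requirements are read without evaluating markers, but everything reachable only through baz goes unseen.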

enough-analyst-54434

08/26/2021, 7:19 PM

If you're just trying to prove this out though, this all works well enough to get numbers.

witty-crayon-22786

08/26/2021, 7:20 PM

yep. and `local_only` is only for a single interpreter anyway, and that’s where we need this most.
