# pex
f
I think i'm running into more issues with pex http cache behavior. It seems like the first time I work on a target after a day or two away, some kind of cache miss is triggered, and I'm back to 5-10 min of downloading again (which I can see via the network indicators). I have
[python-setup].resolver_http_cache_ttl = 0
set. This is hard to reproduce because it only happens once or so in a given day, usually when i'm not expecting it. From previous logs, I suspect this is the remote TTL from the server. I'm not sure whether this is a vendored-pip thing, or a pex thing, but I really really don't think the HTTP cache TTL should apply to immutable resources like this. This is important to me because I'm often working with a slowish connection and dealing with
opencv
and
torch
which are just enormous packages that i really want to avoid downloading as much as possible.
j
As a workaround, you could run a local pypi cache using
devpi
, https://www.devpi.net/. I run a local server with an index that passes through to our pypi server. This caches all the packages locally and only downloads to my mac when a new version is added to the server.
I run it in a venv and configured it to keep the cached files in a persistent location.
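Roughly what that looks like for me (paths, port, and the exact init step are just from my setup, so double-check against the devpi docs):
Copy code
# dedicated venv for devpi
python -m venv ~/.venvs/devpi
~/.venvs/devpi/bin/pip install devpi-server devpi-client

# run the server with a persistent data directory so the cache survives restarts
# (there's a one-time init step first: `devpi-init` on newer versions,
#  `devpi-server --init` on older ones -- see the Quickstart for your version)
~/.venvs/devpi/bin/devpi-server --serverdir ~/devpi-data --port 3141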
👀 1
f
i've tried this but it doesn't seem to work for me
i mean...i've tried the docker image version of this, and it doesn't seem to actually cache the files
j
Later today, I'll get you more details on my setup. I basically followed the Quickstart tutorial. In addition to the default root index, I set up a "dev" index with our pypi server as the base. I then pointed my pip at my local dev index. The whole server was pointed to a local directory on my laptop's hard drive.
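The index/pip side was roughly this (index name and URLs are placeholders; in my case the base was our company pypi index rather than root/pypi):
Copy code
# create a "dev" index that falls back to a base index
devpi use http://localhost:3141
devpi login root --password=''      # a fresh install starts with an empty root password
devpi index -c dev bases=root/pypi  # i used our internal index as the base here

# then point pip at it, e.g. in ~/.pip/pip.conf:
# [global]
# index-url = http://localhost:3141/root/dev/+simple/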
f
Cool, this seems like a good workaround!
I honestly think pex may have inherited this behavior from pip, so asking to "fix" it is more of a feature request
I think I just hit it much more often with pants/pex because i don't have a stable venv to essentially cache all my dependencies
w
is this in a CI environment?
the PEX and pip caches go into
Copy code
~/.cache/pants/named_caches
… so would see if maybe they're being cleared?
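(a quick sanity check that there's actually content accumulating in there:)
Copy code
du -sh ~/.cache/pants/named_caches/*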
f
not in a CI environment, on my workstation
caches are not being cleared
w
hm
the first level of caching here is that Pants should not be invoking pex at all if your requirements haven't changed
are you seeing what looks like appropriate behavior at that level?
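a crude way to check is to run the same goal twice back to back -- the second run shouldn't invoke pex at all (the target here is just a placeholder):
Copy code
time ./pants test path/to/your:target   # cold
time ./pants test path/to/your:target   # immediately again; the resolve should not re-run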
f
i think so...it's hard to say because my requirements change fairly frequently
w
ok, got it.
f
i suspect pants is getting it right based on what i've seen, except that sometimes it seems to resolve
constraints.txt
differently for different goals, if that makes sense?
w
@flat-zoo-31952: it's not tremendously practical, but you can set
--pex-verbosity
to enable logging in general
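e.g. something like this (the target is a placeholder, and iirc the persistent spelling of the same option lives under the [pex] scope in pants.toml):
Copy code
./pants --pex-verbosity=3 test path/to/your:target
# or set it persistently in pants.toml:
# [pex]
# verbosity = 3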
f
yeah i can do that until i learn more
w
@flat-zoo-31952: re: different goals: it needs to do an additional resolve to get the "tool" for the goal, but that shouldn't invalidate the underlying resolve
f
i can look into this, i can also look back through some of the logs i gathered when digging through this before
resolver_http_cache_ttl
was added
well i generated a 300 MB log file from a single attempt to resolve constraints.txt (and i have no idea why that was even triggered...my constraints.txt hasn't changed in a week or more)
w
the presence of the option changes the cache key unfortunately.
so it will re-run pex.
f
it appears that verbosity 5 was too much though 😄
w
but assuming that
~/.cache/pants/named_caches
contains relevant content, it should re-run quickly
if it doesn't re-run quickly, then that might demonstrate a/the bug
f
yeah it took ~5 min
but a 300 MB log is too much to look through
w
yea.
f
trying verbosity 3
since changing that invalidates the cache, it makes it easier to reproduce this on command
w
i'm wrapping up https://github.com/pantsbuild/pants/issues/10847 and can then take a look at this.
👍 1
@flat-zoo-31952: i think that the thing that would be most helpful in here would be a smoking gun "this log line doesn't make sense: why are we re-downloading
$libx
here when we did
$y
minutes ago"
that or something that repros the issue… i can spend some time today hunting for one.
f
yeah i'm looking for that, but i'm not seeing that in these logs...it seems to be using the cache...but still taking 4 min
w
there are at least two caches within pex: one is for downloaded artifacts (source distributions), and another is for built wheels
f
i understand, no pressure on you guys....this is hard to reproduce and analyze
i encounter this as mostly an annoying wtf moment in the mornings when i don't really have the time to sit down and look at it
w
totally.
f
however it's still better than when i cache miss on the
RUN pip install -r requirements.txt
in dockerfiles, so i'll try to keep that in perspective 😁
w
😅
f
This took about 4 min...seems like it is using the cache for most items, but i guess i don't really understand why it takes so long
w
@flat-zoo-31952: i don't really have a great understanding of pex's logging, but i'll spend some time today to get that, and then see what sense i can make of that log.
but yea, superficially, the amount of time i'm seeing with an empty Pants process cache and a full Pex cache is surprising. so i'll investigate that first.
@flat-zoo-31952: i've filed https://github.com/pantsbuild/pex/issues/1094 about this… there is a bunch of unexpected time there. will dig a bit deeper, since if the smaller example is slow, then it doesn't surprise me that the larger one is
🙏🏻 1
we spend a lot of time in Pants trying to avoid invoking tools by caching them in the process cache, but i think that the folks we've been working with so far must have more stable requirements than you do
f
That makes sense...and the instability of my requirements is likely to persist indefinitely. This is why I think having some kind of intermediate form of resolution (like lockfiles) makes sense. But I'm probably going to need to do a better job of elaborating my use cases
j
Just rechecked my configs. I had set up my
dev
index with an import of all the wheels and other packages from our company pypi. The root one had
Copy code
"mirror_url": "<https://pypi.org/simple/>",
        "mirror_web_url_fmt": "<https://pypi.org/project/{name}/>",
set up. It was the one that caches everything locally.
šŸ‘šŸ» 1
w
i spent some more time looking at this, and it really seems like there is inherent overhead in the resolve re-running, even with 100% http cache hits for wheels.
at least some of that time is network time to re-list pypi: a native lockfile implementation might allow the listing to be skipped, as long as the lockfile could be used to bypass the resolve entirely (which should be possible if the inputs had not changed).
but if any requirement had actually changed (as seems to be the case in this thread), then the lockfile could not be used, and resolve performance would come back into play
j
./pants devpi start
?
w
@jolly-midnight-72759: my hope is that we can fix any glaring performance issues here at a fundamental level such that people mostly don't need to run proxies
👍🏽 1
@flat-zoo-31952: one thing i note from your log is that the
Cache-Control: max-age
is set to 0… which actually disables using that cache. is that intentional? the default in
2.0.0
is one hour
j
I thought you had identified a fundamental step that could not be avoided or worked on.
w
haven't given up yet, heh
👖 1
f
Oh... I thought putting 0 meant to never expire the cache
What's the value for "please just use the cache as much as you can"?
w
A "very large number", afaik. Would check back in on the TTL github issue, but if you look at pip's behavior when max_age==0, it just disables the cache.
Away from my computer, but can point to the line in pip when I get back.
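so presumably something like this in pants.toml is the "use the cache as much as you can" setting (the value is just an arbitrarily large number, and i'm assuming the option is in seconds -- haven't verified whether there's an upper cutoff):
Copy code
[python-setup]
resolver_http_cache_ttl = 31536000  # roughly a year, as an arbitrarily large TTL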
f
no need, i'll take your word for it
w
@flat-zoo-31952: one other question on this: are you seeing re-resolves on different binaries while running
package
or
run
? or is this during
test
?
f
Both, I think...
w
ok, thanks
i filed https://github.com/pantsbuild/pants/issues/11105 , which would significantly reduce the number of resolves that we do without sacrificing accuracy
(…and submitted it to the prioritization survey!)
👍🏻 1