https://pantsbuild.org/ logo
#pex
Title
f

flat-zoo-31952

11/02/2020, 1:48 PM
I think i'm running into more issues with pex http cache behavior. It seems the first time I work on a target in a day or two, some kind of cache miss is triggered, and I'm back to 5-10 min of downloading again (which I can see via the network indicators). I have
[python-setup].resolver_http_cache_ttl = 0
set. This is hard to reproduce because it only happens once or so in a given day, usually when i'm not expecting it. From previous logs, I suspect this is the remote TTL from the server. I'm not sure whether this is a vendored-pip thing, or a pex thing, but I really really don't think the HTTP cache TTL should apply to immutable resources like this. This is important to me because I'm often working with a slowish connection and dealing with
opencv
and
torch
which are just enormous packages that i really want to avoid downloading as much as possible.
j

jolly-midnight-72759

11/02/2020, 2:05 PM
As a work around, you could run a local pypi cache using
devpi
, https://www.devpi.net/. I run a local server with an index that passes through to our pypi server. This caches locally all the packages and only downloads to my mac when a new version is added to the server.
I run it in a venv and configured it to keep the cached files in a persistent location.
👀 1
f

flat-zoo-31952

11/02/2020, 2:14 PM
i've tried this but it doesn't seem to work for me
i mean...i've tried the docker image version of this, it doesn't seem to actually cache the files
j

jolly-midnight-72759

11/02/2020, 2:28 PM
Later today, I'll get you more details on my setup. I basically followed the Quickstart tutorial. In addition to the default root index, I set up a "dev" index with our pypi server as the base. I then pointed my pip to the my local dev index. The whole server was pointed to a local directory on my laptop's hard drive.
f

flat-zoo-31952

11/02/2020, 2:37 PM
Cool, this seems like a good workaround!
I honestly think pex may have inherited this behavior from pip, so asking to "fix" it is more of a feature request
I think I just hit it much more often with pants/pex because i don't have a stable venv to essentially cache all my dependencies
w

witty-crayon-22786

11/02/2020, 5:51 PM
is this in a CI environment?
the PEX and pip caches go into
Copy code
~/.cache/pants/named_caches
… so would see if maybe they’re being cleared?
f

flat-zoo-31952

11/02/2020, 6:06 PM
not in a CI environment, on my workstation
caches are not being cleared
w

witty-crayon-22786

11/02/2020, 6:08 PM
hm
the first level of caching here is that Pants should not be invoking pex at all if your requirements haven’t changed
are you seeing what looks like appropriate behavior at that level?
f

flat-zoo-31952

11/02/2020, 6:15 PM
i think so...it's hard to say because my requirements change fairly frequently
w

witty-crayon-22786

11/02/2020, 6:15 PM
ok, got it.
f

flat-zoo-31952

11/02/2020, 6:16 PM
i suspect pants is getting it right based on what i've seen, except that sometimes it seems to resolve
constraints.txt
differently for different goals, if that makes sense?
w

witty-crayon-22786

11/02/2020, 6:17 PM
@flat-zoo-31952: it’s not tremendously practical, but you can set
--pex-verbosity
to enable logging in general
f

flat-zoo-31952

11/02/2020, 6:17 PM
yeah i can do that until i learn more
w

witty-crayon-22786

11/02/2020, 6:18 PM
@flat-zoo-31952: re: different goals: it needs to do an additional resolve to get the “tool” for the goal, but that shouldn’t invalidate the underlying resolve
f

flat-zoo-31952

11/02/2020, 6:20 PM
i can look into this, i can also look back through some of the logs i gathered when digging through this before
resolver_http_cache_ttl
was added
well i generated a 300 MB log file with a single attempt to resolve constraints.txt (which i have no idea why this was even triggered...my constraints.txt hasn't changed in a week or more
w

witty-crayon-22786

11/02/2020, 7:38 PM
the presence of the option changes the cache key unfortunately.
so it will re-run pex.
f

flat-zoo-31952

11/02/2020, 7:39 PM
it appears that verbosity 5 was too much though 😄
w

witty-crayon-22786

11/02/2020, 7:39 PM
but assuming that
~/.cache/pants/named_caches
contains relevant content, it should re-run quickly
if it doesn’t re-run quickly, then that might demonstrate a/the bug
f

flat-zoo-31952

11/02/2020, 7:40 PM
yeah it took ~5 min
but a 300 MB log is too much to look through
w

witty-crayon-22786

11/02/2020, 7:41 PM
yea.
f

flat-zoo-31952

11/02/2020, 7:42 PM
trying verbosity 3
since changing that invalidates the cache, it makes it easier to reproduce this on command
w

witty-crayon-22786

11/02/2020, 7:42 PM
i’m wrapping up https://github.com/pantsbuild/pants/issues/10847 and can then take a look at this.
👍 1
@flat-zoo-31952: i think that the thing that would be most helpful in here would be a smoking gun “this log line doesn’t make sense: why are we re-downloading
$libx
here when we did
$y
minutes ago”
that or something that repros the issue… i can spend some time today hunting for one.
f

flat-zoo-31952

11/02/2020, 8:04 PM
yeah i'm looking for that, but i'm not seeing that in these logs...it seems to be using the cache...but still taking 4 min
w

witty-crayon-22786

11/02/2020, 8:05 PM
there are at least two caches within pex: one is for downloaded artifacts (source distributions); and another is for built wheels
f

flat-zoo-31952

11/02/2020, 8:05 PM
i understand, no pressure on you guys....this is hard to reproduce and analyze
i encounter this as mostly an annoying wtf moment in the mornings when i don't really have the time sit down and look at it
w

witty-crayon-22786

11/02/2020, 8:07 PM
totally.
f

flat-zoo-31952

11/02/2020, 8:09 PM
however it's still better then when i cache miss on the
RUN pip install -r requirements.txt
in dockerfiles, so i'll try to keep that in perspective 😁
w

witty-crayon-22786

11/02/2020, 8:09 PM
😅
f

flat-zoo-31952

11/02/2020, 8:16 PM
This took about 4 min...seems like it is using the cache for most items, but i guess i don't really understand why it takes so long
w

witty-crayon-22786

11/02/2020, 8:30 PM
@flat-zoo-31952: i don’t really have a great understanding of pex’s logging, but i’ll spend some time today to get that, and then see what sense i can make of that log.
but yea, superficially, the amount of time i’m seeing with an empty Pants process cache and a full Pex cache is surprising. so i’ll investigate that first.
@flat-zoo-31952: i’ve filed https://github.com/pantsbuild/pex/issues/1094 about this… there is a bunch of unexpected time there. will dig a bit deeper, since if the smaller example is slow, then it doesn’t surprise me that the larger one is
🙏🏻 1
we spend a lot of time in Pants to try to avoid invoking tools by caching them in the process cache, but i think that the folks we’ve been working so far must have more stable requirements than you do
f

flat-zoo-31952

11/03/2020, 12:38 AM
That makes sense...and the instability of my requirements is likely to remain perpetually. This is why I think having some kind of intermediate form of resolution (like lockfiles) makes sense. But I'm probably going to need to do a better job of elaborating my use cases
j

jolly-midnight-72759

11/03/2020, 2:12 AM
Just rechecked my configs. I had set up my
dev
index with an import of all the wheels and other packages from our company pypi. The root one had
Copy code
"mirror_url": "<https://pypi.org/simple/>",
        "mirror_web_url_fmt": "<https://pypi.org/project/{name}/>",
set up. It was the one that caches everything locally.
👍🏻 1
w

witty-crayon-22786

11/04/2020, 1:18 AM
i spent some more time looking at this, and it really seems like there is inherent overhead in the resolve re-running, even with 100% http cache hits for wheels.
at least some of that time is network time to re-list pypi: a native lockfile implementation might allow the listing to be skipped as long as the lockfile could be used to bypass the resolve entirely (which it should be able to if the inputs had not changed).
but if any requirement had actually changed (as seems to be the case in this thread), then the lockfile could not be used, and resolve performance would come back into play
j

jolly-midnight-72759

11/04/2020, 1:22 AM
./pants devpi start
?
w

witty-crayon-22786

11/04/2020, 1:28 AM
@jolly-midnight-72759: my hope is that we can fix any glaring performance issues here at a fundamental level such that people mostly don’t need to run proxies
👍🏽 1
@flat-zoo-31952: one thing i note from your log is that the
Cache-Control: max-age
is set to 0… which actually disables using that cache. is that intentional? the default in
2.0.0
is one hour
j

jolly-midnight-72759

11/04/2020, 1:30 AM
I thought you had identified a fundamental step that could not be avoided or worked on.
w

witty-crayon-22786

11/04/2020, 1:30 AM
haven’t given up yet, heh
👖 1
f

flat-zoo-31952

11/04/2020, 1:49 AM
Oh... I thought putting 0 meant to never expire the cache
What's the value for "please just use cache as much as you can"
w

witty-crayon-22786

11/04/2020, 2:09 AM
A "very large number", afaik. Would check back in on the TTL github issue, but if you look at pip's behavior when max_age==0, it just disables the cache.
Away from my computer, but can point to the line in pip when I get back.
f

flat-zoo-31952

11/04/2020, 2:14 AM
no need, i'll take your word for it
@flat-zoo-31952: one other question on this: are you seeing re-resolves on different binaries while running
package
or
run
? or is this during
test
?
f

flat-zoo-31952

11/06/2020, 12:27 AM
Both, I think...
w

witty-crayon-22786

11/06/2020, 1:43 AM
ok, thanks
i filed https://github.com/pantsbuild/pants/issues/11105 , which would significantly reduce the number of resolves that we do without sacrificing accuracy
(…and submitted it to the prioritization survey!)
👍🏻 1