# pex
f
I think i'm running into more issues with pex http cache behavior. It seems like the first time I work on a target after a day or two away, some kind of cache miss is triggered, and I'm back to 5-10 min of downloading again (which I can see via the network indicators). I have
[python-setup].resolver_http_cache_ttl = 0
set. This is hard to reproduce because it only happens once or so in a given day, usually when i'm not expecting it. From previous logs, I suspect this is the remote TTL from the server. I'm not sure whether this is a vendored-pip thing, or a pex thing, but I really really don't think the HTTP cache TTL should apply to immutable resources like this. This is important to me because I'm often working with a slowish connection and dealing with
opencv
and
torch
which are just enormous packages that i really want to avoid downloading as much as possible.
j
As a workaround, you could run a local pypi cache using
devpi
, https://www.devpi.net/. I run a local server with an index that passes through to our pypi server. This caches all the packages locally and only downloads to my mac when a new version is added to the server.
I run it in a venv and configured it to keep the cached files in a persistent location.
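Roughly what that looks like for me (paths, port, and the exact init step are just from my setup, so double-check against the devpi docs):
Copy code
# dedicated venv for devpi
python -m venv ~/.venvs/devpi
~/.venvs/devpi/bin/pip install devpi-server devpi-client

# run the server with a persistent data directory so the cache survives restarts
# (there's a one-time init step first: `devpi-init` on newer versions,
#  `devpi-server --init` on older ones -- see the Quickstart for your version)
~/.venvs/devpi/bin/devpi-server --serverdir ~/devpi-data --port 3141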
👀 1
f
i've tried this but it doesn't seem to work for me
i mean...i've tried the docker image version of this, and it doesn't seem to actually cache the files
j
Later today, I'll get you more details on my setup. I basically followed the Quickstart tutorial. In addition to the default root index, I set up a "dev" index with our pypi server as the base. I then pointed my pip at my local dev index. The whole server was pointed to a local directory on my laptop's hard drive.
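The index/pip side was roughly this (index name and URLs are placeholders; in my case the base was our company pypi index rather than root/pypi):
Copy code
# create a "dev" index that falls back to a base index
devpi use http://localhost:3141
devpi login root --password=''      # a fresh install starts with an empty root password
devpi index -c dev bases=root/pypi  # i used our internal index as the base here

# then point pip at it, e.g. in ~/.pip/pip.conf:
# [global]
# index-url = http://localhost:3141/root/dev/+simple/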
f
Cool, this seems like a good workaround!
I honestly think pex may have inherited this behavior from pip, so asking to "fix" it is more of a feature request
I think I just hit it much more often with pants/pex because i don't have a stable venv to essentially cache all my dependencies
w
is this in a CI environment?
the PEX and pip caches go into
Copy code
~/.cache/pants/named_caches
… so would see if maybe they're being cleared?
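(a quick sanity check that there's actually content accumulating in there:)
Copy code
du -sh ~/.cache/pants/named_caches/*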
f
not in a CI environment, on my workstation
caches are not being cleared
w
hm
the first level of caching here is that Pants should not be invoking pex at all if your requirements haven't changed
are you seeing what looks like appropriate behavior at that level?
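a crude way to check is to run the same goal twice back to back -- the second run shouldn't invoke pex at all (the target here is just a placeholder):
Copy code
time ./pants test path/to/your:target   # cold
time ./pants test path/to/your:target   # immediately again; the resolve should not re-run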
f
i think so...it's hard to say because my requirements change fairly frequently
w
ok, got it.
f
i suspect pants is getting it right based on what i've seen, except that sometimes it seems to resolve
constraints.txt
differently for different goals, if that makes sense?
w
@flat-zoo-31952: it's not tremendously practical, but you can set
--pex-verbosity
to enable logging in general
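e.g. something like this (the target is a placeholder, and iirc the persistent spelling of the same option lives under the [pex] scope in pants.toml):
Copy code
./pants --pex-verbosity=3 test path/to/your:target
# or set it persistently in pants.toml:
# [pex]
# verbosity = 3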
f
yeah i can do that until i learn more
w
@flat-zoo-31952: re: different goals: it needs to do an additional resolve to get the "tool" for the goal, but that shouldn't invalidate the underlying resolve
f
i can look into this, i can also look back through some of the logs i gathered when digging through this before
resolver_http_cache_ttl
was added
well i generated a 300 MB log file from a single attempt to resolve constraints.txt (and i have no idea why that was even triggered...my constraints.txt hasn't changed in a week or more)
w
the presence of the option changes the cache key unfortunately.
so it will re-run pex.
f
it appears that verbosity 5 was too much though 😄
w
but assuming that
~/.cache/pants/named_caches
contains relevant content, it should re-run quickly
if it doesn't re-run quickly, then that might demonstrate a/the bug
f
yeah it took ~5 min
but a 300 MB log is too much to look through
w
yea.
f
trying verbosity 3
since changing that invalidates the cache, it makes it easier to reproduce this on command
w
i'm wrapping up https://github.com/pantsbuild/pants/issues/10847 and can then take a look at this.
👍 1
@flat-zoo-31952: i think that the thing that would be most helpful in here would be a smoking gun "this log line doesn't make sense: why are we re-downloading
$libx
here when we did
$y
minutes ago"
that or something that repros the issue… i can spend some time today hunting for one.
f
yeah i'm looking for that, but i'm not seeing that in these logs...it seems to be using the cache...but still taking 4 min
w
there are at least two caches within pex: one is for downloaded artifacts (source distributions), and another is for built wheels
f
i understand, no pressure on you guys....this is hard to reproduce and analyze
i encounter this as mostly an annoying wtf moment in the mornings when i don't really have the time to sit down and look at it
w
totally.
f
however it's still better than when i cache miss on the
RUN pip install -r requirements.txt
in dockerfiles, so i'll try to keep that in perspective 😁
w
😅
f
This took about 4 min...seems like it is using the cache for most items, but i guess i don't really understand why it takes so long
w
@flat-zoo-31952: i don't really have a great understanding of pex's logging, but i'll spend some time today to get that, and then see what sense i can make of that log.
but yea, superficially, the amount of time i'm seeing with an empty Pants process cache and a full Pex cache is surprising. so i'll investigate that first.
@flat-zoo-31952: i've filed https://github.com/pantsbuild/pex/issues/1094 about this… there is a bunch of unexpected time there. will dig a bit deeper, since if the smaller example is slow, then it doesn't surprise me that the larger one is
🙏🏻 1
we spend a lot of time in Pants trying to avoid invoking tools by caching them in the process cache, but i think that the folks we've been working with so far must have more stable requirements than you do
f
That makes sense...and the instability of my requirements is likely to persist indefinitely. This is why I think having some kind of intermediate form of resolution (like lockfiles) makes sense. But I'm probably going to need to do a better job of elaborating my use cases
j
Just rechecked my configs. I had set up my
dev
index with an import of all the wheels and other packages from our company pypi. The root one had
Copy code
"mirror_url": "<https://pypi.org/simple/>",
        "mirror_web_url_fmt": "<https://pypi.org/project/{name}/>",
set up. It was the one that caches everything locally.
šŸ‘šŸ» 1
w
i spent some more time looking at this, and it really seems like there is inherent overhead in the resolve re-running, even with 100% http cache hits for wheels.
at least some of that time is network time to re-list pypi: a native lockfile implementation might allow the listing to be skipped, as long as the lockfile could be used to bypass the resolve entirely (which should be possible if the inputs had not changed).
but if any requirement had actually changed (as seems to be the case in this thread), then the lockfile could not be used, and resolve performance would come back into play
j
./pants devpi start
?
w
@jolly-midnight-72759: my hope is that we can fix any glaring performance issues here at a fundamental level such that people mostly don't need to run proxies
👍🏽 1
@flat-zoo-31952: one thing i note from your log is that the
Cache-Control: max-age
is set to 0… which actually disables using that cache. is that intentional? the default in
2.0.0
is one hour
j
I thought you had identified a fundamental step that could not be avoided or worked on.
w
haven't given up yet, heh
👖 1
f
Oh... I thought putting 0 meant to never expire the cache
What's the value for "please just use the cache as much as you can"?
w
A "very large number", afaik. Would check back in on the TTL github issue, but if you look at pip's behavior when max_age==0, it just disables the cache.
Away from my computer, but can point to the line in pip when I get back.
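so presumably something like this in pants.toml is the "use the cache as much as you can" setting (the value is just an arbitrarily large number, and i'm assuming the option is in seconds -- haven't verified whether there's an upper cutoff):
Copy code
[python-setup]
resolver_http_cache_ttl = 31536000  # roughly a year, as an arbitrarily large TTL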
f
no need, i'll take your word for it
w
@flat-zoo-31952: one other question on this: are you seeing re-resolves on different binaries while running
package
or
run
? or is this during
test
?
f
Both, I think...
w
ok, thanks
i filed https://github.com/pantsbuild/pants/issues/11105 , which would significantly reduce the number of resolves that we do without sacrificing accuracy
(…and submitted it to the prioritization survey!)
👍🏻 1