# general
b
Any tips on debugging an IO error when trying to wrap up
requirements.txt
for a large set of deps? Currently seeing
"Error storing Digest { hash: Fingerprint<012ed7213b69805f0e19542c9ec2fc98640d356b25739ec6c687bc20abcf9ab6>, size_bytes: 2392030244 }: Input/output error"
w
the store is
~/.cache/pants/lmdb_store
by default: is there anything interesting about how that’s mounted in this case?
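A quick way to answer the mount question, for anyone following along (paths are the defaults from this thread; `findmnt` and `df` are just common ways to check):
```
# Show which filesystem backs the default store location.
findmnt --target ~/.cache/pants/lmdb_store
# And how much space is free on it.
df -h ~/.cache/pants/lmdb_store
```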
b
If there is, it wasn't anything I've configured 😅
w
is docker involved?
b
I suspect this might be due to size? If I read correctly it'd be 2.4 Gigs
Nope, no docker
w
the max file size should be 16 GB by default
and your home directory has plenty of space available?
df -h
doesn’t show anything suspicious?
also, how much space is the store currently using?
(there are some settings that we can adjust, but want to rule out the obvious things first)
🙌 1
b
```
joshuacannon@CEPHANDRIUS:~/work/PyWeekly$ df -h ~/
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p2  1.9T  231G  1.6T  13% /
```
```
joshuacannon@CEPHANDRIUS:~/work/PyWeekly$ du -sh ~/.cache/pants
54G	/home/joshuacannon/.cache/pants
```
I didn't say specifically, but this seemingly is raised from
Resolving requirements.txt
w
hm… which version of 2.7.x are you using?
b
pants_version = "2.7.0"
w
@bitter-ability-32190: mind trying
2.7.1rc1
? it moves to decomposed PEXes
b
Sure thing!
...for my own curiosity, what's a decomposed PEX? 🙂
w
are you familiar with the distinction between eggs and wheels?
b
well enough? 😅
w
a decomposed PEX is essentially a series of zipped installed wheels, rather than a single zipped set of installed wheels
b
OH hell yeah. That answers a future question I had, if I change
requirements.txt
do I build a full PEX again. Sounds like no 💪
Very excited for that feature. Verryyyy
w
OH hell yeah. That answers a future question I had, if I change 
requirements.txt
 do I build a full PEX again. Sounds like no 💪
you will, but it’s mostly hardlinking things in and out of sandboxes
b
Right, I'm not paying the cost of N copies anymore. It's now N links?
w
yea, something like that. PEX has caches of installed wheels in
~/.cache/pants/named_caches/pex_root
👍 1
b
That didn't crash-and-burn 🎉
I'll stick to that version, since I'm still doing eval
w
yea: we’ll stabilize that soon. thanks!
https://github.com/pantsbuild/pex/pull/1438 has more info on this change.
(“spread” / “packed” ~= “decomposed”)
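For intuition, a packed PEX on disk looks roughly like the sketch below. This is illustrative only, based on the layout described in that PR; the exact file and wheel names are placeholders.
```
# Rough shape of a "packed" PEX output (illustrative, not exact):
my_binary.pex/              # a directory rather than a single zip
    __main__.py             # entry point
    PEX-INFO                # metadata
    .bootstrap              # the PEX runtime, zipped
    .deps/
        torch-<version>.whl # each installed wheel zipped individually
        numpy-<version>.whl
        ...
```
Because each wheel is a separate file, unchanged deps can be cached and hardlinked instead of being re-zipped into one monolithic archive.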
b
I'll need to scratch my head much more at PEXs 😅
Hmm, the previous success might've been a red herring. On using
2.7.1
and nuking the cache, I'm seeing this again
```
"Error storing Digest { hash: Fingerprint<52602e41f8f94597c200fc9178c711a9d2e735e4412508260b47ed28895d3c78>, size_bytes: 2389577269 }: Input/output error"
```
(red herring)
In the meantime, I'll turn back off
requirement_constraints
w
oh, hm… what was the red herring? is the i/o error back even in 2.7.1 ?
I’m seeing this again
for a different process though, presumably?
b
The red herring was a Zip error from Python2.7 which popped up because I ran the command outside of the virtualenv.
So, yeah seeing the I/O issue on 2.7.1. Same process/constraints (I nuked the cache yesterday). So I think what happened is I got lucky and the "Resolve requirements.txt" step succeeded due to... reasons? So it seemed the issue was gone when really it is just flaky.
And then I nuked the cache, making the bug have a vector to re-appear on stage
w
hm. yea, something is fishy. can i have a bit more context on which process this was?
Resolving requirements.txt
in 2.7.1 should not be creating a monolithic file, so there would need to be a single installed wheel that was that large…?
b
Oh, good point. Might be one of those huge CUDA-specific binaries? Whatcha need?
w
also, if you use
--no-process-execution-local-cleanup
and inspect the sandbox for the failing process, would you be able to share
ll -R
privately?
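One way to do that (the flag and the `ll -R` listing are from this thread; the target address here is hypothetical, and the sandbox directory path comes from the log output of the failing process):
```
# Re-run with sandbox cleanup disabled so the failing process's sandbox is preserved.
./pants --no-process-execution-local-cleanup package src/python/my_service:binary
# Pants logs the preserved sandbox directory; list it recursively:
ls -lR "$SANDBOX_DIR"   # substitute the directory path printed in the log
```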
b
I'm in a
ve
with everything already installed, so this seems relevant:
```
(ve) joshuacannon@CEPHANDRIUS:~/work/techlabs$ du -sh -t 1G ve/lib/python3.8/site-packages/*
1.4G    ve/lib/python3.8/site-packages/torch
```
Largest package is "only" 1.4G
I can re-run once I'm finished building this docker image 😄
1447.03s      Building docker image tts_service_img:latest
I don't want to throw away my 1400seconds of work
w
ok, i think i see what is up. we only use the packed layout (by default) for commands that don’t export an actual PEX file
so which command you are using matters here: this must not be `test`… is it
package
maybe?
b
It is. Although previously I believe
lint
was involved
w
lint
should be using this layout: if it isn’t, it’s a bug… let’s check that out later.
🙌 1
b
Today was my first litmus test of
package
, so it would've been some command previously
w
but for now: i think that we either need to have your
pex_binary
targets use the packed layout, and/or dive deeper on the i/o error. let me check both of those options out.
@bitter-ability-32190: one more question: which fraction of your requirements.txt do you think this binary uses?
(ballpark)
b
~30 deps out of ~250 total?
w
ok, yea. the relevance is that there is definitely one bug here: https://github.com/pantsbuild/pants/issues/13398 … we should still be avoiding creating the monolithic PEX in this case.
i’ll tackle that one now.
b
If I was to issue
lint
, and then
package
, would it know to use the existing
packed
layout for
requirements.pex?
Local testing would suggest yes.
13:09:27.14 [INFO] Completed: Resolving constraints.txt
Or at least
lint
succeeds. Now to go back to my
package
w
If I was to issue 
lint
, and then 
package
, would it know to use the existing 
packed
 layout for 
requirements.pex
in general, it would: yes. but that’s why https://github.com/pantsbuild/pants/issues/13398 is problematic: those two codepaths are requesting different repository.pex layouts. so they can’t share.
b
Ah damn
w
while we cherrypick that fix for you, we might be able to tackle the i/o error. it will require re-creating the
~/.cache/pants/lmdb_store
directory, but you can change https://www.pantsbuild.org/docs/reference-global#section-local-store-shard-count to affect file size limits
🙌 1
b
I'll try that
Which flag was it? Specifically
local_store_shard_count
?
w
yes: lowering that to 4 or so
and clearing the
.cache/pants/lmdb_store
when you do.
as it says: the faster your disks are, the less likely you are to need sharding (on the other hand, very large core counts…? something to experiment more with later)
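Putting the suggested workaround together (a sketch using the value and paths from this thread, not a guaranteed fix):
```
# 1. In pants.toml, under [GLOBAL], lower the shard count:
#      local_store_shard_count = 4
# 2. Then clear the existing store so it is re-created with the new shard count:
rm -rf ~/.cache/pants/lmdb_store
```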
b
😞
"Error storing Digest { hash: Fingerprint<ec87219a0b5013caf483fb1192fa6ba5e3ca8dd4c491031ec9c7d992adc808c9>, size_bytes: 2389577271 }: Input/output error"
With
local_store_shard_count = 4
in
[GLOBAL]
w
shoot. ok, i’ll open a ticket for that one. in the meantime, the fix for https://github.com/pantsbuild/pants/issues/13398 should land later today, and we can get it picked to 2.7.x. that should resolve (/ work around) this issue unless you have individual packages that are as large as the repository.pex (and assuming that this is actually related to the size).
b
I'm giving the PR a whirl, stay tuned
Also 🙌
It works! 🎉
🎉 1
w
glad to hear it! working on getting that picked. also opened https://github.com/pantsbuild/pants/issues/13401
ok, it’s bound for
2.7.2rc0
: https://github.com/pantsbuild/pants/commits/2.7.x … i’m getting on a plane, so someone else will be doing the release.
also : consider starting a new thread / github issue about the slow docker run: i’m guessing @curved-television-6568 would be interested
b
Slow docker run: I suspect it was "hanging" due to this issue: https://stackoverflow.com/questions/40160592/dockerfile-how-to-pass-an-answer-to-a-prompt-post-apt-get-install Essentially
sudo apt install -y jackd2
still prompts for input 🙄
Setting
ENV DEBIAN_FRONTEND noninteractive
in the Dockerfile worked. Annoying UX, but not y'all's fault
👍 1
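For reference, the fix described above boils down to something like this in the Dockerfile (a sketch using the package name from this thread; inside a Dockerfile the usual form is `apt-get` without `sudo`):
```
# Suppress interactive debconf prompts during the image build.
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update && apt-get install -y jackd2
```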
w
mmm, good to know. i do think that we could/should do something there. the “Long running tasks output” could emit the last line from the process or something. i’ve added that as an idea to https://github.com/pantsbuild/pants/issues/11965#issuecomment-956403050
🙌 2