# general
f
Getting this message from time-to-time when running commands, even though my interpreter_search_paths is set to `/usr/bin/python3`, which is definitely there... Anybody know what this means?
08:28:54.05 [WARN] Completed: Find Python interpreter to bootstrap PEX - No bootstrap Python executable could be found from the option `interpreter_search_paths` in the `[python-setup]` scope. Will attempt to run PEXes directly.
e
Hrm, that's because we lie 😕 Here: https://www.pantsbuild.org/docs/reference-python-setup#section-interpreter-search-paths we say "You can specify absolute paths to interpreter binaries and/or to directories containing interpreter binaries.". That's only true for some uses; it's not true for the use in finding a bootstrap Python. That use only works for traditional PATH (directory) entries.
It seems there are 3 remedies in the bootstrap search context. We'd need to test each entry in the (expanded) interpreter_search_paths and, if the entry is a file:
1. Just use it.
2. Warn it's not being used.
3. Use its parent dir as a PATH entry for the search.
Seems like 1 would make the existing doc not-a-lie, comport with how Pex handles this and generally meet expectations.
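Remedy 1 above can be sketched roughly as follows. This is a minimal illustration, not Pants' actual bootstrap code: the function name `find_bootstrap_python` and the candidate binary names are hypothetical, and it assumes the entries have already been expanded.

```python
import os
from pathlib import Path
from typing import Iterable, Optional, Sequence


def find_bootstrap_python(
    search_paths: Iterable[str],
    names: Sequence[str] = ("python3", "python"),  # hypothetical candidate names
) -> Optional[str]:
    """Return the first usable interpreter found in the search path entries.

    An entry that is an executable file is used directly (remedy 1);
    an entry that is a directory is searched like a PATH entry.
    """
    for entry in search_paths:
        path = Path(entry).expanduser()
        if path.is_file() and os.access(path, os.X_OK):
            # The entry points straight at an interpreter binary: just use it.
            return str(path)
        if path.is_dir():
            # Traditional PATH-style entry: look for a known binary name inside it.
            for name in names:
                candidate = path / name
                if candidate.is_file() and os.access(candidate, os.X_OK):
                    return str(candidate)
    return None
```

With this shape, an entry such as `/usr/bin/python3` and an entry such as `/usr/bin` would both resolve to an interpreter, which is what the quoted documentation promises.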
f
I guess part of the problem of PATH vs PYENV too is that... well I usually have PYENV on my path. In my use case my pyenv is kinda a lie too because I'm sharing a home setup between different containers I do actual development in
In looking into this I've also observed that it seems to be downloading pex very frequently... like not every time I run a command but maybe when I switch between goals? I figured with the downloads these files would stay in the append-only cache
16:57:57.67 [DEBUG] requesting <class 'pants.backend.project_info.dependees.DependeesGoal'> to satisfy execution of `dependees` goal
16:57:57.67 [DEBUG] Launching 1 roots (poll=false).
16:58:06.46 [DEBUG] Completed: Downloading: DownloadFile(url='<https://github.com/pantsbuild/pex/releases/download/v2.1.51/pex>', expected_digest=FileDigest(fingerprint='b8d21a4d8db88c9a3c73d3b0c324c7a01c48c2a2d86e3952fc5673f5e5e464f... (37 characters truncated)
e
Yes, downloads should be cached - but not in the append only cache (`~/.cache/pants/named_caches/pex_root`), in the LMDB store (`~/.cache/pants/lmdb_store`). If you have evidence this is not the case it would be great if you could file an issue with your container switches.
The lmdb store uses mmap though, and that may not be flushed to disk when you expect - so I'm not sure how the in-mem bits transfer between containers - probably not at all without some customized container config.
f
Hmm yeah, there is nothing but a barebones directory structure laid out in the lmdb store. Looks like nothing is getting saved there
🙀 1
As for container switches... I'm using Toolbox which decides most of the switches for me 😕, I can either give you the underlying `crun` command scraped from `ps` or I can give you the output of `podman inspect`. Let me know what you want and I'll open an issue.
h
"Nothing is getting saved in LMDB store" sounds very odd, and must be killing your performance. You don't have any unusual cache-related settings in `pants.toml` by any chance, do you?
f
no, and I think I was mistaken... there are several data and lock files there, and
❯ du -sh ~/.cache/pants/lmdb_store/
57M	/home/jreed/.cache/pants/lmdb_store/
looks like something is there at least, but aren't there supposed to be files named after hashes? Isn't it some kind of CAS?
I'm probably jumping to conclusions based on incomplete information I have. If I go and poke around at the directories with an lmdb client it does look like there's stuff there, I just have no idea how this is all laid out or used. I don't really have any evidence that the downloads are not cached other than that I see the
DownloadFile(url='<https://github.com/pantsbuild/pex/releases/download/v2.1.51/pex>', expected_digest=FileDigest(fingerprint='b8d21a4d8db88c9a3c73d3b0c324c7a01c48c2a2d86e3952fc5673f5e5e464f... (37 characters truncated)
message too often, with what appears to be a delay that suggests there actually is a download being performed. (In the interactive CLI it takes 5-6 sec).
In my container it seems to happen whenever pantsd gets restarted
e
LMDB is opaque. The memory mapped files contain btree structured key / value store data. So you won't see the keys or values, just flat files (we shard so there should be more than one flat file).
f
Yeah it's pretty clear how opaque it is. What I'm not getting (that may be more relevant) is why containerization would affect the way it works. I've tried searching around to understand how containers affect mmap'ed files and I'm not getting a lot of relevant hits. What settings are you talking about that affect this?
e
But I'm pretty ignorant here. It's just highly likely that getting various kernels to see the same mmap is not easy, or at least is not going to work out of the box, and that's what you're trying to do here.
It would probably help debugging to not be using a high-level tool and instead be using podman or crun directly to try to get container + host seeing the same mmap in real time, or else container + container.
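One way to drop down a layer is to take LMDB out of the picture and check whether two environments see the same memory-mapped file at all. The sketch below uses only the Python standard library; the function names and the 4096-byte probe size are illustrative choices, and the probe file is assumed to live on a path that both the host and the container mount.

```python
import mmap
import os

SIZE = 4096  # size of the probe file, chosen arbitrarily for this sketch


def write_marker(path: str, marker: bytes) -> None:
    """Write `marker` at the start of a shared, memory-mapped file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.ftruncate(fd, SIZE)  # the file must be at least as large as the mapping
        with mmap.mmap(fd, SIZE) as mapped:  # shared, read/write mapping by default
            mapped[: len(marker)] = marker
            mapped.flush()  # ask the OS to write the dirty pages back to the file
    finally:
        os.close(fd)


def read_marker(path: str, length: int) -> bytes:
    """Read `length` bytes from the start of the file through a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return bytes(mapped[:length])
```

Calling `write_marker` in one environment and `read_marker` in the other, with and without the `flush()` call, would show whether the mapping itself is the thing that fails to carry over.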
I'll see if I can find time later today to mess around with this. It sounds like you're fully adopting the jessfraz way of life and that would be great for Pants to support.
👀 1
f
I'm trying to get it to work using the development env we use in our org. Which involves using a pet container for each fedora version we target in our prod system. I already am finding the need to break out of this higher-level tool a bit for other reasons, but if I want to get adoption I need to figure out how to make things work within it, at least initially.
I'll try the IPC mode stuff
e
Yeah, I just meant dropping down some layers for debugging.
👍🏻 1
f
but fwiw...
❯ podman inspect aiven-fedora-33 --format '{{ .HostConfig.IpcMode }}'
host
h
There has been recent interest in having Pants execute processes inside a container
For example it would mean you can build deployable pexes on macos
f
Eventually I plan on finding a way to make pants run build steps inside (ephemeral) containers, potentially even committing the writable layers and preserving those as the results of the build rule
I'd love to discuss this kinda stuff at some point in the future, because I think it opens up a lot of potential, because you could start involving other package managers in the process like rpm, dep, brew, choco, whatever
but for this, i'm just trying to figure out how to make it run normalishly inside the pet containers we use as our dev env; and right now what i'm seeing is that a lot of caching doesn't seem to survive pantsd restarts inside my container...whether this is related to LMDB not committing or mmap not working right i don't know, but I'll keep looking
Is it correct to assume that LMDB gets used for native build rule caches and the named_caches get used for integrated tools?
e
LMDB is all the caching for any process run by Pants. Rule caching is only in-memory for the lifecycle of pantsd. And, yes, named_caches are for any caches external tools can use to speed their operations up. We have Pex store its PEX_ROOT there in pex_root.
🙏🏻 1
LMDB has a key formed from the hash of the args, env, and inputs to the process that maps to the output of that process.
So it's the major nexus of useful caching in Pants. If it isn't working, you're in trouble. If named_caches isn't shared, you get slowdowns only when there is a process cache (LMDB) miss and some real work needs to be done by the invoked tool, i.e., Pex will need to download more.
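The idea of a key derived from the args, env, and inputs can be illustrated with a small sketch. This is not how Pants actually serializes or hashes a process: the JSON encoding, the SHA-256 choice, the plain dict standing in for the store, and both function names are assumptions made for illustration.

```python
import hashlib
import json
from typing import Callable, Dict, Mapping, Sequence


def process_cache_key(
    argv: Sequence[str],
    env: Mapping[str, str],
    input_digests: Mapping[str, str],
) -> str:
    """Derive a stable key from everything that can affect a process's output."""
    payload = json.dumps(
        {"argv": list(argv), "env": dict(env), "inputs": dict(input_digests)},
        sort_keys=True,  # env and input ordering must not change the key
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(
    store: Dict[str, bytes],  # stand-in for the persistent key/value store
    argv: Sequence[str],
    env: Mapping[str, str],
    input_digests: Mapping[str, str],
    run: Callable[[], bytes],
) -> bytes:
    """Return the stored output on a hit; run the process and store it on a miss."""
    key = process_cache_key(argv, env, input_digests)
    if key not in store:
        store[key] = run()  # real work only happens on a cache miss
    return store[key]
```

Under this model, a restart that loses the store turns every call into a miss, which is why a broken process cache hurts far more than an unshared named cache.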
f
That's useful information. Could you point me at the code used to interact with LMDB? It would help me to debug to be able to construct a simpler failing example of this problem than just running pants and killing pantsd all the time.
When I try to debug this, I'll use https://pypi.org/project/lmdb/ to set up a simple db and see if I can get a host process and a container process to see the same data.
Then work backwards from there if I can get it working.
👍🏻 1
f
Hmm I'm pretty sure LMDB is working normally, and this whole thing may have been a goose chase on my part caused by performance differences of IO inside and outside the container...
Turns out I can shut off my network entirely and the `DownloadFile` rule seems to take the same time... it must just be doing the checksum or something. I can also run pants with a different `$XDG_CACHE_HOME` set and get it to populate megabytes into the lmdb_store in that cache. So I think this was all much ado about nothing in the end 😩
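The check described in this last message can be repeated with a short script. It assumes the store lives at `pants/lmdb_store` under `$XDG_CACHE_HOME`, falling back to `~/.cache` as in the paths quoted earlier in the thread; both function names are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def pants_lmdb_store(environ: Optional[Mapping[str, str]] = None) -> Path:
    """Locate the lmdb_store directory (assumed layout: <cache home>/pants/lmdb_store)."""
    env = os.environ if environ is None else environ
    cache_home = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(cache_home) / "pants" / "lmdb_store"


def store_size_bytes(store: Path) -> int:
    """Sum the apparent sizes of all files under the store; 0 if it does not exist.

    This reports file lengths, so it can differ from `du`, which reports disk usage.
    """
    if not store.is_dir():
        return 0
    return sum(p.stat().st_size for p in store.rglob("*") if p.is_file())
```

Comparing `store_size_bytes(pants_lmdb_store())` before and after a run, inside and outside the container, gives a quick signal that the store is being populated.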