# general
w
Hi all. I’m trying to understand whether it is normal for pex to leave /tmp/ folders behind after execution, and whether there is a configuration I can use to manage that behavior. If that is not the expected behavior, where should I start troubleshooting? If there is a documentation section I should read for context, please feel free to point it out. Thanks for any help.
w
all tmp files should be cleaned up by the time `pantsd` exits: if you’re seeing otherwise, please definitely file an issue, ideally with a repro
w
Hi @witty-crayon-22786. Thanks for the reply. I think I didn’t define my environment well. The pex file is run on a slim OS in a container with just python installed. There is no `pantsd` running on the environment, just a script which executes the pex file built outside in a CI/CD. We decided to use a pex file because it makes bundling dependencies easier. We execute the pex file multiple times throughout the container’s lifecycle because it bundles a CLI tool which we execute periodically through a script running inside the container.
I’m going to read up on https://v1.pantsbuild.org/architecture_pantsd.html before I ask any further questions.
Okay. Thanks for the help so far. Greatly appreciate it! Follow-up question. Is `pantsd` the only tool responsible/capable of cleaning up the temporary files left behind by a `.pex` file when it unpacks itself in `/tmp`, or is there another tool or configuration that can be used to make sure those files are deleted? I can always create a bash script to do this, but if there is a tool in pantsbuild arsenal for the job, I’d rather use it for this purpose. Once again, thanks for any help.
c
There's a way to build a PEX so that it's expandable to a standard venv, and then a way to tell it to expand itself to the same location each time. I think it's PEX_VENV (https://pex.readthedocs.io/en/latest/api/vars.html#PEX_VENV) when running it, and I think you might also need execution_mode=venv (https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode). I'll check tomorrow. We use it for long-running processes that systemd will restart if they crash.
h
FYI that v1 link is for an old, deprecated version. But I notice that the current docs don't seem to have much detail about pantsd 🤦‍♂️
It sounds like your issue is at pex runtime, unrelated to pants?
So pantsd is not relevant here
You can try building a venv-mode pex, as Daniel suggests
e
Pex should never leave anything behind in /tmp, period, Pants completely aside. Can you reveal the contents you think Pex is leaving behind and/or identify the Pants or Pex version?
w
I’m about to deploy a change to our script which deletes the temporary files as follows:
find /tmp -name 'PEX-INFO' | cut -d '/' -f 1-3 | xargs rm -r;
… after the pex execution. It should suffice for now. But I might test it with `dirname` instead of the `cut`. Probably more robust approach.
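A possible variation on the cleanup above, as a minimal sketch only. It assumes the leaked directories are direct children of `/tmp` with a `PEX-INFO` file somewhere beneath them, which is what the `cut -f 1-3` in the original command implies. The function name `cleanup_pex_tmp` is made up for illustration. Iterating over the top-level directories avoids passing the same directory to `rm` twice when it holds several `PEX-INFO` files, and it sidesteps the fact that `dirname` would return the immediate parent of `PEX-INFO`, which may be nested below the top-level folder.

```shell
#!/bin/sh
# Hypothetical helper: remove each top-level directory under a base directory
# (default /tmp) that contains a PEX-INFO file anywhere beneath it.
# Assumption: the leaked directories are direct children of the base directory.
cleanup_pex_tmp() {
    base="${1:-/tmp}"
    for dir in "$base"/*/; do
        # An unmatched glob stays literal, so skip anything that is not a directory.
        [ -d "$dir" ] || continue
        # Non-empty output means at least one PEX-INFO file exists under $dir.
        if [ -n "$(find "$dir" -name 'PEX-INFO' -type f 2>/dev/null | head -n 1)" ]; then
            rm -rf -- "$dir"
        fi
    done
}

# Usage, after the pex has finished running:
#   cleanup_pex_tmp /tmp
```

Note that the `*/` glob does not match hidden directories, so a dot-directory such as `/tmp/.pex` would be left alone by this sketch.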
e
@wooden-baker-63668 can you reveal the /tmp contents? Like I said, Pex does not (knowingly) leak tmp files ever.
Also, please provide the Pants or Pex or both versions, whatever facts you have.
w
@enough-analyst-54434 will do. Most likely tomorrow since I’m about to start travelling for the next few hours. Can I get that information from the PEX-INFO file? I warn you that it is likely an old version because this pex was built a few months ago.
e
If you can share the PEX-INFO file, that includes the version of Pex used to create the PEX. The fact the PEX-INFO is leaked implicates Pants as the leaker, although that's still not 100% clear; so also revealing the Pants version would be very helpful. Favor providing too many details; not too few.
w
@enough-analyst-54434 Thanks again for the help. The `PEX-INFO` includes the following (I’ve removed the `bootstrap_hash`, `code_hash` and `distributions`; please let me know if you need them).
{
  "build_properties": {
    "pex_version": "2.1.111"
  },
  "emit_warnings": true,
  "entry_point": "kapitan",
  "ignore_errors": false,
  "includes_tools": false,
  "inherit_path": "false",
  "inject_args": [],
  "inject_env": {},
  "interpreter_constraints": [],
  "pex_hash": "3f81ccc94c8c170878d2be24b312593d8b4293b8",
  "pex_path": "",
  "pex_paths": [],
  "requirements": [
    "grafanalib==0.5.12",
    "httplib2==0.19.1",
    "kapitan==0.30.0"
  ],
  "strip_pex_env": true,
  "venv": false,
  "venv_bin_path": "false",
  "venv_copies": false,
  "venv_site_packages_copies": false
}
As I mentioned above, and reiterate now with more detail, the pex file (named `kapitan`) is executed directly from a script that is similar to:
for f in *.yml; do (...); done | kapitan refs --reveal -f -;
This script is executed with `/bin/sh -c`, the only process running on the container. This leads me to believe that maybe pants is not involved at all with any clean-up process because it is not running on the container, unless `kapitan` executes it somehow.
e
What is "pex_root" in that PEX-INFO file? Also, do you have a PEX_ROOT environment variable set in the container / script?
It may be missing, which is most likely, and that's why it's not in your output above. If so, it would then be good to know what $HOME is, since the default PEX_ROOT is $HOME/.pex
So, @wooden-baker-63668, in summary, can you provide the values of the PEX_ROOT and HOME env vars in the context the `kapitan` PEX runs in?
Actually, @wooden-baker-63668 you revealed enough in that PEX-INFO that I can definitively say I do not repro:
$ pex pex==2.1.111 -cpex -- --resolver-version pip-2020-resolver grafanalib==0.5.12 httplib2==0.19.1 kapitan==0.30.0 -o kapitan -c kapitan
$ sudo rm -rf /tmp/*
[sudo] password for jsirois:
$ ls -lrt /tmp/
total 0
$ echo $PEX_ROOT

$ echo $HOME
/home/jsirois
$ ./kapitan
usage: kapitan [-h] [--version] {eval,e,compile,c,inventory,i,searchvar,sv,secrets,s,refs,r,lint,l,init,validate,v} ...

Generic templated configuration management for Kubernetes, Terraform and other things

positional arguments:
  {eval,e,compile,c,inventory,i,searchvar,sv,secrets,s,refs,r,lint,l,init,validate,v}
                        commands
    eval (e)            evaluate jsonnet file
    compile (c)         compile targets
    inventory (i)       show inventory
    searchvar (sv)      show all inventory files where var is declared
    secrets (s)         (DEPRECATED) please use refs
    refs (r)            manage refs
    lint (l)            linter for inventory and refs
    init (i)            initialize a directory with the recommended kapitan project skeleton.
    validate (v)        validates the compile output against schemas as specified in inventory

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
$ ls -lrt /tmp/
total 0
So, Pex is not leaking anything in `/tmp`. The only way it could is if you exported `PEX_ROOT=/tmp` or `HOME` was in `/tmp`.
w
@enough-analyst-54434 I’ll check if `PEX_ROOT` or `HOME` are present in the environment and if they are set to `/tmp`. Thanks for the help so far.
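One way that check could look, as a rough sketch. The function name `pex_env_report` is invented for illustration. It only reports the two variables named above and whether each one points at or below `/tmp`, and it should be run in the same context the PEX runs in (for example from the same `/bin/sh -c` script).

```shell
#!/bin/sh
# Hypothetical helper: print PEX_ROOT and HOME and flag whether each one is
# unset/empty, located at or under /tmp, or somewhere else.
pex_env_report() {
    for name in PEX_ROOT HOME; do
        # Indirect lookup of the variable named by $name (empty if unset).
        eval "value=\${$name:-}"
        case "$value" in
            '') status='unset or empty' ;;
            /tmp|/tmp/*) status='under /tmp' ;;
            *) status='outside /tmp' ;;
        esac
        printf '%s=%s (%s)\n' "$name" "$value" "$status"
    done
}

# Usage, from the script that launches the pex:
#   pex_env_report
```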
@enough-analyst-54434 we tentatively solved the issue with `mkdir -p /tmp/.pex` and `PEX_ROOT='/tmp/.pex/' kapitan ...` Thanks for the help
e
I'm glad you feel you have things solved. That said, what happened to the leak? Were you setting PEX_ROOT or HOME to /tmp previously or not? Where do the files go now? To /tmp/XXX or just /tmp/.pex/XXX?
Basically, it seems like there has been 0 resolution of the actual problem you were seeing. It's still mysterious.
w
PEX_ROOT was unset and HOME was set to a directory that pex could not write to, which led pex to create a temporary PEX_ROOT in a /tmp/tmpXXXXX folder. The logs pretty much said it all, but they were buried amongst a lot of other things.
They are now going to `/tmp/.pex/XXX`
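For anyone hitting the same thing, the fix can be sketched as a small wrapper. This is an illustration under assumptions, not Pex's own logic: the function name `choose_pex_root` is invented, the precedence (explicit `PEX_ROOT`, then `$HOME/.pex`, then a fixed fallback) only mirrors what was described in this thread, and `/tmp/.pex` is the fallback we chose.

```shell
#!/bin/sh
# Hypothetical helper: decide which directory to use as PEX_ROOT.
#   1. Keep an explicitly set PEX_ROOT.
#   2. Otherwise use $HOME/.pex when HOME is an existing, writable directory.
#   3. Otherwise fall back to a fixed directory (default /tmp/.pex), so the
#      cache lands in one stable place instead of a fresh /tmp/tmpXXXXX.
choose_pex_root() {
    fallback="${1:-/tmp/.pex}"
    if [ -n "${PEX_ROOT:-}" ]; then
        printf '%s\n' "$PEX_ROOT"
    elif [ -n "${HOME:-}" ] && [ -d "$HOME" ] && [ -w "$HOME" ]; then
        printf '%s\n' "$HOME/.pex"
    else
        printf '%s\n' "$fallback"
    fi
}

# Usage, before invoking the pex:
#   PEX_ROOT="$(choose_pex_root)"
#   mkdir -p "$PEX_ROOT"
#   export PEX_ROOT
```

With a fixed root, repeated runs reuse the same directory rather than accumulating new temporary ones.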
e
Aha, ok - thank you for that detail. Great. As a provider of support it is nice to have a loop closed and no voodoo left.
w
Let me see if I can find the error message again.
e
No need. I wrote that code.
Totally makes sense.
w
eheheheh
Thanks for your work and help! Appreciate it!