# general
n
Anyone know what might cause Pants to start crashing with segfaults during any operation involving dependency resolution (packaging, running, dependencies, etc.)? It seems to have started happening randomly. I restarted the host, tried a different host, reinstalled Pants, and wiped caches, and it still keeps happening. I suspect it might be something I did with the universe of requirements / resolve, but can't think what. This is what gdb says about the coredump:
Core was generated by `pantsd [/d/d1/user/taymarti/casper/codetree/codetree/src] '.
Program terminated with signal 11, Segmentation fault.
#0 0x00007ff4c5d4f928 in ?? ()
h
Oof, that is pretty bad. A reliable repro would be really helpful. Does this happen on other machines?
n
Will try to cut up the repo tomorrow and see if I can repro w/ a shareable example. Had also noticed the file watcher limit was breaching a lot recently (put a request to admin to increase that kernel parameter), so might be a clue. Another observation is that the “damage” seems cumulative; that is, on a fresh install it works to resolve the lock files and a few other goals, but once the first memory fault occurs, nothing that requires a dependency resolution works again. And yes, believe it was same behavior on another host I tried, but will confirm tomorrow (hosts are basically identical RHEL hosts though).
Reproduced the issue in a mock repo, and it would also explain the randomness in it appearing. I hope this is enough info to repro but lmk if you have questions. The issue is that there is a folder
./admin/tools
containing a
BUILD
with
shell_sources
and a symlink
./admin/tools/tools -> .
. It also doesn't matter what the name of the link is, crash is reproduced if its target is
.
. Also, converting the target to an absolute path makes Pants throw an absolute symlink error rather than crash. Anyway, this is all relatively moot since the symlink is clearly pointless and was made by mistake, but just FYI in case the circularity leading to a blowup might be an issue elsewhere.
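For anyone who wants to check their own repo for this kind of link, here is a minimal sketch in Python. It is not part of Pants; `find_ancestor_symlinks` and `demo` are hypothetical names made up for illustration, and it assumes a POSIX filesystem. It flags any symlink whose resolved target is the link's own directory or one of its ancestors, which covers the `tools -> .` case above (and would also flag `..` links, whether or not those crash).

```python
import os
import tempfile


def find_ancestor_symlinks(root):
    """Return symlinks under `root` that resolve to their own directory or an ancestor of it."""
    hits = []
    # followlinks=False: symlinked directories are listed in dirnames but never descended into.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        parent = os.path.realpath(dirpath)
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # commonpath equals target exactly when target is `parent` or one of its ancestors.
            if os.path.commonpath([parent, target]) == target:
                hits.append(path)
    return sorted(hits)


def demo():
    """Build a throwaway tree shaped like the one described above and scan it."""
    with tempfile.TemporaryDirectory() as repo:
        tools = os.path.join(repo, "tools")
        os.mkdir(tools)
        with open(os.path.join(tools, "sometool.sh"), "w") as f:
            f.write("#!/bin/bash\n")
        os.symlink(".", os.path.join(tools, "badsymlink"))       # circular: points at tools/
        os.symlink("sometool.sh", os.path.join(tools, "alias"))  # harmless: points at a file
        return [os.path.relpath(p, repo) for p in find_ancestor_symlinks(repo)]


if __name__ == "__main__":
    print(demo())
```

Running it against the repo root and deleting whatever it reports should be enough to rule this cause in or out.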
h
Oh, hm, I'm not sure our snapshotting handles symlinks
I know there is some issue there, @witty-crayon-22786 (who is out today) will know more
Link to the mock repo?
n
I don't have access to Github/Slack at work so hard to make one I can share. I think it's as simple as creating a symlink that points to the parent dir. I think this simple script should be enough to reproduce (if not, I'll make one on Github when I get home).
mkdir 3rdparty
echo "pandas~=1.0" >3rdparty/requirements.in
echo "python_requirements(source='requirements.in', resolve='python-default')" >./3rdparty/BUILD
mkdir tools
printf '#!/bin/bash\necho "$HOME"\n' >./tools/sometool.sh
echo "shell_sources(sources=['*.sh'])" >./tools/BUILD
ln -s . ./tools/badsymlink
pants generate-lockfiles --resolve=python-default
There's a requirement of
jupyter-core
in one of my resolves and it has
"pywin32>=1.0; sys_platform == \"win32\" and platform_python_implementation != \"PyPy\"",
listed in its
required_dists
field. But then
pex3 lock export --platform current $path_to_lockfile.json
on a Linux system exports
pywin32
anyway. Is that expected behavior? I can see why it would show up in Pants' generalized lockfile (from which actual locks are just subsets as required to build stuff), but not in Pex's export given a specific platform option.
(3.7.5) [taymarti.ivapp1366932]> pex3 lock export --platform linux_x86_64-cp-37-cp37m 3rdparty/python/resolves/release/lockfile.json | grep pywin
pywin32==304 \
n
Thx John. Also sorry didn't mean for this question about distros to be on this thread! What I get for using mobile slack. :D
w
snapshotting does handle symlinks, but i could totally imagine an oddly shaped one causing an issue: i’ll see if i can repro something with a
child -> .
symlink, and follow up for more details otherwise
…indeed. that causes a crash, heh. thanks for the report!
n
I thought Rust couldn't have memory corruption errors! :D
w
it’s likely a stack overflow
yea, stack overflow. surprisingly, it doesn’t trigger for
ln -s .. example
, so this will be fun!
thanks again for the report.
(Pants needs to be symlink aware in order to accurately watch filesystem paths, since otherwise the destination of a symlink changing would not invalidate the symlink itself)
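To illustrate why a symlink-following traversal needs a cycle guard, here is a rough Python sketch. It is not Pants' actual (Rust) implementation; `walk_following_symlinks` and `demo` are hypothetical names, and the approach shown (tracking the device/inode pair of every directory currently being expanded) is just one common technique, assumed here for illustration. Without the `active` check, the walk would keep descending through `child -> .` one level at a time.

```python
import os
import tempfile


def walk_following_symlinks(path, active=None):
    """Yield file paths under `path`, following directory symlinks but skipping cycles."""
    active = set() if active is None else active
    st = os.stat(path)  # stat follows symlinks, so a link and its target share one key
    key = (st.st_dev, st.st_ino)
    if key in active:
        return  # this directory is already being expanded further up the stack: a cycle
    active.add(key)
    try:
        with os.scandir(path) as entries:
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_dir(follow_symlinks=True):
                    yield from walk_following_symlinks(entry.path, active)
                else:
                    yield entry.path
    finally:
        active.discard(key)  # only directories on the current path count as "active"


def demo():
    """Walk a throwaway tree containing a `badsymlink -> .` link."""
    with tempfile.TemporaryDirectory() as repo:
        tools = os.path.join(repo, "tools")
        os.mkdir(tools)
        with open(os.path.join(tools, "sometool.sh"), "w") as f:
            f.write("#!/bin/bash\n")
        os.symlink(".", os.path.join(tools, "badsymlink"))
        return [os.path.relpath(p, repo) for p in walk_following_symlinks(repo)]


if __name__ == "__main__":
    print(demo())
```

Keying on the current path rather than on every directory ever seen means a directory reachable through two different non-circular symlinks is still walked both times; only true cycles are cut.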