https://pantsbuild.org/ logo
h

hundreds-father-404

03/27/2019, 3:57 PM
v2
@rule
resolution appears to be causing a serious startup performance hit at import time. Any thoughts appreciated on https://github.com/pantsbuild/pants/issues/7442
a

average-vr-56795

03/27/2019, 4:34 PM
Wow, that’s impressive and interesting and scary all at once
❗ 1
a

aloof-angle-91616

03/27/2019, 5:14 PM
commented, "rule resolution" doesn't happen at import time and i believe this is an
ast
issue
@hundreds-father-404 if i'm seeing
snakeviz: error: the file /Users/dmcclanahan/tools/pants/whole.prof is not a valid profile
and/or
ValueError: unknown record type in log file
from
gprof2dot
is it immediately clear what i'm doing wrong? i tried switching
python
on my path to point to python 3 but that didn't change the output
if not i'll figure it out
w

witty-crayon-22786

03/27/2019, 6:05 PM
wrong python, yea.
h

hundreds-father-404

03/27/2019, 6:06 PM
Apply the diff at the top of issue, and then make sure you
pip3 install tuna
rather than
pip
w

witty-crayon-22786

03/27/2019, 6:06 PM
i have two
venvs
with the different pythons for snakeviz
a

aloof-angle-91616

03/27/2019, 6:06 PM
i should just use pex
w

witty-crayon-22786

03/27/2019, 6:06 PM
or that, yea.
a

aloof-angle-91616

03/27/2019, 6:06 PM
and eric i'm trying to use the pex profiler to wrap the import line specifically to get a flamegraph for that call
h

hundreds-father-404

03/27/2019, 6:07 PM
Ah, that I have no idea how to do. Profiling in general is completely new to me as of last night hah
a

aloof-angle-91616

03/27/2019, 6:07 PM
i just remember from trying to investigate the globs matching stuff when i was stuck in bed last year
pip3 install tuna
worked, pex was yelling about encodings
πŸ‘ 1
w

witty-crayon-22786

03/27/2019, 6:09 PM
i do roughly:
virtualenv -p /usr/local/bin/python3 ~/venvs/snakeviz-py3
, then activate and
pip install snakeviz
. then, without the venv activated, you can do
~/venvs/snakeviz-py3/bin/snakeviz $prof
so it's a one-time thing to activate/deactivate
(via
source ~/venvs/snakeviz-py3/bin/activate
, and exiting the shell, respectively)
would recommend against actually pip installing things on your machine
πŸ‘ 1
a

aloof-angle-91616

03/27/2019, 6:13 PM
i do not understand the output for some reason as it shows that import taking "30 seconds" when the entire run was 7 seconds
but the deepest call stack does go into the yacc parser
false alarm, that's in the py zipkin import (and that says it's 0.2%)
e

enough-analyst-54434

03/27/2019, 6:15 PM
Did folks see my issue update - PYTHONPROFILEIMPORTTIME=1 is magic
πŸ”₯ 2
a

aloof-angle-91616

03/27/2019, 6:15 PM
i will now
e

enough-analyst-54434

03/27/2019, 6:15 PM
I think e have enough info to implicate @rule
πŸ‘ 1
a

aloof-angle-91616

03/27/2019, 6:16 PM
i would definitely think that is the case since we do nontrivial ast nonsense at import time, but i would strongly suspect the performance difference is due to some specific operation we're doing there. if we need to lazily traverse the ast we can do that if that is the cause of the issue, but if the issue is that
ast.parse()
is slower that would be more annoying
w

witty-crayon-22786

03/27/2019, 6:18 PM
Right. They've always been eagerly evaluated for that parsing/validation
Could probably microbenchmark it
a

aloof-angle-91616

03/27/2019, 6:18 PM
also, the traversal can be optimized, i explicitly didn't do that when i added the
yield Get(...)
check and we can prune the branches a lot
the vast majority of the runtime is in
inspect.getmodule
and other methods which i think means we can optimize this if need be (and is a great place to look for a py3 perf difference) -- i will post a profile after this lunch
Screen Shot 2019-03-27 at 11.48.41 (2).png
Screen Shot 2019-03-27 at 11.47.01 (2).png
this is just with
Copy code
diff --git a/src/python/pants/bin/pants_loader.py b/src/python/pants/bin/pants_loader.py
index 91a4ca1..7510e6e 100644
--- a/src/python/pants/bin/pants_loader.py
+++ b/src/python/pants/bin/pants_loader.py
@@ -66,7 +66,7 @@ class PantsLoader(object):
         Fix it by setting the LC_* and LANG environment settings. Example:
           LC_ALL=en_US.UTF-8
           LANG=en_US.UTF-8
-        Or, bypass it by setting the below environment variable.
+        Or, bypass it by setting the below environment variable.
           {}=1
         Note: we cannot guarantee consistent behavior with this bypass enabled.
         """.format(encoding, cls.ENCODING_IGNORE_ENV_VAR)
@@ -101,7 +101,9 @@ class PantsLoader(object):
   def load_and_execute(entrypoint):
     assert ':' in entrypoint, 'ERROR: entrypoint must be of the form `module.path:callable`'
     module_path, func_name = entrypoint.split(':', 1)
-    module = importlib.import_module(module_path)
+    from pants.util.contextutil import maybe_profiled
+    with maybe_profiled('wow.prof'):
+      module = importlib.import_module(module_path)
     entrypoint_main = getattr(module, func_name)
     assert callable(entrypoint_main), 'ERROR: entrypoint `{}` is not callable'.format(entrypoint)
     entrypoint_main()
forget the first part of the diff i am rushing
inspect.stack()
might not be necessary
66% of runtime during import is in the
@rule
decorator (2.06 seconds in my run)
πŸ‘ 1
Screen Shot 2019-03-27 at 14.38.53 (2).png
and
inspect.stack()
seems to take all the runtime, it calls tons of other things, but calling
inspect.getsource()
from the decorator itself is only 0.87% (and we suspected that might be culprit). it might be cached by python since
inspect.stack()
is the first
inspect
method used in that decorator
e

enough-analyst-54434

03/27/2019, 9:41 PM
In that case - dispensing with the stack and just ast-ing file may be a big win
πŸ‘ 2
a

aloof-angle-91616

03/27/2019, 9:41 PM
you have dispensed with maybe 30 minutes of searching, working on it
e

enough-analyst-54434

03/27/2019, 9:41 PM
Thanks for bringing this one home Danny.
❀️ 1
a

aloof-angle-91616

03/27/2019, 9:42 PM
thank me once my backend doesn't slow the rest of the thing down
removing
inspect.stack()
shaved 3 seconds off and i cannot find where rule parsing contributes to import time in the profile i have set up of that import line
seeing if this passes tests now but we don't need the stack here and i think this solution is more correct anyway
❀️ 1
e

enough-analyst-54434

03/27/2019, 10:31 PM
Excllent
a

aloof-angle-91616

03/27/2019, 10:34 PM
everything mysteriously seems to work except the error message which is expected and i am fixing
πŸ’― 1
πŸ”₯ 1
h

hundreds-father-404

03/27/2019, 10:35 PM
Does this also improve performance for
./pants2 -V
?
a

aloof-angle-91616

03/27/2019, 10:35 PM
i will include that number in the PR, i suspect so
πŸ‘ 1
getting the whole stack seems like an expensive operation in any language
h

hundreds-father-404

03/27/2019, 10:36 PM
Cool. My impression was this hurt us with both versions, but even more so in Py3 for some unknown reason
a

aloof-angle-91616

03/27/2019, 10:50 PM
there is a negligible performance increase in py2
πŸ‘ 1
however this change also brings py3 performance up to py2
❀️ 1
so
inspect.stack()
is like very very slow in py3 like this is p clear at least with our access pattern (which wasn't too weird)
πŸ‘ 1
probably worth a python bug unless it's known
h

hundreds-father-404

03/27/2019, 10:52 PM
That’d be a very cool open source contribution @aloof-angle-91616! I’d be happy to help you edit a writeup if you choose to do so
a

aloof-angle-91616

03/27/2019, 10:52 PM
i would love that
going to put it as a TODO in the PR description then make it
h

hundreds-father-404

03/27/2019, 10:58 PM
I wonder if the issue is related to the changed return type in 3.5? https://docs.python.org/3/library/inspect.html#inspect.stack
e

enough-analyst-54434

03/27/2019, 10:58 PM
Copy code
$ for py in python{2.7,3.{4,5,6,7}}; do echo -n "$py: " && $py -c "import timeit; print(timeit.timeit('import inspect; inspect.stack()', number=10000))"; done
python2.7: 2.63740515709
python3.4: 3.8067204039980425
python3.5: 4.471806422996451
python3.6: 4.527370049007004
python3.7: 3.3335486440046225
πŸ‘ 1
w

witty-crayon-22786

03/27/2019, 10:59 PM
probably proportional to depth
e

enough-analyst-54434

03/27/2019, 10:59 PM
Yup
w

witty-crayon-22786

03/27/2019, 10:59 PM
proportional/exponential, perhaps
a

aloof-angle-91616

03/27/2019, 11:46 PM
i added this to the PR description but there is an answer here that may explain the issue: https://stackoverflow.com/a/17407257/2518889
πŸ‘ 1