hi Pants, I'm trying to load a pex file into my sa...
# general
a
hi Pants, I'm trying to load a pex file into my sagemaker notebook but hit some error. I'm downloading the pex from s3 to local, which succeeds, but then
%pex_load tmp/ml_training_env.pex
fails. I put the code and output in the thread.
%ls tmp
%load_ext pants_jupyter_plugin
%pex_load tmp/ml_training_env.pex
the output is:
ml_training_env.pex
The pants_jupyter_plugin extension is already loaded. To reload it, use:
  %reload_ext pants_jupyter_plugin
HTML(value='<style>.nb-console-output-tCNbr { background-color: black;} .nb-console-output-tCNbr pre { color: …
Accordion(children=(Output(layout=Layout(height='300px', overflow_y='scroll'), outputs=({'output_type': 'displ…
Scrubbing sys.path and sys.modules in preparation for pex bootstrap
sys.path contains 7 items, sys.modules contains 1249 keys
sys.path now contains 7 items, sys.modules now contains 1249 keys
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pants_jupyter_plugin/plugin.py in _accordion_widget(self, title, height, collapsed)
    147         # Capture the output context.
    148         with outputter:
--> 149             yield expand, collapse, set_output_glyph
    150 
    151     def _stream_binary_build_with_output(

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pants_jupyter_plugin/plugin.py in _bootstrap_pex(self, pex_path)
    295 
    296                     # Bootstrap pex.
--> 297                     for path in self._pex.mount(pex_path):
    298                         self._display_line(f"added sys.path entry {path}\n")
    299             except Exception:

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pants_jupyter_plugin/pex.py in mount(self, pex_to_mount)
    129 
    130                 selected_interpreter = json.loads(
--> 131                     run_pex_tool(args=["interpreter", "-v"], stdout=subprocess.PIPE).decode()
    132                 )["path"]
    133                 if not current_interpreter.samefile(selected_interpreter):

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pants_jupyter_plugin/pex.py in run_pex_tool(args, **subprocess_args)
    123                             env=env.create(PEX_INTERPRETER=1, PEX_PYTHON_PATH=sys.executable),
    124                             check=True,
--> 125                             **subprocess_args,
    126                         ).stdout
    127                         or b""

~/anaconda3/envs/pytorch_p36/lib/python3.6/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
    436         if check and retcode:
    437             raise CalledProcessError(retcode, process.args,
--> 438                                      output=stdout, stderr=stderr)
    439     return CompletedProcess(process.args, retcode, stdout, stderr)
    440 

CalledProcessError: Command '['/home/ec2-user/anaconda3/envs/pytorch_p36/bin/python', '/home/ec2-user/.cache/pants_jupyter_plugin/pex/exes/pex-2.1.32.pex', '-m', 'pex.tools', 'tmp/ml_training_env.pex', 'interpreter', '-v']' returned non-zero exit status 1.
but
%pex_load dist/ml.ml-models.src.python.training_env/ml_training_env.pex
works locally
Idk what difference there is between local jupyter and sagemaker notebook... and more importantly i don't see a more detailed log from the
CalledProcessError
above
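One way to see what the failing `pex.tools` call actually printed is to re-run the command from the `CalledProcessError` by hand and capture its stderr. This is only a minimal sketch: `run_and_show` is a hypothetical helper, not part of pants-jupyter-plugin, and the command, paths, and environment variables (`PEX_INTERPRETER=1`, `PEX_PYTHON_PATH=sys.executable`) are copied from the traceback above and may differ on another machine.

```python
import os
import subprocess
import sys


def run_and_show(cmd, extra_env=None):
    """Run cmd without check=True and return (exit code, stdout, stderr) as text."""
    env = dict(os.environ, **(extra_env or {}))
    # stdout/stderr=PIPE instead of capture_output so this also runs on Python 3.6.
    result = subprocess.run(
        cmd, env=env, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    return result.returncode, result.stdout.decode(), result.stderr.decode()


if __name__ == "__main__":
    # Command and paths copied from the CalledProcessError message above.
    cmd = [
        sys.executable,
        os.path.expanduser("~/.cache/pants_jupyter_plugin/pex/exes/pex-2.1.32.pex"),
        "-m", "pex.tools", "tmp/ml_training_env.pex", "interpreter", "-v",
    ]
    code, out, err = run_and_show(
        cmd, {"PEX_INTERPRETER": "1", "PEX_PYTHON_PATH": sys.executable}
    )
    print("exit status:", code)
    print(err)
```

Because the helper does not pass `check=True`, a non-zero exit does not raise, so whatever the tool wrote to stderr is still available to print.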
h
What Pants version is this? I'm wondering if you set the fields
execution_mode
and/or
include_tools
on the
pex_binary
If you're on Pants 2.7 and willing to share,
./pants peek path/to:pex_binary_tgt
would be helpful
a
2.6.0
if you set the fields 
execution_mode
 and/or 
include_tools
 on the 
pex_binary
no i did not
but it works locally
pex_binary(
    name="ml_training_env",
    entry_point="imports.py",
    dependencies=[...],
)
h
Hm okay. This is really handwavey - not sure this will make a difference. But could you please try
'unzip'
and
'venv'
?
https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode
e
I'm (we're?) a bit notebook dumb. Can you give instructions for running Sagemaker at the version you're using? Also, is this pants-jupyter-plugin 0.0.3?
a
yes
Name: pants-jupyter-plugin
Version: 0.0.3
the sagemaker env has py36, idk if that's the reason
but looks like pants-jupyter-plugin supports py36 too
Looks like using a py37 kernel in the sagemaker notebook worked...
but it doesn't seem to have overridden the packages in the environment
e
It's taking a while. I have Sagemaker spun up with a notebook running based on the default datascience image. Uploading a pants.pex to see if that works with the pants_jupyter_plugin ....
a
I found a way to make it work. Basically, a pex-loaded dependency seems to be overridden whenever it duplicates an existing dependency
i tried putting the pex path in
sys.path
at the first position, but it still doesn't get prioritized unless I remove the other path that has those packages
even tried
rm -rf
and still didn't work
e
Ok - you're way ahead of me. Still catching up on a bunch of context I don't have. Be with you shortly I hope.
a
from my experiments it looks like: 1. pants-jupyter-plugin only works with py37, not py36, even though the github readme says it works with py36? 2. there are some duplicate dependency override issues that are annoying to work with (yah, whenever you're messing with path, it's not fun 🤷); I had to do something in
sys.path
. If
pants-jupyter-plugin
could automatically change
sys.path
to put whatever is in the loaded pex at top priority, that'd be ideal
e
Ok, so I just got %pex_load working. Can you describe a series of steps for me to do using a py36 PEX that get me to a failure or unexpected result?
So @ambitious-student-81104 - on point 1 - what version of python is the notebook kernel running under? The default SageMaker Datascience image is 3.7; so when loading a PEX only the 3.7 platform-specific distributions in it will be loaded. If I switch to the default python 3.6 image and run the kernel with that, then I can %pex_load a PEX with python 3.6 platform specific distributions.
So, for example, I cannot load pants.2.8.0.dev1.pex in a Python 3.6 notebook on sagemaker - which makes sense, because pants.2.8.0.dev1.pex only has Python 3.{7,8,9} platform-specific wheels in it: Pants 2.8.x only supports Python 3.7-3.9.
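To make the kernel-vs-wheel mismatch concrete, here is a rough sketch that reads the python tag out of a wheel filename and compares it with an interpreter version. The helper names are hypothetical and the filenames in the usage note are only illustrative. The check is deliberately simplified: it ignores ABI and platform tags, so it is not how pex itself resolves distributions.

```python
import sys


def wheel_python_tags(wheel_filename):
    """Return the python tags of a wheel filename, e.g. ['cp36'] or ['py2', 'py3']."""
    stem = wheel_filename[: -len(".whl")]
    # Wheel names end with -{python tag}-{abi tag}-{platform tag}.
    python_tag = stem.split("-")[-3]
    return python_tag.split(".")


def roughly_matches(wheel_filename, version=None):
    """Rough check: does any python tag name this major/minor version?"""
    major, minor = (version or sys.version_info)[:2]
    wanted = {
        "py{}".format(major),
        "py{}{}".format(major, minor),
        "cp{}{}".format(major, minor),
    }
    return any(tag in wanted for tag in wheel_python_tags(wheel_filename))
```

For example, `roughly_matches("numpy-1.19.5-cp36-cp36m-manylinux1_x86_64.whl", (3, 7))` is `False`, while the same call with `(3, 6)` is `True`. A pure-Python wheel such as `six-1.16.0-py2.py3-none-any.whl` matches either kernel.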
On point 2: https://github.com/pantsbuild/pants-jupyter-plugin/blob/4b230cc4969bddf1c30d49edb3bf4e15fe7d941d/pants_jupyter_plugin/pex.py#L154-L166 So, you're right, pants-jupyter-plugin appends `%pex_load`'d sys.path entries right now and I can't see why adding a toggle to prepend instead would be a bad feature. Are you willing to file a feature request for that? If you'd like to take a crack at implementing, I think the only tricky bit is how to plumb that option through the
%pex_load
magic.
Maybe just
%pex_load_insert
vs
%pex_load
, but I'm completely ignorant of what's possible / normal with custom magics - maybe there is a way to support something like `%pex_load --prepend ./a.pex` and have that not look totally abnormal to someone used to using these sorts of things. This is where your experience filing an issue will help a lot.
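A sketch of what the proposed toggle could look like, using only the standard library. Everything here is an assumption for illustration: the `--prepend` flag does not exist in pants-jupyter-plugin today, and `parse_pex_load_line` and `add_sys_path_entries` are hypothetical names, not the plugin's code. IPython also ships `IPython.core.magic_arguments` for declaring flags on magics, which might be the more conventional route.

```python
import argparse
import shlex
import sys


def parse_pex_load_line(line):
    """Parse a magic line such as '--prepend ./a.pex' (hypothetical flag)."""
    parser = argparse.ArgumentParser(prog="%pex_load")
    parser.add_argument("pex_path")
    parser.add_argument("--prepend", action="store_true")
    return parser.parse_args(shlex.split(line))


def add_sys_path_entries(entries, prepend=False, path=None):
    """Add entries to path (sys.path by default), keeping their relative order."""
    path = sys.path if path is None else path
    # Skip entries that are already present so repeated loads stay idempotent.
    new = [entry for entry in entries if entry not in path]
    if prepend:
        path[:0] = new  # new entries win over existing ones for later imports
    else:
        path.extend(new)  # existing entries keep priority
    return path
```

Note that prepending only affects imports that happen afterwards: modules that were already imported stay cached in `sys.modules`, which may be part of why reordering `sys.path` alone did not seem to help earlier in the thread.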
a
thank you @enough-analyst-54434