Hi all, when I build a python binary, it need lots...
# general
b
Hi all, when I build a python binary, it need lots of dependencies and use
/tmp
to do the job, but I don’t have enough space for
/tmp
, which option can point it to another directory ?
h
Hi, sorry about this issue! What is the full error message? Can you please run with
./pants --print-exception-stacktrace
?
Is this also running
./pants binary
on the BUILD file from yesterday with
tensorflow
, or something else?
b
yes, same case, I just mounted /tmp to another place and it has enough space now, not able to umount it now … but the error happened when i build the binary, it says no space left on the disk
h
Okay, I’m glad you were able to get a workaround. Let us know if that stops working + apologies that the workaround is hacky. We’ve been thinking about ways to address this in the channel #performance. The interesting edge case is building tensorflow, which is enormous. I don’t think anyone has encountered this before with smaller dependencies like
pytest
and
boto3
. But, it’s very important to us Pants developers that we have a strong story around data science workflows, such as being able to use tensorflow, so we’ll need to figure this out.
Hey @bulky-evening-62934, were you able to get everything working? Any other issues you’re running into or questions you have with trying out Pants?
b
Thanks for checking in, I got a bit confused by the way that pants looks for interpreter, I set
Copy code
interpreter_constraints = "CPython>=3.7"
interpreter_search_paths = "<PYENV>"
in
pants.toml
, But it still tries to create a virtual env by py36 (I configure the PYENV_ROOT and there is 3.7.5 installed)
h
Ah, so what’s going on here is a feature for list options that allows you to add to the list if you just give a single string. This will append
CPython>=3.7
to the default, rather than overriding the value. you’ll want to do this to instead override the value:
Copy code
[python-setup]
interpreter_constraints = ["CPython>=3.7"]
interpreter_search_paths = ["<PYENV>"]
You can run
./pants help-advanced python-setup
for more information on the options, including the expected types. -- I’m sorry for this confusion. I think this is probably a misfeature. Imo, it would be better if we made you be explicit in adding instead of overriding by doing this:
Copy code
interpreter_constraints.add = ["CPython>=3.7"]
Opened https://github.com/pantsbuild/pants/issues/9509. Feel free to weigh in if you think it would indeed have made things less confusing 🙂 Also, did my suggestion actually end up fixing it or do you still have the problem?
b
That makes lots of sense, I change to what you suggested but still not work, it tries to create the virtual env from a system default python not the one in pyenv, Basically what I did is cloning the repo to a remote server and do
./pants binary XXXX
, but it stucks at creating the venv to make pants work
h
it stucks at creating the venv to make pants work
Okay, this sounds like the
./pants
script is the issue, rather than this option. So, there are possibly multiple Python interpreters that Pants will end up using. The first is the Python interpreter that Pants uses to run itself - this must be Python 3.6+ and will be determined by the
./pants
script. You can override it by setting the env var
PYTHON
, e.g.
export PYTHON=python3.7
The other is what Pants uses for subprocesses when it does things like runs Pytest. This doesn’t need to be the same interpreter that you use to run Pants itself; it might be, but it doesn’t need to be. For example, Pants supports if you have Python 2 code. Even though Pants needs Python 3.6 to run, it can still run code that is Python 2. This is like how Black and MyPy need Python 3 to run but work on Python 2 code. Those two
python-setup
options you’re setting only impact this interpreter selection; they don’t impact what is used by
./pants
. Here, try setting
PYTHON=python3.7
(or the more precise path if you have it, e.g. the output of
which python3.7
)
but it stucks at creating the venv to make pants work
Hm, do you have an error message for this, by chance?
b
oh, my error message is straightforward, it needs
python.h
, so I fix it by installing
python3-dev
thanks for your explanation, I will look at the pants script to understand how it uses python 3.6 To make sure that I understand everything correctly, the
interpreter_constraints
is not applied to the python version used by pants itself ?
❤️ 1
BTW, I got the full exception stacktrace for the “not enough space left” in /tmp ,
i don’t like to solution to mount /tmp to another place, is there an option/env var to make it use another place instead of
/tmp
?
h
Yes, that is correct that interpreter_constraints do not apply to the python version used to run Pants itself, only what is used for subprocesses
@enough-analyst-54434 it looks like the out of space error is happening on the pex side of things. Is there a way to use a different folder than /tmp for resolution? This feels relevant to your PEX_ROOT changes
e
Reading above I see
/tmp/process-executionIeivdN/...
, IOW pex is running here in v2 locally and the v2 local sandbox system works in
/tmp
by default, so that needs to be moved in @bulky-evening-62934's case. Assuming that's right, I'm looking to see if we have a knob for this....
👍 2
h
Ah, interesting. Stu thought yesterday that there is no knob. I’m wondering if there is an alternative solution to avoid using so much memory in the first place? Stu pointed out that your caching changes wouldn’t impact the total amount of space used. Danny pitched his symlinks PR. Outside of that, I’m not sure there’s any way around the fact that some dists like tensorflow are enormous
e
There ~is a knob: https://github.com/pantsbuild/pants/blob/a7eb868f19d8ad9fc5cf226bf903f03f49c1b22f/src/rust/engine/src/context.rs#L162-L167 https://doc.rust-lang.org/std/env/fn.temp_dir.html So the TMPDIR env var. We probably need to plumb this anyhow due to env hermeticity and for discoverability.
By "so much memory" I assume you mean so much disk. But, yeah - no, I think there is no real way around this.
👍 1
... except to add the knob.
👍 1
Basically, I see no need to get fancy (symlinks) or try to count pennies (cache less). If a user can download tensorflow at all themselves and work with it, this implies they have a machine with enough disk. The only real problem is they may not have it partitioned in a way that works with how we use the disk (/tmp). So if we give the knob I think we solve almost all reasonable cases.
👍 1
h
By “so much memory” I assume you mean so much disk.
Ah, yes, I did mean this. Okay, sounds good. I’ll open an issue and ask Greg if it’d be easy for him to knock out.
h
actually according to the rust docs @enough-analyst-54434 linked, the library already pays attention to an env var called
TMPDIR
and just testing this out locally, it looks like if I set that var and run pants, I see tmp files written to whatever dir I set it to
e
OK - I was naively surprised we let TMPDIR leak, but this is not an EPR its the runner behind an EPR. Makes sense.
h
although it looks like pip doesn't pay attention to this var - I"m seeing
pip-
temp files show up in /tmp still
so, we could just tell people "set `TMPDIR`" and not actually make any changes to pants
e
Pip supports
PIP_...
https://pip.pypa.io/en/stable/user_guide/#environment-variables but Pex runs Pip in isolated mode, so no use I think:
Copy code
--isolated                  Run pip in an isolated mode, ignoring environment variables and user configuration.
h
or just make pex use
TMPDIR
and then tell people to use it without making any changes to pex proper
looks like
wheel_dir
is the option pip cares about?
e
Pex already sets this to a subdir of the PEX_ROOT which Pants configures.
h
we set that to
./pex_root
, but I"m not sure what dir that option takes to be the
.
e
So, no, but there are tmp dirs Pip uses dfor its own caches when building wheels if it needs to. I think its only those that Pex does not currently control. All other output locations Pex controls.
h
oh just whatever the EPR root is I guess
e
Yes.
h
so that should already be handled by the
TMPDIR
var
so it's just htese Pip tempdirs that we should expect to see being still written to /tmp
e
Yes. See above though re Pip using still uncontrolled by Pex tmp dirs for building sdists and wheels.
so it's just htese Pip tempdirs that we should expect to see being still written to /tmp
Yes.
h
so, it sounds like possibly a change to pex is necessary to tell it to tell pip not to use /tmp
but other than that the person building tensorflow should be able to set
TMPDIR
today
I'm not sure how much of the disk space used by building tensorflow comes from these pip files
h
@bulky-evening-62934 could you please test if setting
TMPDIR
fixes things when you have a spare moment?
👍 1
e
Fwict tensorwflow is unbuildable by Pip - I'm pretty sure only pre-built wheels are used / downloaded. There may be other dependencies that do get built though.
Yeah - there is no tensorflow sdist, at least for modern tensorflow, so it's not built by Pip.
h
isn't most of the disk space used by tensorflow coming from frozen graph files, which should be opaque to the build system? or is that only a runtime consideration
I've only used tensorflow a little bit and I'm not sure exactly how those frozen graph files are used
anyway, assuming "Just set `TMPDIR`" works, I'm not sure where the right place to document it is
is it worth it to go to the trouble of creating a pants option that would just set this var, just so we have a place to document it?
h
Maybe add to a troubleshooting section of the docs? Another issue people seem to encounter is needing to configure
ulimit
. That would be good to include in the same section
b
@hundreds-father-404 thanks for all those informations, I just tested it, setting TMPDIR makes some difference but still failed because I see
--cache-dir /tmp/tmprqaeuwas
the reason of having that small
/tmp
is because I’m using s3, and its disk is mounted on
/mnt
👍 2
h
@enough-analyst-54434 do we know where that
--cache-dir
is getting set?
e
Yes, here: https://github.com/pantsbuild/pants/blob/1b180b09920437db14bb95afe9c0a02391614df5/src/python/pants/backend/python/rules/download_pex_bin.py#L85-L97 So its currently unset which causes Pex to use a tmpdir. We'd need to change DownloadedPexBin to stop using
--disable-cache
and use a local (to EPR sandbox), throwaway PEX_ROOT like its base class HermeticPex already does: https://github.com/pantsbuild/pants/blob/55d2d02b8e30a97c8e62f6ddf9b9f57337647354/src/python/pants/backend/python/rules/hermetic_pex.py#L52-L60 Pex 2.1.7, which is in the backtrace here, has sanitized PEX_ROOT handling (https://github.com/pantsbuild/pex/pull/929) so this should work out. I'll send up a PR...
💯 2
b
Hi, thanks for your fix, I tested
1.27.0.dev3
,
--cache-dir
now could use TMPDIR, but it still failed because
resolved_dists
is still using
/tmp
,
Copy code
/home/ubuntu/.pyenv/versions/3.7.5/bin/python3.7 /mnt/tmp/process-executionhdCFxz/pex_root/pip.pex/49a647f5ed9e81dc8f85560600c53fbb4dfc60f0 --disable-pip-version-check --isolated --no-python-version-warning -q --cache-dir /mnt/tmp/process-executionhdCFxz/pex_root install --no-deps --no-index --only-binary :all: --target /mnt/tmp/process-executionhdCFxz/pex_root/installed_wheels/35830898f97169961ccf9a54a496b28734a1142b/tensorflow-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl.ada6572daa8649f29e04fa85ea710176 --no-compile /tmp/tmp3_fu_pdh/resolved_dists/cp37-cp37m/tensorflow-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl
(see the end of the command above)