https://pantsbuild.org/ logo
#general
Title
# general
w

wide-energy-11069

09/05/2018, 7:50 PM
django apps must be pexed up as not zip safe
Is there a flag somewhere to say so?
w

witty-crayon-22786

09/05/2018, 7:57 PM
w

wide-energy-11069

09/05/2018, 7:58 PM
thanks!
unfortuanately that doesn’t do the trick 😕
w

witty-crayon-22786

09/05/2018, 8:00 PM
this needs a repro
assuming the bisect was good, you'd want to fiddle with adding/removing an explicit
requirements.txt
file maybe...?
although it sounds like you're now looking directly at the `run`/`binary` difference, rather than at that sha
which is probably reasonable.
w

wide-energy-11069

09/05/2018, 8:04 PM
yeah, same pex, but works unzipped beforehand https://pantsbuild.slack.com/archives/C046T6T9U/p1536173778000100
w

witty-crayon-22786

09/05/2018, 8:05 PM
did you
clean-all
after changing
zip_safe
...?
(...again... repro)
w

wide-energy-11069

09/05/2018, 8:15 PM
not trivial to minimize the repro, but yeah, working on it now.
w

witty-crayon-22786

09/05/2018, 8:16 PM
if you have an internal repro, and you can confirm that reverting that commit fixes it, temporarily reverting it might be ok. but that commit represents a bugfix, so still need to get to the bottom of this
w

wide-energy-11069

09/05/2018, 8:19 PM
if you have an internal repro, and you can confirm that reverting that commit fixes it,
can confirm this already
e

enough-analyst-54434

09/05/2018, 8:33 PM
I am not in favor of a ghost revert of this form in general. A repro or explanation would be wonderful.
w

witty-crayon-22786

09/05/2018, 8:34 PM
yea, agreed. i should have been clearer that that was predicated on your agreement
e

enough-analyst-54434

09/05/2018, 8:42 PM
So @wide-energy-11069, did you compare pex contents? Different resolved .deps/ different user-files?
The only other thing I can think of is requirements.txt getting included in pex user-files and some library finding and reading that and doing something with it. Far fetched.
That actually has to be it. Nothing else is relevant to that change.
w

wide-energy-11069

09/05/2018, 8:53 PM
one moment
w

witty-crayon-22786

09/05/2018, 8:54 PM
are
files
targets actually included in pexes? i would think that that would only be
resources
(sorry, i need to stop interjecting)
e

enough-analyst-54434

09/05/2018, 8:55 PM
I was just as lazy
Yeah - just checked - should be
Confirming
e

enough-analyst-54434

09/05/2018, 8:56 PM
yeah - 3rdparty will have requirements.txt files now
For pants.pex:
Copy code
$ zipinfo dist/pants_local_binary.pex | grep 3rdparty/
-rw-r--r--  2.0 unx      658 b- defN 18-Sep-05 14:56 3rdparty/python/requirements.txt
-rw-r--r--  2.0 unx      112 b- defN 18-Sep-05 14:56 3rdparty/python/twitter/commons/requirements.txt
-rw-r--r--  2.0 unx       27 b- defN 18-Sep-05 14:56 pants-plugins/3rdparty/python/requirements.txt
w

wide-energy-11069

09/05/2018, 8:58 PM
why is that a problem?
Only in extracted_pex_bad/: 3rdparty
e

enough-analyst-54434

09/05/2018, 8:58 PM
Wild ass guess
w

witty-crayon-22786

09/05/2018, 8:58 PM
@wide-energy-11069: see if you repro with it removed.
...from the extracted pex
(or wherever)
e

enough-analyst-54434

09/05/2018, 8:58 PM
A 3rdparty library could do crazy stuff, see the file and try to resolve stuff
So Files are included relative to the buildroot and Resources take source root into account. This is as expected as part of --chroot for example for test pexes.
w

wide-energy-11069

09/05/2018, 9:07 PM
Do I have to anything special when zipping up the extracted directory?
no_3rdparty.pex
is zip with 3rdparty removed
e

enough-analyst-54434

09/05/2018, 9:07 PM
Yes, prepend a shebang line
echo "#!/usr/bin/env python" | cat - no_3rdparty.pex > no_3rdparty.pex.executable
something like that
w

wide-energy-11069

09/05/2018, 9:12 PM
adding the shebang works. thanks
e

enough-analyst-54434

09/05/2018, 9:14 PM
For completeness you might want to strace or whatever that is on OSX to see if the requirements.txt files are accessed in unexpected ways
I only suggest that because this is just simply the only logical difference - the presence of those files in the pex and the possibility some code is reading them and acting differently as a result
Oh - wait, that new pex has the files omitted
No clue
w

wide-energy-11069

09/05/2018, 9:16 PM
why does
extracted_pex_bad/3rdparty/python/requirements.txt
exist now, but not before?
e

enough-analyst-54434

09/05/2018, 9:17 PM
Did you look at the change you bisected? It adds a dependency from every PythonRequirementLibrary generated from a requirements.txt to that file.
Its a tiny change
w

wide-energy-11069

09/05/2018, 9:20 PM
indeed, that’s why i’m quite confused 🙂
e

enough-analyst-54434

09/05/2018, 9:21 PM
Heh
Yeah, so that's why requirements files are now included in the pex and that should be the only difference introduced by that change
The alternative would be to subclass PythonRequirmentLibrary in the macro so that the subclass could accept the requirments file path or contents. The subclass would then mix that into its payload fingerprint. Same effect without the file dependency and thus without including the file in pexes.
You could most easily test this with a local twitter loose plugin that overwrote the
python_requirements
build file alias.
Many things consume dependencies of course, but that bit is what is relevant to generated binary zipped pexes.
For run its a different code path...
One other side-effect of including requirements.txt in pexes now is that zip_safe is forced False: https://github.com/pantsbuild/pants/blob/97e1ef1dced5aa01243ca0a90e53f332fecb5f37/src/python/pants/backend/python/tasks/pex_build_util.py#L51-L59
Now - generally (always?) - its the opposite that would cause damage.
I've never met a pex that runs zipped but not loose, whereas the opposite happens as mentioned earlier with django, etc.
But this could be relevant.
w

wide-energy-11069

09/05/2018, 9:41 PM
Yeah fair.
e

enough-analyst-54434

09/05/2018, 9:41 PM
Have you tried mv ~/.pex ~/.pex.backup and then running the failing pex?
e

enough-analyst-54434

09/05/2018, 9:42 PM
IIRC use of the local ~/.pex cache varies between zip_safe pexes and non
Absolutely expected. 1 requirements.txt file generates N PythonRequirmentLibraries
All N will have a dep back to the 1 Files target
Well, those are 2 different requirments.txt files, so two different python_requirments targets printing out in your gist
If I read it correctly
w

wide-energy-11069

09/05/2018, 9:44 PM
right
e

enough-analyst-54434

09/05/2018, 9:45 PM
If you can run this filing thing locally the ~/.pex/ dir move aside is a quick thing to cross off the list
w

wide-energy-11069

09/05/2018, 9:46 PM
sorry, not following
I’ve never met a pex that runs zipped but not loose, whereas the opposite happens as mentioned earlier with django, etc.
That’s our observation as well. loose works, but not zipped
e

enough-analyst-54434

09/05/2018, 9:47 PM
But my point above is that zipped runs loose too
Because a Files dep now forces zip_safe False
w

wide-energy-11069

09/05/2018, 9:49 PM
can I imagine it’s equivalent to 1. unzipped the pex to
tmpdir
2.
python tmpdir/
Yes - its that, but handled by pex internally
w

wide-energy-11069

09/05/2018, 9:50 PM
would PEX_INFO reflect that as well?
Copy code
$ rg 'zip_safe' extracted_pex_bad/PEX-INFO 
...
 "zip_safe": true}
e

enough-analyst-54434

09/05/2018, 9:53 PM
It should
$ unzip -qc dist/pants_local_binary.pex PEX-INFO | jq . | grep zip_safe "zip_safe": false
And our pants binary is not zip_safe: False via BUILD
So your result is highly suspicious. I'd stop and understand why that value is wrong. Its not the value produced by Pants code on master.
w

wide-energy-11069

09/05/2018, 9:56 PM
The specific target is using the default value before, which should be zip_safe = True
Overrides that and sets False
So figure out why you're not running pants master code here
Something is different
And by master I just mean Pants code
w

wide-energy-11069

09/05/2018, 9:58 PM
The only SHAs I’ve been using is ab5aeeb34512f50fdaf4e01bf005559c3ccbe08d and its previous commit on 1.9.x branch
e

enough-analyst-54434

09/05/2018, 9:59 PM
If you can verify your sha has the same code I just pointed to then something local is intervening
zip_safe should be false because of the Files dep python_requirements now injects
No matter the BUILD setting
That was my point with the
Copy code
$ unzip -qc dist/pants_local_binary.pex PEX-INFO | jq . | grep zip_safe
 "zip_safe": false
Because Pants itself too is default zip_safe, aka True
w

wide-energy-11069

09/05/2018, 10:04 PM
Ah you are right.
Copy code
[tw-mbp-yic source (master)]$ unzip -qc dist/bad.pex  PEX-INFO | jq . | grep zip_safe
unzip:  cannot find or open dist/bad.pex, dist/bad.pex.zip or dist/bad.pex.ZIP.
[tw-mbp-yic source (master)]$ unzip -qc dist/good.pex  PEX-INFO | jq . | grep zip_safe
warning [dist/good.pex]:  25 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  "zip_safe": true
bad = ab5aeeb34512f50fdaf4e01bf005559c3ccbe08d, good = ab5aeeb34512f50fdaf4e01bf005559c3ccbe08d^
oops accidentally removed it. point is
bad.pex
has
zip_safe: False
, and
good.pex
has
zip_safe: True
e

enough-analyst-54434

09/05/2018, 10:06 PM
OK, so 2 differences. Bad has requirments.txt files included in the pex and is zip_safe=False
w

wide-energy-11069

09/05/2018, 10:06 PM
Correct
e

enough-analyst-54434

09/05/2018, 10:07 PM
This is why I suspected a library scanning for requirments.txt files and doing something with them
That was the strace debugging idea.
But you hand-created a pex w/o the requirements.txt files included and still failed
Which leaves only zip_safe: False as the difference in that pex
Can you go back to good commit and modify the binary targets BUILD to zip_safe: False
and see if that then fails?
w

wide-energy-11069

09/05/2018, 10:10 PM
Yes, that is the case. Diff:
Copy code
[tw-mbp-yic pants (good)]$ git diff
diff --git a/src/python/pants/backend/python/targets/python_binary.py b/src/python/pants/backend/python/targets/python_binary.py
index 45565fc..89c301c 100644
--- a/src/python/pants/backend/python/targets/python_binary.py
+++ b/src/python/pants/backend/python/targets/python_binary.py
@@ -77,7 +77,7 @@ class PythonBinary(PythonTarget):
     payload.add_fields({
       'entry_point': PrimitiveField(entry_point),
       'inherit_path': PrimitiveField(inherit_path),
-      'zip_safe': PrimitiveField(bool(zip_safe)),
+      'zip_safe': PrimitiveField(bool(False)),
       'always_write_cache': PrimitiveField(bool(always_write_cache)),
       'repositories': PrimitiveField(maybe_list(repositories or [])),
       'indices': PrimitiveField(maybe_list(indices or [])),
hardcoding it
False
made it fail on the good commit
iiuc,
zip_safe=False
would cause an unintended file scan?
e

enough-analyst-54434

09/05/2018, 10:12 PM
It causes nothing. I just threw out a completely wild guess that there was a library that scanned the filesystem somehow for requirements files
A zipped pex would defeat normal filesystem scans
Unless the library was smart enough to check for being inside a zip somehow.
w

wide-energy-11069

09/05/2018, 10:12 PM
defeat as in cause it to error?
e

enough-analyst-54434

09/05/2018, 10:13 PM
This is all speculation
No clue Yi! If the library trapped exceptions the scan failure could be silent
I'm just guessing what an evil library might be doing
And it may not evin be evil, this guessed behavior could be documented even
Well yay on narrowing this to zipsafe=False
Let me underline a bit, that the fact that causes a failure is insane
w

wide-energy-11069

09/05/2018, 10:15 PM
if anything, specifying
zip_safe=False
should make it more reliable right?
e

enough-analyst-54434

09/05/2018, 10:15 PM
That said, this is definitley an undesirable side-effect of including a Files dep in python_requirments
Correct
But since it causes failure you have to think devious thoughts
There is clearly a devious library involved here
if anything, specifying
zip_safe=False
should make it more reliable right? (edited)
And faster for repeated runs!
Perhaps the python_binary authors know about the libraries they are using and scanning for requirements files might be a phrase they aha on
Maybe worth a shot
w

wide-energy-11069

09/05/2018, 10:20 PM
are you saying some 3rdparty lib could be doing their own scanning on requirements.txt?
hence the zip mode, it will fail to find ours and works accidentally?
e

enough-analyst-54434

09/05/2018, 10:21 PM
Yes
That is the wild ass guess I've been referring to
Its also where strace comes in. If you note a requirements file read and then some socket operations, fishy
w

wide-energy-11069

09/05/2018, 10:22 PM
Gotcha. thanks so much John! Will circle back
e

enough-analyst-54434

09/05/2018, 10:22 PM
Excellent
w

wide-energy-11069

09/06/2018, 9:39 PM
Found a ‘fix’:
Copy code
-from twitter.infraops.ims.ndb.models.ndb import Base
+from twitter.infraops.ims.ndb.models.ndb import *
whatever the user code does with SQLalchemy seems high dependent on the import order and its side effects
e

enough-analyst-54434

09/06/2018, 9:40 PM
Aha. Ok, that I'm happy to class as not a pants problem. Great! Thanks for digging on that Yi!!
w

wide-energy-11069

09/06/2018, 9:41 PM
Sure thing. Thanks so much for the help too!
5 Views