> django apps must be pexed up as not zip safe ...
# general
w
django apps must be pexed up as not zip safe
Is there a flag somewhere to say so?
w
w
thanks!
unfortuanately that doesn’t do the trick 😕
w
this needs a repro
assuming the bisect was good, you'd want to fiddle with adding/removing an explicit
requirements.txt
file maybe...?
although it sounds like you're now looking directly at the `run`/`binary` difference, rather than at that sha
which is probably reasonable.
w
yeah, same pex, but works unzipped beforehand https://pantsbuild.slack.com/archives/C046T6T9U/p1536173778000100
w
did you
clean-all
after changing
zip_safe
...?
(...again... repro)
w
not trivial to minimize the repro, but yeah, working on it now.
w
if you have an internal repro, and you can confirm that reverting that commit fixes it, temporarily reverting it might be ok. but that commit represents a bugfix, so still need to get to the bottom of this
w
if you have an internal repro, and you can confirm that reverting that commit fixes it,
can confirm this already
e
I am not in favor of a ghost revert of this form in general. A repro or explanation would be wonderful.
w
yea, agreed. i should have been clearer that that was predicated on your agreement
e
So @wide-energy-11069, did you compare pex contents? Different resolved .deps/ different user-files?
The only other thing I can think of is requirements.txt getting included in pex user-files and some library finding and reading that and doing something with it. Far fetched.
That actually has to be it. Nothing else is relevant to that change.
w
one moment
w
are
files
targets actually included in pexes? i would think that that would only be
resources
(sorry, i need to stop interjecting)
e
I was just as lazy
Yeah - just checked - should be
Confirming
e
yeah - 3rdparty will have requirements.txt files now
For pants.pex:
Copy code
$ zipinfo dist/pants_local_binary.pex | grep 3rdparty/
-rw-r--r--  2.0 unx      658 b- defN 18-Sep-05 14:56 3rdparty/python/requirements.txt
-rw-r--r--  2.0 unx      112 b- defN 18-Sep-05 14:56 3rdparty/python/twitter/commons/requirements.txt
-rw-r--r--  2.0 unx       27 b- defN 18-Sep-05 14:56 pants-plugins/3rdparty/python/requirements.txt
w
why is that a problem?
Only in extracted_pex_bad/: 3rdparty
e
Wild ass guess
w
@wide-energy-11069: see if you repro with it removed.
...from the extracted pex
(or wherever)
e
A 3rdparty library could do crazy stuff, see the file and try to resolve stuff
So Files are included relative to the buildroot and Resources take source root into account. This is as expected as part of --chroot for example for test pexes.
w
Do I have to anything special when zipping up the extracted directory?
no_3rdparty.pex
is zip with 3rdparty removed
e
Yes, prepend a shebang line
echo "#!/usr/bin/env python" | cat - no_3rdparty.pex > no_3rdparty.pex.executable
something like that
w
adding the shebang works. thanks
e
For completeness you might want to strace or whatever that is on OSX to see if the requirements.txt files are accessed in unexpected ways
I only suggest that because this is just simply the only logical difference - the presence of those files in the pex and the possibility some code is reading them and acting differently as a result
Oh - wait, that new pex has the files omitted
No clue
w
why does
extracted_pex_bad/3rdparty/python/requirements.txt
exist now, but not before?
e
Did you look at the change you bisected? It adds a dependency from every PythonRequirementLibrary generated from a requirements.txt to that file.
Its a tiny change
w
indeed, that’s why i’m quite confused 🙂
e
Heh
Yeah, so that's why requirements files are now included in the pex and that should be the only difference introduced by that change
The alternative would be to subclass PythonRequirmentLibrary in the macro so that the subclass could accept the requirments file path or contents. The subclass would then mix that into its payload fingerprint. Same effect without the file dependency and thus without including the file in pexes.
You could most easily test this with a local twitter loose plugin that overwrote the
python_requirements
build file alias.
Many things consume dependencies of course, but that bit is what is relevant to generated binary zipped pexes.
For run its a different code path...
One other side-effect of including requirements.txt in pexes now is that zip_safe is forced False: https://github.com/pantsbuild/pants/blob/97e1ef1dced5aa01243ca0a90e53f332fecb5f37/src/python/pants/backend/python/tasks/pex_build_util.py#L51-L59
Now - generally (always?) - its the opposite that would cause damage.
I've never met a pex that runs zipped but not loose, whereas the opposite happens as mentioned earlier with django, etc.
But this could be relevant.
w
Yeah fair.
e
Have you tried mv ~/.pex ~/.pex.backup and then running the failing pex?
e
IIRC use of the local ~/.pex cache varies between zip_safe pexes and non
Absolutely expected. 1 requirements.txt file generates N PythonRequirmentLibraries
All N will have a dep back to the 1 Files target
Well, those are 2 different requirments.txt files, so two different python_requirments targets printing out in your gist
If I read it correctly
w
right
e
If you can run this filing thing locally the ~/.pex/ dir move aside is a quick thing to cross off the list
w
sorry, not following
I’ve never met a pex that runs zipped but not loose, whereas the opposite happens as mentioned earlier with django, etc.
That’s our observation as well. loose works, but not zipped
e
But my point above is that zipped runs loose too
Because a Files dep now forces zip_safe False
w
can I imagine it’s equivalent to 1. unzipped the pex to
tmpdir
2.
python tmpdir/
Yes - its that, but handled by pex internally
w
would PEX_INFO reflect that as well?
Copy code
$ rg 'zip_safe' extracted_pex_bad/PEX-INFO 
...
 "zip_safe": true}
e
It should
$ unzip -qc dist/pants_local_binary.pex PEX-INFO | jq . | grep zip_safe "zip_safe": false
And our pants binary is not zip_safe: False via BUILD
So your result is highly suspicious. I'd stop and understand why that value is wrong. Its not the value produced by Pants code on master.
w
The specific target is using the default value before, which should be zip_safe = True
Overrides that and sets False
So figure out why you're not running pants master code here
Something is different
And by master I just mean Pants code
w
The only SHAs I’ve been using is ab5aeeb34512f50fdaf4e01bf005559c3ccbe08d and its previous commit on 1.9.x branch
e
If you can verify your sha has the same code I just pointed to then something local is intervening
zip_safe should be false because of the Files dep python_requirements now injects
No matter the BUILD setting
That was my point with the
Copy code
$ unzip -qc dist/pants_local_binary.pex PEX-INFO | jq . | grep zip_safe
 "zip_safe": false
Because Pants itself too is default zip_safe, aka True
w
Ah you are right.
Copy code
[tw-mbp-yic source (master)]$ unzip -qc dist/bad.pex  PEX-INFO | jq . | grep zip_safe
unzip:  cannot find or open dist/bad.pex, dist/bad.pex.zip or dist/bad.pex.ZIP.
[tw-mbp-yic source (master)]$ unzip -qc dist/good.pex  PEX-INFO | jq . | grep zip_safe
warning [dist/good.pex]:  25 extra bytes at beginning or within zipfile
  (attempting to process anyway)
  "zip_safe": true
bad = ab5aeeb34512f50fdaf4e01bf005559c3ccbe08d, good = ab5aeeb34512f50fdaf4e01bf005559c3ccbe08d^
oops accidentally removed it. point is
bad.pex
has
zip_safe: False
, and
good.pex
has
zip_safe: True
e
OK, so 2 differences. Bad has requirments.txt files included in the pex and is zip_safe=False
w
Correct
e
This is why I suspected a library scanning for requirments.txt files and doing something with them
That was the strace debugging idea.
But you hand-created a pex w/o the requirements.txt files included and still failed
Which leaves only zip_safe: False as the difference in that pex
Can you go back to good commit and modify the binary targets BUILD to zip_safe: False
and see if that then fails?
w
Yes, that is the case. Diff:
Copy code
[tw-mbp-yic pants (good)]$ git diff
diff --git a/src/python/pants/backend/python/targets/python_binary.py b/src/python/pants/backend/python/targets/python_binary.py
index 45565fc..89c301c 100644
--- a/src/python/pants/backend/python/targets/python_binary.py
+++ b/src/python/pants/backend/python/targets/python_binary.py
@@ -77,7 +77,7 @@ class PythonBinary(PythonTarget):
     payload.add_fields({
       'entry_point': PrimitiveField(entry_point),
       'inherit_path': PrimitiveField(inherit_path),
-      'zip_safe': PrimitiveField(bool(zip_safe)),
+      'zip_safe': PrimitiveField(bool(False)),
       'always_write_cache': PrimitiveField(bool(always_write_cache)),
       'repositories': PrimitiveField(maybe_list(repositories or [])),
       'indices': PrimitiveField(maybe_list(indices or [])),
hardcoding it
False
made it fail on the good commit
iiuc,
zip_safe=False
would cause an unintended file scan?
e
It causes nothing. I just threw out a completely wild guess that there was a library that scanned the filesystem somehow for requirements files
A zipped pex would defeat normal filesystem scans
Unless the library was smart enough to check for being inside a zip somehow.
w
defeat as in cause it to error?
e
This is all speculation
No clue Yi! If the library trapped exceptions the scan failure could be silent
I'm just guessing what an evil library might be doing
And it may not evin be evil, this guessed behavior could be documented even
Well yay on narrowing this to zipsafe=False
Let me underline a bit, that the fact that causes a failure is insane
w
if anything, specifying
zip_safe=False
should make it more reliable right?
e
That said, this is definitley an undesirable side-effect of including a Files dep in python_requirments
Correct
But since it causes failure you have to think devious thoughts
There is clearly a devious library involved here
if anything, specifying
zip_safe=False
should make it more reliable right? (edited)
And faster for repeated runs!
Perhaps the python_binary authors know about the libraries they are using and scanning for requirements files might be a phrase they aha on
Maybe worth a shot
w
are you saying some 3rdparty lib could be doing their own scanning on requirements.txt?
hence the zip mode, it will fail to find ours and works accidentally?
e
Yes
That is the wild ass guess I've been referring to
Its also where strace comes in. If you note a requirements file read and then some socket operations, fishy
w
Gotcha. thanks so much John! Will circle back
e
Excellent
w
Found a ‘fix’:
Copy code
-from twitter.infraops.ims.ndb.models.ndb import Base
+from twitter.infraops.ims.ndb.models.ndb import *
whatever the user code does with SQLalchemy seems high dependent on the import order and its side effects
e
Aha. Ok, that I'm happy to class as not a pants problem. Great! Thanks for digging on that Yi!!
w
Sure thing. Thanks so much for the help too!