# general
s
I'm slowly but surely converting our codebase to use Pants... And I've just hit the following issue: my
pex_binary
depends on a
python_library
and when it's packed up the pex contains
mylib/[python_file_i_need.py | __init__.py]
Which seems fine, I've seen it do this in other places in my codebase and work... BUT in the
__init__.py
other parts of
mylib
are imported so I'm getting:
ModuleNotFoundError: No module named ...
errors. Do I need to explicitly depend on the rest of the library (or at least the part pulled into __init__.py)?
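The failure described above can be reproduced outside Pants with a throwaway package. This is only a sketch of the situation, assuming the PEX ends up with `mylib/__init__.py` and `mylib/python_file_i_need.py` but not the sibling module that `__init__.py` imports. The imported name `helper` is hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import_partial_package() -> str:
    """Build a package that mimics the partial PEX contents and try to import it."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "mylib"
        pkg.mkdir()
        # __init__.py hoists a name from a sibling module that was NOT bundled.
        # `helper` is a made-up name used only for illustration.
        (pkg / "__init__.py").write_text(
            "from mylib.files_i_dont_need import helper\n"
        )
        (pkg / "python_file_i_need.py").write_text("VALUE = 1\n")

        sys.path.insert(0, root)
        try:
            # Importing the submodule runs mylib/__init__.py first.
            importlib.import_module("mylib.python_file_i_need")
            return "imported"
        except ModuleNotFoundError as exc:
            return f"ModuleNotFoundError: {exc}"
        finally:
            sys.path.remove(root)
            # Drop any leftover entries so reruns start clean.
            for name in [m for m in sys.modules
                         if m == "mylib" or m.startswith("mylib.")]:
                del sys.modules[name]
            importlib.invalidate_caches()


if __name__ == "__main__":
    print(try_import_partial_package())
```

Even though only `python_file_i_need.py` is wanted, Python executes the package's `__init__.py` before the submodule, so anything that file imports has to be present too.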
e
To tighten this up: are you saying you have the same package in multiple locations on the filesystem, not all with matching
__init__.py
contents?
s
Nope, sorry. I have other pex_binary targets with library dependencies where the __init__.py doesn't try to hoist things by importing them.
e
Do you have --inits configured for inference? See https://www.pantsbuild.org/docs/reference-python-infer
s
Ah no... offt, I can see why that's not on by default. It really blew out the size of the binary. Thanks 🙂
e
You're welcome. I'm not sure why it's not on by default, but it can't be due to size. Surely you absolutely need those deps if they're imported. The extra size is non-optional for a functioning app.
s
Ah, in my case, my PEX needs:
mylib/python_file_i_need.py
and only that file. But
mylib/files_i_dont_need.py
(which __init__.py imports) requires lots of other parts of my monorepo and other libraries... so the PEX goes from 10 MB to 40 MB, bundling in dependencies I don't need.
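A toy model of why the bundle grows: once `__init__.py` files count as dependencies, everything they import joins the transitive closure. This is a sketch, not how Pants computes anything. The graph below is invented apart from the two `mylib` file names from the thread, and `bin/app.py`, `otherlib/big_module.py`, and `3rdparty:heavy_dist` are hypothetical.

```python
from collections import deque

# Hypothetical dependency edges, for illustration only.
DEPS = {
    "bin/app.py": ["mylib/python_file_i_need.py"],
    "mylib/python_file_i_need.py": [],
    "mylib/__init__.py": ["mylib/files_i_dont_need.py"],
    "mylib/files_i_dont_need.py": ["otherlib/big_module.py", "3rdparty:heavy_dist"],
    "otherlib/big_module.py": [],
    "3rdparty:heavy_dist": [],
}


def ancestor_inits(path):
    """List the __init__.py paths in every directory above a first-party file."""
    if ":" in path:  # third-party requirement in this toy model, not a file
        return []
    parts = path.split("/")[:-1]
    return ["/".join(parts[:i]) + "/__init__.py" for i in range(1, len(parts) + 1)]


def closure(root, infer_inits):
    """Breadth-first transitive closure, optionally adding ancestor __init__.py files."""
    seen, queue = set(), deque([root])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        edges = list(DEPS.get(node, []))
        if infer_inits:
            # Only count __init__.py files that exist in the toy graph.
            edges.extend(p for p in ancestor_inits(node) if p in DEPS)
        queue.extend(edges)
    return seen


if __name__ == "__main__":
    print(sorted(closure("bin/app.py", infer_inits=False)))
    print(sorted(closure("bin/app.py", infer_inits=True)))
```

Without the `__init__.py` edges the closure is just the binary and the one file it needs. With them, the closure picks up `files_i_dont_need.py` and everything behind it.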
e
Aha. Yeah, not much we can do there. You'd need to change the code structure. Pants making a decision like "They're using a non-empty __init__.py but we can ignore that and insert an empty one or use an implicit namespace package" is problematic, since the __init__.py code could be there for relied-upon side effects during import.
w
yea, this would be good to enable by default. enough has changed, that i’m not sure that the original justification for leaving it disabled is still valid (we went back and forth a few times on it) cc @happy-kitchen-89482, @hundreds-father-404
…i think it had to do with needing to have BUILD files for empty
__init__.py
files…? but
tailor
makes that a non issue i think.
e
With it turned on you might be able to avoid your bloat problem with judicious manual dependency excludes (
dependencies=["!__init__.py"]
). That's a lazy, unvetted idea though.
h
Yeah, I think it should be on by default now, since tailor will do the right thing.
w
it might have even been before we had* file-level dependencies…
h
Definitely possible
h
that i’m not sure that the original justification for leaving it disabled is still valid
The motivation was because before we had the
overrides
mechanism, you would have to create a separate
python_library
target specifically for the
__init__.py
to avoid metadata for sibling files being set on the
__init__.py
. For example, assume only
utils.py
actually needs NumPy and it can't be inferred, so you set this
python_library(dependencies=["//:numpy"])
Now, every single file in that folder and all subdirectories will transitively depend on NumPy because the
__init__.py
picked it up and you automatically depend on ancestor
__init__.py
files. This was particularly egregious before dependency inference, when you declared everything explicitly in BUILD files. We reverted
__init__.py
being on by default for Pants 2.0 because it was hurting caching etc. too much for v1 users migrating. We don't live in that world anymore. Most people use dep inference now, and
overrides
gives you a workaround to the above problem. We should change the default. Or possibly at least try to auto-detect the behavior, i.e. only infer if there is content in the file?
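One way the "only infer if there is content in the file" idea could be sketched is below. This is purely illustrative and not Pants code. The function name `init_has_content` is hypothetical, and the definition of "content" (any Python statement, so empty and comment-only files do not count) is an assumption.

```python
import ast


def init_has_content(source: str) -> bool:
    """Return True if an __init__.py contains at least one Python statement.

    Empty files and files holding only comments or blank lines parse to an
    empty module body, so they are treated as having no content.
    """
    return bool(ast.parse(source).body)


if __name__ == "__main__":
    print(init_has_content(""))
    print(init_has_content("# just a comment\n"))
    print(init_has_content("from mylib.files_i_dont_need import helper\n"))
```

Parsing rather than checking file size means a comment-only `__init__.py` is treated the same as an empty one. Whether a docstring-only file should count as content is left open here.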
w
Thanks for the reminder!