# general
a
Hello 🙂. I am having some issues with modules not being available when using `multiprocess`. I build separate pex binaries for dependencies and source files. I then create and combine `oci_layer`s to form an image and execute the pex binary as follows. If I install the missing libraries in the global python environment the code will run, which leads me to believe when the process is spawned it doesn't use the pex environment/dependencies. Any clues on what I might be doing wrong? Code in thread
pex_binary(
    name="deps",
    entry_point="run.py",
    complete_platforms=["//:platforms"],
    layout="packed",
    include_sources=False,
    include_tools=True,
)

oci_layer(name="deps-layer", packages=[":deps"])

# Create pex layer for python source code
pex_binary(
    name="srcs",
    entry_point="run.py",
    complete_platforms=["//:platforms"],
    layout="packed",
    include_requirements=False,
    include_tools=True,
)

oci_layer(name="srcs-layer", packages=[":srcs"])

oci_image_build(
    name="image",
    layers=[":deps-layer", ":srcs-layer"],
    base=["//:python-bookworm-slim-amd"],
    repository=env("REPOSITORY", "") + "datascience/xgboost",  # Required to publish
    tag=env("GIT_SHA", "latest"),  # Required to publish
    entrypoint="python3",
    args=[dirctory_no_slash + "/srcs.pex"],
    env=[
        "PEX_PATH=/" + dirctory_no_slash + "/srcs.pex:/" + dirctory_no_slash + "/deps.pex",
    ],
)
g
Does this repro natively? Are you using `fork` or `spawn`?
a
Using spawn
What do you mean by reproduce natively?
g
I.e., if you package the two pexes and run them directly, outside the container, with the same env path set, does it still occur?
a
Package them as a single pex?
g
No, exactly the same setup as in the container
a
We get the same issue
g
Does fork fix it? Assuming the code works at all with fork, that is.
My thought is that it might be bypassing a bunch of the PEX bootstrap logic
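One way to check that hunch is to ask a child process directly what it can import. This is only a rough sketch, not a fix: `probe` and `probe_in_child` are hypothetical helper names, and the idea is to run them from inside the pex under each start method and compare the reports.

```python
import importlib
import multiprocessing as mp
import sys


def probe(modules):
    """Report which of the given modules can be imported in this process."""
    report = {"executable": sys.executable, "importable": {}}
    for name in modules:
        try:
            importlib.import_module(name)
            report["importable"][name] = True
        except ImportError:
            report["importable"][name] = False
    return report


def probe_in_child(start_method, modules):
    """Run probe() in a child process created with the given start method."""
    ctx = mp.get_context(start_method)  # "spawn", "fork", or "forkserver"
    with ctx.Pool(processes=1) as pool:
        return pool.apply(probe, (modules,))
```

Calling `probe_in_child("spawn", ["polars", "xgboost"])` and then the same with `"fork"`, both under an `if __name__ == "__main__":` guard in the entry point, should show whether the spawned child is missing the dependencies that the parent can see.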
a
That works. I'm told we avoided fork because of Polars: when the data scientists ran it locally (not with PEX), it would just hang. But now it seems to work! https://docs.pola.rs/user-guide/misc/multiprocessing/
g
Makes sense. I recognize this as an issue but can't remember any good solutions. Maybe reducing a repro and filing on the PEX repository would be helpful if there isn't any good previous issue for it. It might be possible to alter the multiprocessing args in some way to make it work. I do think fork is a footgun here despite it working.
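For a reduced repro, it may help to make the start method an explicit, switchable choice rather than relying on the platform default. This is a small sketch using only the standard library; `pick_context` is a hypothetical helper, and whether `forkserver` behaves any better than `spawn` under PEX is untested here.

```python
import multiprocessing as mp


def pick_context(preferred):
    """Return a multiprocessing context for the first available start method."""
    available = mp.get_all_start_methods()
    for method in preferred:
        if method in available:
            return mp.get_context(method)
    raise ValueError(f"none of {list(preferred)} are available; have {available}")
```

Something like `ctx = pick_context(["forkserver", "spawn"])` followed by `ctx.Pool(...)` keeps the choice local to one pool, so the same script can be rerun with each method when writing up the issue.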
a
Thanks for your help 🙏