# general
h
I've got a bit of a harder question today. For background, we're working with NASA's GMAT R2020a software. They have exposed a Python API (although in a rather non-standard way) that we have been using fine with our previous venv/setup.py traditional way of managing Python. When transitioning over to Pants, something inside their library has broken. The C++ calls that work fine outside of Pants are now showing segmentation faults. I know I've included everything with the `files` target so that the software distribution is available in the sandbox. I've checked this by running the `dependencies` goal and by running a `diff -r` on what was passed to the sandbox and what I have on disk, and only saw differences in `__pycache__` folders. So, I think my question is more just looking for general guidance on what might be different about the two execution environments. It's a prebuilt distribution, so it should have everything it needs, but it does rely on some system libraries (e.g. `libpng12.so`). Is it possible that these aren't discoverable when running in the sandbox environment?
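A quick way to probe that last hypothesis is sketched below; it assumes `ctypes` is a reasonable stand-in for however GMAT actually loads its native dependencies, and the library name is the only GMAT-specific detail. Run it once in the plain venv and once under Pants and compare the output.

```python
# Hypothetical check: can the dynamic loader resolve libpng12 from this process?
import ctypes
import ctypes.util

resolved = ctypes.util.find_library("png12")
print("find_library('png12') ->", resolved)

try:
    ctypes.CDLL("libpng12.so")
    print("dlopen('libpng12.so') succeeded")
except OSError as err:
    print("dlopen('libpng12.so') failed:", err)
```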
h
Hey Nathanael, I recommend taking Pants out of the equation by running directly in the sandbox with the `--no-process-cleanup` flag: https://www.pantsbuild.org/docs/troubleshooting#debug-tip-inspect-the-sandbox-with---no-process-cleanup. As mentioned there, there is a `__run.sh` script that emulates what Pants is doing under the hood, including stripping env vars.

> for general guidance on what might be different about the two execution environments

The most obvious way Pants is different is that it tries to be hermetic when running things, such as stripping env vars. It might be helpful to compare something like the output of `env` in bash to the `__run.sh` script. Is the segfault deterministic?
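A minimal sketch of that comparison from the Python side, assuming you can add a few lines to the failing module before the GMAT calls: dump what the process actually sees, once under the old venv setup and once under Pants, and diff the two outputs.

```python
# Dump what this process actually sees; run under both setups and diff the output.
import os
import sys

print("interpreter:", sys.executable)
print("sys.path entries:")
for entry in sys.path:
    print("   ", entry)
print("environment variables:")
for key in sorted(os.environ):
    print(f"   {key}={os.environ[key]}")
```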
e
Depending on what version of Pants you're running, you can invoke as `./pants --no-process-cleanup ...` and you'll see lines in Pants output like:

```
09:37:50.55 [INFO] Preserving local process execution dir /tmp/process-executionD1Nekz for "[some description of the action ...]"
```

You can then cd into the sandbox, here `/tmp/process-executionD1Nekz`, and use the `__run.sh` script to emulate how Pants runs the process. In general you should find the issue is missing files - as you were getting towards - or missing env vars.
👍 1
coke 1
h
Cool, yeah I had done `--no-process-cleanup` but wasn't familiar with `__run.sh`.
e
There is a bug with `__run.sh` in its emulation of how the Rust core engine actually invokes subprocesses: the script only shows environment variables set / passed through in the positive sense - it does not actively unset all others. To simulate failure you'd need to replace the `export ...` line at the top with `env -i ...` I think.
That wouldn't work of course 🤦 You'd need to run `__run.sh` that way: `env -i ... ./__run.sh`
h
Hmm, I just get the following text output to the terminal that doesn't make sense to me:

```
/usr/bin/python3.8
7ec9e5c95ffcb4f4bbc26579e9c026e4f342da8ef17cfe49d5237bc1361d7335
```
e
Looks like you're in the wrong process execution sandbox. That looks like interpreter identification output: a path and a hash.
h
Got it. Nothing looks out of the ordinary there. I see processes for:
• Searching for `bash` on `PATH` and testing it
• Searching for `python` and `python3` and testing them
• Finding an interpreter for `CPython`
• Determining imports for my script
and that's it. Is there more I should be seeing when running a `pex_binary`, or is that it?
h
Do you know at what stage the segfault is happening? For example, when building the PEX to run, or when actually running it? Related, what goal are you running?
h
It's when actually running it. I'm using `run` to execute my module. There's a little bit of path stuff before this happens, but here's the snippet that causes things to break:

```python
sys.path.insert(1, str(_GMAT_BIN))
import gmatpy as gmat

gmat.Setup(str(_GMAT_STARTUP))
script = path_util.get_package_root() / 'astranis/utils/hifi_propagator.script'
print('CHECKING SCRIPT')
# SEGFAULT happens at this call
print(gmat.LoadScript(str(script)))
print('LOADED')
```
The ugliness is mostly thanks to NASA.
So the first call we make out to `gmat.Setup` works happily and then trying to load a script fails. I don't want to get too far into the weeds on GMAT-specific debugging and burden y'all with that.
h
Okay. So then `--no-process-cleanup` was a red herring, because the `run` goal runs interactively in your repository, rather than in a temporary directory. So you won't ever see the process to inspect. Instead, you can use `./pants run --no-cleanup path/to/file.py`: https://www.pantsbuild.org/docs/reference-run#section-cleanup. The PEX will be saved to the `.pants.d` folder iirc, like `.pants.d/tmplpd86t9k/`.
One thing you could try is `execution_mode='venv'` on the `pex_binary` target. See https://www.pantsbuild.org/docs/reference-pex_binary#codeexecution_modecode
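A sketch of what that could look like in the BUILD file defining the binary; the target name and entry point below are placeholders, and only `execution_mode` is the relevant bit.

```python
# Hypothetical pex_binary target; name and entry_point are illustrative.
pex_binary(
    name="hifi_propagator",
    entry_point="hifi_propagator.py",
    # Materialize the PEX contents into a real venv at first run instead of
    # executing from the zipped layout; this can matter for native extensions.
    execution_mode="venv",
)
```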
e
On the execution_mode bit, I'm pretty sure that only affects `./pants package` and not `./pants run` - we do torturous things in `./pants run` IIRC. So that leads to another test: @high-yak-85899 does `./pants package ...` and then running the PEX produced in `dist/...` work? If so, that isolates it to our `run` chicanery.
h
Well, the pex packages, but because I have to include all the gmat source code as `files`, they aren't bundled with the `pex`, and that causes other file discovery issues when executing the built `pex`.
But we do have a little bit of progress. Previously, I was including my third-party directory with the `files` target and then finding it within the sandbox. When I point to it with an absolute path where it actually lives on disk, things run happily without errors.
So, I might be able to get around this for now by throwing it in something like `/home/<user>/tools` and pointing to it that way with an absolute path.
So it seems like maybe I'm not getting something included when I pull everything over with the `files` target, which would be surprising.
e
Files targets not being included in PEXes still seems like a bug to me; we should support that. Have you tried using `resources` instead?
h
Yeah, I'm kind of confused how I would package anything that didn't rely on Python source files (or similarly generated files). Hadn't tried `resources`. The above strategy works whether it's packaged first and run, or just run directly with the `run` goal.
e
The files / resources distinction is only that resources have the enclosing source root stripped. So for a Python file at `src/python/package/module.py`, as a resource, it's materialized in sandboxes and PEXes at `package/module.py` (`src/python` being the source root here). As a file, it's materialized as-is, i.e. at `src/python/package/module.py`.
👍 1
> The above strategy works whether it's packaged first and run or just run directly with the run goal.

@high-yak-85899 does that mean switching to `resources` solved your issue?
h
No, sorry, that was ambiguous. I meant that, when I referenced the files where they live on disk, I could `run` it or `package` it and execute the pex. Swapping between files and resources doesn't seem to change things.
e
So, paying more attention now: you integrate this code by vendoring one of those SourceForge tarballs (exploded?) into your source tree?
h
Yes, but it's not actually checked into our repo. It just lives alongside it as part of a bootstrap process. So it is an acceptable solution for us to move it somewhere equally discoverable on all machines (we use `~/tools` similarly for some other purposes) and not attempt to package it up.
I'm mostly just curious at this point whether somehow `files` isn't grabbing everything, even though it seems like it is.
e
Generally Pants tries hard not to support any files outside the repo root, so it seems likely to me that's at the root of this, and instead of failing loudly we fail silently. But that's a broad brush.
This thread may be relevant. Different library, but similar in distribution style (well, not quite: you must build it, which installs `.so`s and generates a Python distribution): https://pantsbuild.slack.com/archives/C046T6T9U/p1641913522157500
So GMAT seems mainly java? I only find a small number of python files in the main tgz from SourceForge:
```
$ find . -name "*.py"
./userfunctions/python/AttitudeTypes.py
./userfunctions/python/SimpleSockets.py
./userfunctions/python/AttitudeInterface.py
./userfunctions/python/StringFunctions.py
./userfunctions/python/socket-test-drivers/AttitudeTypes.py
./userfunctions/python/socket-test-drivers/SimpleSockets.py
./userfunctions/python/socket-test-drivers/gmat-sync-mquat.py
./userfunctions/python/socket-test-drivers/AttitudeInterface.py
./userfunctions/python/socket-test-drivers/Test-mjd.py
./userfunctions/python/socket-test-drivers/gmat-sync-mjd.py
./userfunctions/python/socket-test-drivers/Cosmos180-mjd.py
./userfunctions/python/MathFunctions.py
./userfunctions/python/ArrayFunctions.py
./bin/gmatpy/gmat_py.py
./bin/gmatpy/navigation_py.py
./bin/gmatpy/__init__.py
./bin/gmatpy/station_py.py
./api/Ex_R2020a_CompleteForceModel.py
./api/Ex_R2020a_RangeMeasurement.py
./api/BuildApiStartupFile.py
./api/Ex_R2020a_FindTheMoon.py
./api/Ex_R2020a_BasicFM.py
./api/Ex_R2020a_BasicForceModel.py
./api/load_gmat.py
./api/Ex_R2020a_PropagationLoop.py
./api/Ex_R2020a_PropagationStep.py
./utilities/python/GMATDataFileManager.py
./utilities/python/ochReader.py
./utilities/python/missionInterface.py
./utilities/python/testDriver.py
./utilities/python/segment.py
./utilities/python/ochWriter.py
```
h
Yes, the Python is mostly just calling out to C++ or whatever other languages. The primary API entrypoint we are using is seen there in `bin/gmatpy/gmat_py.py`.
Well, there has to be support to some extent for things outside of a repo. For instance, the Python interpreter used isn't, by default, packaged hermetically with what is distributed. So there's some precedent for expecting certain things to be available outside of the repo or what is packaged up.
e
That much is true.
Ok, I looked at this in more detail. Assuming your code does something like `from gmat_py import gmat_py`, then this should work:

```
Relevant repo subtree:
---
3rdparty/GMAT/R2020a/bin
    BUILD
    gmat_py/__init__.py
    gmat_py/gmat_py.py
    gmat_py/_gmat_py.so

3rdparty/GMAT/R2020a/BUILD:
---
resources(
    name="so",
    sources="**/*.so",
)

python_sources(
    sources="**/*.py",
    dependencies=[
        ":so",
    ]
)

pants.toml:
---
[source]
root_patterns.add = ["/3rdparty/GMAT/R2020a/bin"]
```
That should allow the vendored Python code + .so to be included in your PEX, unit tests that use this code to work, etc.
Minus the way I've done the targets and pants.toml config - is this roughly what you were trying?
I'm guessing maybe not, given your description of the files living outside the repo as part of bootstrap. That's good, since a checked-in .so like I've shown only works if your fleet of machines is uniform, which is unlikely when you mix developers in. So, in that case, it seems like gmat_py needs to be treated like a JVM-style "provided" dependency; i.e. expect it to be pre-installed on the system and don't try to find it or package it. So I think this was all just me catching up to you. PEXes support `PEX_EXTRA_SYS_PATH=dir1:dir2` for this sort of thing. Is that useful?
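A sketch of how that might be used here, assuming the built PEX is launched with the GMAT bin directory supplied externally (the paths and PEX name below are illustrative):

```python
# Hypothetical launch of the packaged PEX:
#   PEX_EXTRA_SYS_PATH=/home/<user>/tools/GMAT/R2020a/bin ./dist/hifi_propagator.pex
# The PEX runtime adds those directories to sys.path, so the module can import
# gmatpy without a hard-coded sys.path.insert(...).
import sys

print("GMAT dir on sys.path:", any("GMAT" in entry for entry in sys.path))
import gmatpy as gmat  # resolved via PEX_EXTRA_SYS_PATH rather than a hard-coded path
```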
There's also `PEX_INHERIT_PATH=prefer|fallback` if the provided deps are expected to be found in the site-packages of the system interpreter running the PEX. See `pex --help-variables` for docs on this, or see: https://pex.readthedocs.io/en/v2.1.63/api/vars.html
h
Yeah that's somewhat similar to what I had. I'll try to work that in.
Couldn't quite get it to work. There are `.so` files (and `.so.R2020a` files) elsewhere in that directory that need to be loaded in. What I ended up with was:

```
Relevant repo subtree:
---
3rdparty/GMAT
    BUILD
3rdparty/GMAT/R2020a/bin
    BUILD
    gmat_py/__init__.py
    gmat_py/gmat_py.py
    gmat_py/_gmat_py.so

3rdparty/GMAT/BUILD:
---
resources(
    name = "so",
    sources = [
        "**/*.so.*",
        "**/*.so",
    ],
)

3rdparty/GMAT/R2020a/BUILD:
---
python_sources(
    sources=["**/*.py"],
    dependencies=[
        "//3rdpart/GMAT:so",
    ]
)

pants.toml:
---
[source]
root_patterns.add = ["/3rdparty/GMAT/R2020a/bin"]
```
Then, I'm able to `from gmatpy import gmat_py as gmat` just fine, but when I get to the `LoadScript` call, I'm back to a segfault. So, for now, I think I'll stick with storing it on the system for the few cases where this is needed and move on to other migration issues we've had.
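A sketch of that "keep it on the system" approach, assuming an environment-variable override with a conventional fallback path; both the variable name and the default location below are made up for illustration, not GMAT or Pants conventions.

```python
# Hypothetical resolution of a GMAT install that lives outside the repo.
import os
import sys
from pathlib import Path

_GMAT_HOME = Path(os.environ.get("GMAT_HOME", str(Path.home() / "tools" / "GMAT" / "R2020a")))
_GMAT_BIN = _GMAT_HOME / "bin"

if not _GMAT_BIN.is_dir():
    raise RuntimeError(f"GMAT not found at {_GMAT_BIN}; run the bootstrap or set GMAT_HOME")

# An absolute path on disk behaves the same inside and outside the Pants sandbox.
sys.path.insert(1, str(_GMAT_BIN))
import gmatpy as gmat  # noqa: E402
```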
Definitely appreciate all the support effort, though! Much more help than I was expecting with this shot in the dark.
❤️ 1