# general
a
If, hypothetically, I were a lunatic, and wanted to invoke pexes from my python program, is there a better way than just shelling out to them? The use case is in ML, where we want to version a model, its data transforms, and assorted dependencies together. One way to do that would be to build a pex as the dynamically invoked artifact.
h
Shelling out seems about right. You can’t import from a pex and execute in-process. Is the overhead of interpreter startup prohibitive in this case?
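For reference, a minimal shell-out might look like this. The `model.pex` path and the JSON-over-stdin convention are just assumptions for illustration, not anything pex prescribes:
```python
import json
import subprocess

# Hypothetical convention: the pex reads a JSON payload on stdin and
# writes a JSON result to stdout.
payload = {"features": [1.2, 3.4, 5.6]}

proc = subprocess.run(
    ["./model.pex"],          # a pex built with an executable entry point
    input=json.dumps(payload),
    capture_output=True,
    text=True,
    check=True,
)
result = json.loads(proc.stdout)
print(result)
```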
a
No clue yet, maybe. I'm still in "is this an insane idea" kinda space.
h
Doesn’t seem insane at first blush. How many pexes are we talking about, and how large would they be, roughly? Do they have many overlapping dependencies and/or data?
Because you can invoke the same pex with multiple different entry points, if that ends up making sense
I.e., building one big swiss-army-knife pex and invoking it via different entry points
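As a rough sketch of that swiss-army-knife shape (module names here are hypothetical; `PEX_MODULE` is pex's runtime entry-point override):
```python
import os
import subprocess

def run_entry_point(module, payload):
    """Invoke one combined pex, selecting the entry point at call time."""
    env = {**os.environ, "PEX_MODULE": module}
    return subprocess.run(
        ["./models.pex"],      # hypothetical combined artifact
        input=payload,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    ).stdout

foo_result = run_entry_point("models.predict_foo:main", '{"x": 1}')
bar_result = run_entry_point("models.predict_bar:main", '{"x": 2}')
```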
a
They may have different deps, is the issue. Context: I work for a company that runs ML models. At the moment, those run in a lambda function: one entrypoint, many models.

The training process is immature, and I want to fix that with an automated training and evaluation pipeline. At the moment, the model is stored as a pickled Python object, which has dependencies on things like scikit-learn. This is bad. We are extending the models to include some more comprehensive data transforms, because that makes it easier to version the data and the model together.

Ergo, in an ideal world, I would have a deployable artifact containing a version of a model, that I can invoke with a standard form of data, that will transform and run over the data, returning a standard result shape. I could put each model into a docker container or something, but then running one lambda, many docker images isn't gonna work (serverless Docker in Docker, anyone?); or I could set up some piece of infra that schedules the execution of dynamic lambdas, using the latest versioned docker image for each model; or I could deploy a docker cluster of some kind; or I could fetch a pex inside my lambda function, and invoke that.
The reason the deps may differ is that the latest version of predict-foo may have been trained at a later time than the latest version of predict-bar.
Which is, to be clear, an issue with the current pickling approach.
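For what it's worth, the "fetch a pex inside the lambda" option might look roughly like this. The bucket name, key layout, and JSON-in/JSON-out contract are all made-up assumptions, just to make the shape concrete:
```python
import json
import os
import subprocess

import boto3

s3 = boto3.client("s3")
BUCKET = "model-artifacts"  # hypothetical bucket

def handler(event, context):
    model, version = event["model"], event["version"]
    pex_path = f"/tmp/{model}-{version}.pex"

    # /tmp is the only writable path in Lambda; cache the pex across warm invocations.
    if not os.path.exists(pex_path):
        s3.download_file(BUCKET, f"{model}/{version}.pex", pex_path)
        os.chmod(pex_path, 0o755)

    proc = subprocess.run(
        [pex_path],
        input=json.dumps(event["data"]),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)
```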
g
It's a bit orthogonal, but you might get some mileage from PEX_PATH as well. I have used it with some success to split large pexes into smaller parts that ship at a lower cadence. (Also, re your ideal world: look at OCI artifacts! https://oras.land/docs/concepts/artifact/ . I've not had a chance to use it in production, as I'm primarily involved in RL infra & tools.)
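A sketch of what that split might look like when invoked from Python (the artifact names are made up; `PEX_PATH` is the pex runtime variable that merges additional pexes into the environment):
```python
import os
import subprocess

# Hypothetical split: a small, frequently rebuilt model pex layered on top of
# a large, slow-moving dependencies pex (e.g. scikit-learn and friends).
env = {**os.environ, "PEX_PATH": "/opt/deps/ml-deps.pex"}

subprocess.run(
    ["./predict-foo.pex"],   # small pex: model + transforms, no heavy deps
    input='{"features": [1.2, 3.4]}',
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
```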
👀 1
a
Just pulling on this thread a little more, because I happened to stumble across KitOps the day after this conversation. If you're working in an RL domain (assuming RL is reinforcement learning), what are you using atm to package and version models?
g
We don't really ship RL models (or RL, really) as a thing on their own anywhere, no more than we ship individual textures, so we do the same as we do for textures. We have our custom cloud system built in Go which manages the training lifecycle. Artifacts go to a filestore instance (soon~ GCS), and are tied to the lifetime of the training run resource. All ONNX models from this system contain the whole config + a back-reference to resource IDs in the cloud system. The ONNXs are imported like any other assets into our VCS, and then imported as game assets. That import step converts the ONNX to NNEF, applies some other optimizations, and extracts all metadata to the asset level.
And all our RL is inside the game! So whatever is in the game is the version. Switch version of asset, new version of game, etc.
👍 1
(Our models, just for reference, are on the scale of 10 MB.)
a
What game are you working on? Sheer idle curiosity
g