Hi all, I’m trying to create a `pex_binary` target...
# general
f
Hi all, I’m trying to create a `pex_binary` target with only the dependencies (including first-party dependencies) of a particular target (actually, a `python_sources` target generator). Is there a way I can tell pants to do this with the `dependencies` field of the `pex_binary` target? The use case is that I have some workflow code which is uploaded to my Docker image by a separate tool (Flyte), so I only want my pex that I’m unpacking into a venv to include 3rd party and 1st party dependency code. Thanks!
w
unless i’m missing something, you should be able to accomplish that by including the `python_sources` target in the `dependencies` list of the `pex_binary`?
f
That then includes the generated `python_source` targets in the pex. So if my directory structure is:
.
├── src
│   ├── cmi_db
│   └── cmi_orchestration
└── pants.toml
My `pex_binary` should have code from 3rd party dependencies and `cmi_db` in it, but not `cmi_orchestration`. Code in `cmi_orchestration` imports code from `cmi_db` (and not vice versa), so Pants can infer the first-party dependency.
g
This sounds very close to the docker optimization described here where you split first-party/third-party code into two different stages: https://blog.pantsbuild.org/optimizing-python-docker-deploys-using-pants/
f
Yep, that’s the pattern I’m following, except if I list `src/cmi_orchestration` as the `pex_binary` dependency, then that code is also included in the `srcs`. What I want is all the dependencies of `src/cmi_orchestration`, without having to list them manually
w
ah, i see what you’re saying. only the dependencies, but not the target itself.
f
Yes, precisely 🙂
w
i’m not aware of a way to do that via target declarations. but is it possible to refactor the code to split the portion which has the dependencies you’d like to package, from the other portion which you don’t want to package?
i.e., split them into separate modules
f
They are already: `src/cmi_db` is the first-party code upon which `src/cmi_orchestration` depends, so Pants can infer the dependency, and `src/cmi_orchestration` is the code uploaded by Flyte into the container. I could list `src/cmi_db` in the `dependencies` of the `pex_binary`, but of course there are many more first-party packages than `src/cmi_db` that are dependencies of `src/cmi_orchestration` 🙂 So listing them all manually is feasible, but likely to go out of date as the code in `src/cmi_orchestration` changes. Since Pants already knows about the dependencies, I was hoping there’d be a way to specify this.
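One way to avoid maintaining that list by hand could be to derive it from what Pants already reports. The sketch below is only an illustration, not a Pants feature: it assumes you have captured the output of a transitive `pants dependencies` run for `src/cmi_orchestration` as text, one address per line, and it drops every address that lives in the excluded directory. The function name `deps_excluding_target` and the sample addresses are hypothetical.

```python
from typing import Iterable, List


def deps_excluding_target(closure: Iterable[str], excluded_dir: str) -> List[str]:
    """Return the addresses in `closure` that do not live under `excluded_dir`."""
    prefix = excluded_dir.rstrip("/")
    kept = []
    for line in closure:
        address = line.strip()
        if not address:
            continue
        # An address belongs to the excluded directory if it is the directory
        # itself, a target in it ("dir:name"), or a file beneath it ("dir/...").
        if address == prefix or address.startswith((prefix + "/", prefix + ":")):
            continue
        kept.append(address)
    return sorted(set(kept))


if __name__ == "__main__":
    # Hypothetical captured output; real addresses depend on your repository.
    sample = """\
src/cmi_orchestration/workflow.py
src/cmi_db/models.py
src/cmi_db/session.py
3rdparty/python:reqs#sqlalchemy
"""
    for address in deps_excluding_target(sample.splitlines(), "src/cmi_orchestration"):
        print(address)
```

The filtered addresses would still have to end up in the `dependencies` field of the `pex_binary` somehow, for example via a small generation step, so this only removes the manual bookkeeping rather than the explicit list.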
g
What I've done for some of our plugin workflows is create a "dummy" python file that is just used to ensure the correct extra deps get pulled in at build time.
So e.g. in `cmd/app1.py` I have:
import plugin1 # noqa: for dep inference
import plugin2 # noqa: ...
import app

app.run()
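For context, a BUILD file next to that entry file might look roughly like the following. This is a minimal sketch: the `cmd/BUILD` location and the target name `app1` are assumptions, and while `python_sources` and `pex_binary` with an `entry_point` field are standard Pants target types, the exact fields should be checked against your Pants version.

```python
# cmd/BUILD (hypothetical location and target name)

# Owns cmd/app1.py so Pants can run dependency inference on its imports.
python_sources()

# Building this target packages app1.py plus whatever its imports resolve to.
pex_binary(
    name="app1",
    entry_point="app1.py",
)
```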
f
Oh, that’s an interesting idea. Thanks @gorgeous-winter-99296 and @witty-crayon-22786 for the quick responses!
g
One thing to be aware of: for 1st-party code, you're getting a transitive but explicit closure, not the whole package. So you'll need to ensure all files you need at runtime are reachable from `import plugin1`, for example. If that is just an empty `__init__.py`, you'll likely get less than you expect.
pants dependencies --dependencies-transitive cmd/app1.py
This will list that whole closure.
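To make the "explicit closure, not the package" point concrete, here is a toy model in Python. It is not how Pants implements inference; it just walks a hand-written, hypothetical file-level import graph, where `plugin1/__init__.py` is empty and `plugin2/__init__.py` imports its implementation module.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_closure(imports: Dict[str, List[str]], root: str) -> Set[str]:
    """Collect every file reachable from `root` by following explicit imports."""
    seen: Set[str] = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(imports.get(current, []))
    return seen


# Hypothetical layout: each key is a file, each value the files it imports.
imports = {
    "cmd/app1.py": ["plugin1/__init__.py", "plugin2/__init__.py", "app.py"],
    "plugin1/__init__.py": [],  # empty __init__.py: imports nothing
    "plugin1/impl.py": [],
    "plugin2/__init__.py": ["plugin2/impl.py"],
    "plugin2/impl.py": [],
    "app.py": [],
}

closure = transitive_closure(imports, "cmd/app1.py")
print(sorted(closure))
```

In this model `plugin2/impl.py` ends up in the closure because something imports it, while `plugin1/impl.py` does not, even though it sits in the same package as a file that was pulled in.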