# general
m
Hey, I'm new to Pants and exploring it. I have some novice doubts to clear. I have a multi-language monorepo with a few Python modules in it. By default, they all share one common requirements file. However, in one module (let's say ModuleX) I want to use a different requirements file, so I'm thinking of using resolves. However, ModuleX also uses another module that is on the default requirements file. When doing `pants run` on ModuleX, I'm seeing that it uses the default resolve. And if I specify the resolve in that module's BUILD file, it doesn't pick up the other module. I wanted to know about the best practices around this, and how do I tell Pants that ModuleX depends on ModuleY, which is at the same level? Folder structure is like:
src/python/moduleX
src/python/moduleX/requirements
src/python/requirements
src/python/moduleY
e
Could you give us some more information about the dependencies between your modules? If ModuleX is using a different requirements file, but ModuleX is also importing another module, then what is really happening is:
• An environment containing a Python interpreter exists
• ModuleX has dependencies on `ModuleX/requirements`, which implies it can only be used in an environment that contains these dependencies
• Running `python ModuleX` assumes it is running in an environment that has these dependencies (else weird failures happen when ModuleX tries to import them)
  ◦ Once it's set up, Pants will take care of building this environment for you
• If/when ModuleX tries to import ModuleY, it will do so within the current environment (because that is where the Python interpreter exists)
From the description you've given us, it seems like what you really have is a bunch of independent modules (great) that all make up one "application/program/package/whatever". Requirements should be specified at the level of the application, because all the requirements must exist in the environment for the application as a whole to function. Please let us know if this matches your situation or if there are more details to consider.
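To make that concrete, here is a minimal sketch of what the BUILD files could look like; the target names and the explicit dependency are illustrative assumptions, not from the thread (Pants usually infers the dependency from the `import` statements):

```
# src/python/moduleY/BUILD -- hypothetical
python_sources(name="moduleY")

# src/python/moduleX/BUILD -- hypothetical
python_sources(
    name="moduleX",
    # Normally unnecessary: dependency inference finds this from imports,
    # but it can be declared explicitly like so:
    dependencies=["src/python/moduleY"],
)
```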
m
Yes it matches my situation. Normally, every module is independent and meant to be used separately only. However, in some cases one module can import another. I'm not looking into packaging everything together in a large application, but packaging each small module.
As these modules will be pretty light in themselves, I do not want to create different repos. Also, most of the modules will share the exact same dependencies, hence I do not want to copy-paste them into each module either.
e
Are you packaging each small module as a Python distribution (e.g. to upload to PyPI), building and publishing Docker containers, or looking to create "standalone" executables such as PEX binaries?
m
Docker containers
e
great!
My advice then (and feel free to complain about/change anything that doesn't feel right for your situation):
• Use one "global"/"universal" set of requirements
  ◦ This will ensure that all modules use the same version of any packages that are in use (which is nice for debugging purposes)
  ◦ It makes it easy to export a virtual environment to inspect/use for anything that Pants can't do for you
  ◦ It's also just the simplest
• Pants' dependency inference is really smart. You can build a `pex_binary` by specifying a (Python) entry point file, and the PEX binary will include every source file that is imported (transitively) by that entry point, every 3rd-party package (requirement) that is used by any of that source, and nothing extra.
• Then you can copy this PEX binary into a Docker image (there are pretty good docs and/or blog posts for this)
  ◦ The Dockerfile is simple (basically one COPY instruction, and no need to worry about setting up the requirements yourself)
  ◦ The resulting image is fairly well reduced to a minimal size
This is what I've been doing and it works pretty well.
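A rough sketch of that setup; the entry point, target names, base image, and output path are illustrative assumptions:

```
# src/python/moduleX/BUILD -- hypothetical
pex_binary(
    name="moduleX-bin",
    entry_point="main.py",  # Pants pulls in transitive sources + requirements
)

# The docker_image target can depend on the pex_binary above,
# making the built PEX available to the Docker build context.
docker_image(name="moduleX-image")
```

```
# src/python/moduleX/Dockerfile -- hypothetical
FROM python:3.11-slim
# Pants exposes packaged dependencies in the build context under their
# dotted output path (assumption based on typical Pants layouts):
COPY src.python.moduleX/moduleX-bin.pex /bin/app
ENTRYPOINT ["/bin/app"]
```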
m
Ah, got your point. That's a nice approach. However, it's a very new project and the codebase is changing very frequently, and I do not want to maintain multiple container and binary revisions in artifact registries. How I was planning to deploy is: build a container and pip install from all the needed requirements files. As we're on K8s, inject the Python script either using an init container or some volume mount. This way fewer builds happen and need managing, as new images only get created on dependency changes, not code changes.
f
You can also put one of these modules into multiple resolves.
resolve=parametrize("resolve-foo", "resolve-bar")
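In context, that field sits on the shared module's target; something like this (target name assumed for illustration):

```
# src/python/moduleY/BUILD -- hypothetical
python_sources(
    name="moduleY",
    # Creates one copy of this target per resolve, so targets in either
    # resolve can depend on this shared code
    resolve=parametrize("resolve-foo", "resolve-bar"),
)
```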
m
Oh! We can put a module in multiple resolves too! Hadn't realised.
f
The `parametrize` mechanism creates multiple targets from a single target. When applied to the `resolve` field, you will get the same original target in multiple resolves.
Pants knows how to select the correct resolve for the `parametrize`d target if some other target is in a single resolve.
Here is a question though. Do the requirements conflict between these resolves?
You could just use a single resolve with a superset of all of the modules' requirements. Pants will only include the dependencies actually used when building a `pex_binary` or other target. I.e., Pants will subset the resolve's requirements to only those actually used.
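For reference, a single-resolve setup along these lines can be configured in `pants.toml`; the resolve name and lockfile path here are illustrative assumptions:

```
# pants.toml -- hypothetical single-resolve config
[python]
enable_resolves = true
default_resolve = "default"

[python.resolves]
default = "src/python/requirements/default.lock"
```

The superset requirements file is then registered with a `python_requirements(source="requirements.txt")` target in a BUILD file, and every module draws the subset it actually imports from that one resolve.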
m
We can assume there's no conflict for now, because in case of a conflict I don't think there's any way to handle it even if it's in a single requirements file.
Not in favour of building PEX files; will consider it if there's no other elegant way though.
f
I just used pex files as an example. The subsetting is independent of that.
So when you run tests, for example, Pants will subset them as well.
Regardless, I recommend just starting with a single resolve containing the superset of module requirements, and seeing if that works for you.
m
Ah got it now.