# general
c
Hi all 👋 Recently found out about the promising looking pants project. So far, I've come across one thing that I'm sure there is a good explanation for, but I can't help feeling it is an itch that I might need to scratch, if at all possible (without too much effort): the module_mapping of python requirements, introduced by this PR: https://github.com/pantsbuild/pants/pull/10025 The thing is, as I'm sure you know already, the mapping we input here is already available from the packages themselves, in the top_level.txt file once installed. Can't we use this information from the source, instead of having to provide it manually?
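For readers unfamiliar with top_level.txt, here is a minimal sketch of how that file can be read for an installed distribution using the standard library's importlib.metadata. The helper names parse_top_level and top_level_modules are hypothetical and are not part of Pants. Not every distribution ships a top_level.txt, so the sketch returns an empty list when it is missing.

```python
from importlib import metadata
from typing import List


def parse_top_level(text: str) -> List[str]:
    """Split the contents of a top_level.txt file into module names."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def top_level_modules(dist_name: str) -> List[str]:
    """Return the top-level modules declared by an installed distribution.

    Hypothetical helper for illustration; returns [] when the distribution
    is not installed or does not ship a top_level.txt file.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return []
    text = dist.read_text("top_level.txt")
    return parse_top_level(text) if text is not None else []
```

Note that this only works once the distribution has been installed (resolved), which is exactly the cost discussed in the reply below.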
e
We can. The issue is efficiency. So, the context where the module_mapping is used is in dependency inference. Pants is looking at a Python source file, picking out import statements, and determining from those what other files and distributions the Python source file being examined depends on. Right now, with the module_mapping, this can all be done by looking at local files. To infer the module_mapping itself, Pants would need to resolve all distributions declared anywhere in the repo up front, then use that resolve to build a module_mapping.
The one case where we could do this in a fairly unambiguously performant way is when you have a "lockfile" set up as outlined here: https://www.pantsbuild.org/docs/python-third-party-dependencies#using-a-lockfile-strongly-recommended
Without that though, we need to scan every BUILD file for potential python_requirements or python_requirement_library targets (https://www.pantsbuild.org/docs/reference-python_requirement_library) and resolve those. Performance aside, this introduces ambiguities. It may be that you have conflicting requirements that are not meant to be resolved together. To be safe, we'd then need to resolve each requirement separately or ... there is really no valid way to pick resolve sets, and picking a resolve set requires inferring the dependencies of your python source files. Here we hit a loop.
So, short answer - we could do this for lockfiles I think. Is this something you'd be interested in chipping in support for? We're happy to shepherd new contributors. Either way, it at least seems worthy of tracking with a feature request issue if you don't mind filing an issue over at https://github.com/pantsbuild/pants/issues/new.
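To make the "looking at local files only" point concrete, here is a rough sketch of import-based inference. It is not Pants's actual implementation: the function names, the shape of module_mapping (distribution name to list of modules), and the default-name rule are all assumptions made for illustration.

```python
import ast
from typing import Dict, List, Set


def imported_top_level_modules(source: str) -> Set[str]:
    """Collect the top-level module names imported by a Python source string."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports; they refer to first-party code.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


def infer_distributions(
    source: str,
    declared: List[str],
    module_mapping: Dict[str, List[str]],
) -> Set[str]:
    """Map imports to declared distributions without resolving anything.

    Assumption: a distribution with no explicit entry provides a module
    named like itself (lowercased, with '-' replaced by '_').
    """
    owners: Dict[str, str] = {}
    for dist in declared:
        default = [dist.lower().replace("-", "_")]
        for module in module_mapping.get(dist, default):
            owners[module] = dist
    return {owners[m] for m in imported_top_level_modules(source) if m in owners}
```

With declared = ["ansicolors", "requests"] and module_mapping = {"ansicolors": ["colors"]}, a file that imports colors and requests.adapters would be inferred to depend on both distributions, with no network access or resolve needed.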
h
Thanks for the detailed explainer John! Also, I'd add that you only need an explicit module_mapping when the top-level module name is not the same as the distribution name. But in the common case where those are the same, you don't need to mention that in module_mapping.
👍 1
Also I think we could provide a default module_mapping for some well-known common cases, to reduce this friction even further. E.g., pkg_resources->setuptools.
👍 2
c
Expertly explained. Thanks. I'll see if I can write up a feature request for it. I think having a db of sorts of well known commonly used packages is a good start.
h
Db of some sorts == a Python dictionary hard coded into Pants, afaict. The user's mapping would be merged into that default. @curved-television-6568 would you be interested in contributing this? It's not very much code and we can help with instructions. Probably the trickiest part is figuring out what should go into the default. Altho it doesn't need to be exhaustive, it can grow over time.
👍 1
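A minimal sketch of the idea of a hard-coded default merged with the user's mapping. The names DEFAULT_MODULE_MAPPING and merged_module_mapping are hypothetical, and the only default entry shown is the pkg_resources/setuptools case mentioned earlier in the thread; the real contents and the merge semantics would be decided in the PR.

```python
from typing import Dict, List

# Hypothetical built-in defaults: distribution name -> modules it provides.
# Only the setuptools example comes from the discussion above.
DEFAULT_MODULE_MAPPING: Dict[str, List[str]] = {
    "setuptools": ["pkg_resources"],
}


def merged_module_mapping(user_mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return the defaults overlaid with the user's entries.

    Assumption: a user entry replaces the default for that distribution.
    """
    merged = dict(DEFAULT_MODULE_MAPPING)  # copy so the defaults stay untouched
    merged.update(user_mapping)
    return merged
```

Letting the user's entry win keeps the default table safe to grow over time, since anyone with an unusual setup can still override it.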
c
Yeah, I'll look into it. I'm writing an issue on github right now 😉
h
Thanks for filing! If you'd like to contribute a patch we can help out. At least defaulting sensibly for some set of well-known cases would be pretty straightforward to do. As Eric said most of the work would be figuring out that set.
c
Yeah, I have a patch ready locally... just trying to build and see what it does before pushing. Also, I've just thrown it together, you'll certainly have feedback on details about naming and placing stuff etc..
❤️ 1
Ok, it built (but I don’t know where it went 😅 ) PR #11635 Included a module_mapping.yaml file for demonstration purposes. It contains a select set of packages that I picked up from one of my python environments with a little shell script (script included in the PR). As has been said, I expect we’ll spend the majority of time on populating that file with a sensible set of data. Also, I don’t know how this is best tested, so I haven’t even made an attempt at that yet.
h
Thanks! I'll take a look shortly.
👍 1
b
@curved-television-6568 thank you for contributing to Pants!
🥰 1