# development
h
Hi, how do we feel about migrating `[python-repos].{repos,indexes}` to be `[python].resolves_to_{find_links,indexes}`? Specifically, the idea that you can configure it per-resolve (user resolve or tool resolve)
reminder: most users won't need to set this, and when they do, they will set it via the `__default__` key so it applies to every resolve
It seems easy to contrive a case where this could be useful. For example, generally you are fine using PyPI, including for tool lockfiles. But you have one problematic resolve where you need to force Pex/pip to use your local find_links and not look at PyPI; you don't want that config to leak into all other resolves.
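A minimal sketch of the per-resolve lookup with a `__default__` fallback, as described above. The option names follow the proposal; the helper `values_for_resolve`, the resolve name `problematic`, the wheelhouse path, and the lookup order are assumptions for illustration, not Pants's actual implementation.

```python
from typing import Dict, List

DEFAULT_KEY = "__default__"  # catch-all key that applies to every resolve
PYPI = "https://pypi.org/simple/"


def values_for_resolve(
    option: Dict[str, List[str]], resolve: str, fallback: List[str]
) -> List[str]:
    """Pick the value for `resolve`, then `__default__`, then `fallback`.

    Assumed lookup order; the real option parsing may differ.
    """
    if resolve in option:
        return option[resolve]
    if DEFAULT_KEY in option:
        return option[DEFAULT_KEY]
    return fallback


# Hypothetical config: everything uses PyPI except one problematic resolve,
# which is restricted to a local wheelhouse.
resolves_to_indexes = {DEFAULT_KEY: [PYPI], "problematic": []}
resolves_to_find_links = {"problematic": ["file:///path/to/wheelhouse"]}

for name in ("python-default", "problematic"):
    indexes = values_for_resolve(resolves_to_indexes, name, [PYPI])
    find_links = values_for_resolve(resolves_to_find_links, name, [])
    print(name, indexes, find_links)
```

With this shape, the restriction on `problematic` does not leak into any other resolve.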
I figured out last week how to implement this. Afaict, major downsides are 1) the ~one day this will take me, and 2) deprecation for users
p
I can see how I might specify an internal pypi (`[python-repos].indexes`) in some cases and even do so per-resolve. And for `[python-repos].repos` that absolutely should be per-resolve - a specialized way to use a local wheelhouse or something for special-purpose resolves. Also, consistency is a big deal. We're adding the per-resolve options for all the other python options. Having a weird thing that uses "legacy" style options is confusing for new users. You have to learn about the history to be able to distinguish when to use which kind of option.
Also - `[python-repos].repos` is a very confusing option name - changing that to `find_links` is a superb suggestion.
h
Sweet, thanks Jacob for your thoughts. I agree with all of that. I'll get started on the implementation so we can see how this looks in practice. The most controversial part is "resolves without a lockfile", e.g. `[GLOBAL].plugins`.
w
as Benjy mentioned in the other thread: maaaaybe worth considering a `resolve(..)` target if we’re getting up to 3+ settings per resolve
(very low context, “coming back from vacation” comment)
h
👍 1
e
So, if I read the response from the user who initially asked for this correctly, it was never actually needed by them.
h
Agreed, given their update a few minutes ago. So the motivation would be anticipatory for the cases Jacob mentions, along with consistency
@enough-analyst-54434, to clarify, what are the downsides of this change in your perspective? What I thought of: https://pantsbuild.slack.com/archives/C0D7TNJHL/p1660577740969439?thread_ts=1660577710.953089&cid=C0D7TNJHL
e
I'll read in a second but not breaking users is the main motivation.
👍 1
Yeah, that's it from your list.
I just generally worry about a justification house of cards. Two people can easily turn into an unmoored "we" that implies greater consensus has been reached than it really has. One person + a design doc (considered as a separate authority somehow) can likewise morph into a "we" and not necessarily carry the weight of a real representative "we". Then with the "we" established, that triggers more chiming in and you get a self-reinforcing effect that snowballs an opinion into a consensus that is not really one.
👍 1
h
I've found one concrete very awkward edge of `[python-repos]` being global: changing the values for the sake of a single resolve invalidates all lockfiles, because we now track those values in the lockfile header. That means every resolve must now be regenerated, only to use `--find-links` for one resolve! (or, turn off Pants lockfile validation)
Another issue: it's not clear to me how I can force a particular resolve to use the local --find-links rather than PyPI. I have a local copy of `ansicolors` and want to force Pants to use it, so I added `[python-repos].repos`. But the lockfile has entries for PyPI and my --find-links. I don't want to set `[python-repos].indexes = []` because then all other resolves won't resolve
(These issues are while trying to experiment with local requirements and `--path-mappings`)
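The invalidation problem described above can be sketched as follows. This is an illustration under stated assumptions, not Pants's actual validation code: `RepoConfig`, `stale_resolves`, the resolve names, and the `file:///wheels` path are all hypothetical. It assumes each lockfile header records the indexes and find-links it was generated with, and that a lockfile is stale whenever the recorded values differ from the current ones.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RepoConfig:
    """Hypothetical stand-in for the values tracked in a lockfile header."""

    indexes: Tuple[str, ...]
    find_links: Tuple[str, ...]


def stale_resolves(
    headers: Dict[str, RepoConfig], current: Dict[str, RepoConfig]
) -> List[str]:
    """Names of resolves whose recorded header no longer matches the config."""
    return sorted(name for name, recorded in headers.items() if recorded != current[name])


pypi_only = RepoConfig(indexes=("https://pypi.org/simple/",), find_links=())
with_local = RepoConfig(indexes=("https://pypi.org/simple/",), find_links=("file:///wheels",))

# Every existing lockfile was generated against PyPI only.
headers = {"user": pypi_only, "black": pypi_only, "pytest": pypi_only}

# Global option: adding find-links changes the config seen by every resolve.
global_after = {name: with_local for name in headers}
# Per-resolve option: only the `user` resolve sees the new find-links.
per_resolve_after = {**headers, "user": with_local}

print(stale_resolves(headers, global_after))       # ['black', 'pytest', 'user']
print(stale_resolves(headers, per_resolve_after))  # ['user']
```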
e
On the force bit, you just can't even when global. Pip does not respect ordering in any way between releases. Zameer noticed this in a Pex thread.
👍 1
h
On the force bit, you just can't even when global. Pip does not respect ordering in any way between releases.
Got it. So only way to force local wheel would be `[python-repos].indexes = []` it sounds like, which is a non-starter because it breaks every other resolve
e
On the global bit, your observation is true but only relevant if this is a common thing. I mean, maybe you get an org like Twitter with 3k git repos (in the past) and they willy-nilly add indexes too, but really - that is just tough.
This should be super rare.
The only way to force using a custom thing is to give it a custom version, use `+foo` when you build it.
☝️ 1
And that's exactly what the `+build id` version extensions are for.
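For reference, the `+foo` suffix is what PEP 440 calls a local version identifier: `<public version>+<local label>`, where the label consists of ASCII letters, digits, and periods. A small sketch of tagging a custom build and pinning it exactly, so the resolver cannot substitute the upstream release. The helpers `with_local_label` and `exact_pin` are hypothetical, and `1.1.8` is only an example version.

```python
import re

# PEP 440 local labels: alphanumeric segments separated by periods.
_LOCAL_LABEL = re.compile(r"[A-Za-z0-9]+(?:\.[A-Za-z0-9]+)*")


def with_local_label(public_version: str, label: str) -> str:
    """Append a local version label, e.g. '1.1.8' + 'hack' -> '1.1.8+hack'."""
    if "+" in public_version:
        raise ValueError(f"{public_version!r} already has a local label")
    if not _LOCAL_LABEL.fullmatch(label):
        raise ValueError(f"{label!r} is not a valid local version label")
    return f"{public_version}+{label}"


def exact_pin(project: str, version: str) -> str:
    """Requirement string that only the custom build can satisfy."""
    return f"{project}=={version}"


custom_version = with_local_label("1.1.8", "hack")
print(exact_pin("ansicolors", custom_version))
```

Building the local wheel with the custom version and pinning it this way avoids relying on index ordering, which pip does not honor.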
h
The only way to force using a custom thing is to give it a custom version, use +foo when you build it.
or solely use your custom repo / find-links, right? Another option seems to be PEP 440 direct references, which I confirmed indeed works with `--path-mappings`
e
Lying about a version is the 1st place to stop.
w
Even then you can have cache pollution.
Yea, as John said.
e
If the find-links is used to provide an artifact that doesn't exist - like a wheel, fine to use the version.
👍 1
If you're hacking, call the shot and use `+hack`
Stepping back, does the jvm support indexes per resolve today?
I feel like I'm whacking moles.
h
Sg, thanks for explaining that. So then this isn't a big deal
Another issue: it's not clear to me how I can force a particular resolve to use the local --find-links rather than PyPI.
This one remains very unfortunate imo -- increases the cost to using local requirements
changing the values for the sake of a single resolve invalidates all lockfiles
--
does the jvm support indexes per resolve today?
@witty-crayon-22786 we don't support custom indexes in general, right? Only Coursier
e
Coursier is the tool, not the index
I think it uses maven central by default
But you can use others.
h
found it:
repos = StrListOption(
        default=[
            "https://maven-central.storage-download.googleapis.com/maven2",
            "https://repo1.maven.org/maven2",
        ],
so, global. the idea of per-resolve config is a new idea to pantsbuild, so I'm not surprised JVM doesn't have it
e
It's a new idea to the universe
I have never heard of such a thing
h
as are "resolves" in general, as far as we know
e
Look, this makes sense purely as a gedanken thing. Published artifacts should be immutable or else chaos. If they are, the only reason to have multiple indexes is to publish new immutable things. If this is the case, then it is correct to have the same set of multiple indexes for all use cases. The only reason not to do this is if there is a perf bug. The only perf bug we know of would be self-inflicted: if you use, say, PyPI + a PyPI mirroring index instead of PyPI + supplemental.
So it has 0 to do with language / ecosystem afaict.
And global indexes always make sense.
Basically you can always permute all the things, but often only a subset makes sense. I think this is one of those cases.
h
this makes sense purely as a gedanken thing.
Pardon, what does "this" specifically mean here?
Published artifacts should be immutable or else chaos
Ack
The only reason not to do this is if there is a perf bug.
And, that it's inconvenient that you may need to only add a supplemental index/--find-links for one resolve, e.g. user code, but doing so invalidates everything else in your repo. For example, you can't use Pants's default tool lockfiles anymore and every tool lockfile must be generated. That's a big adoption cost
p
Publishing artifacts should be immutable. But, some things should have access to supplemental resources, and others should not. If building a wheel for internal distribution, anything in an internal supplemental index is fair game. If building something to go on an external index, like pypi, then that should be restricted so it can only see the publicly accessible artifacts.
👍 1
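One way to picture the public-vs-internal constraint described above is a check over the artifact URLs recorded in a lockfile. This is a sketch under assumptions: `disallowed_artifacts` is a hypothetical helper, the URLs and the `pypi.internal.example` host are made up, and it is not an existing Pants or Pex feature.

```python
from typing import Iterable, List, Set
from urllib.parse import urlparse


def disallowed_artifacts(artifact_urls: Iterable[str], allowed_hosts: Set[str]) -> List[str]:
    """Return the artifact URLs that are not served from an allowed host."""
    return [url for url in artifact_urls if urlparse(url).hostname not in allowed_hosts]


# Hypothetical artifact URLs, as a lockfile for a public distribution might record them.
locked_artifacts = [
    "https://files.pythonhosted.org/packages/example/ansicolors-1.1.8-py2.py3-none-any.whl",
    "https://pypi.internal.example/packages/secret_lib-1.0-py3-none-any.whl",
]

# Something destined for PyPI should only see publicly hosted artifacts.
public_hosts = {"files.pythonhosted.org"}
for url in disallowed_artifacts(locked_artifacts, public_hosts):
    print("not publicly available:", url)
```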
e
Aha! Ok - thank you for explaining what you're after. I agree with your goal. The question is - is this the right way to implement it? Eric - were you aware this was the goal?
This goal is near and dear to my heart.
I had to contend with this when publishing twitter-commons out from a twitter-private monorepo.
So - @proud-dentist-22844 is it correct to say you couldn't care less about indexes and find links - what you actually care about is publishing valid public things?
And, however that is enforced is fine, but enforcing it automaticallyish is the key bit?
w
If you're publishing things "elsewhere", the published artifacts will need their dependencies constrained to things which are available "elsewhere". Have seen that case before at Twitter
Yea.
e
Yeah, so this is also a 1st party code problem too.
You may have sensitive code that should never be depended on by public things.
w
But until a user actually asks for it, it's still "gedanken"
e
Not sure if visibilities handle that, but I just want to confirm this is the problem you're trying to actually solve @proud-dentist-22844.
p
The constraint on built artifacts is actually a constraint on which dependencies are available, and that should be reflected in the lockfile itself I think, so indexes/find_links is the natural place for that.
e
It's a good problem to solve.
So you plan on publishing the lockfile too @proud-dentist-22844? Or is it just annoying / alarming to see an unused index in the lockfile?
And, speaking of goals, my goal here Eric is just to not break users. If a global index and its spelling could be preserved (but undocumented) ~forever, I'd be happy.
I just want to see the swath of destruction come to an end here sometime soon.
h
If a global index and its spelling could be preserved (but undocumented) ~forever, I'd be happy.
Yeah, it can be, as long as we're fine having an extra subsystem `[python-repos]` sitting around forever. My PR handles reading from either the new option or `[python-repos]`
We don't yet have technology to "undocument" reference docs though
e
I'm totally fine with that. User burden trumps my burden.
p
With things the way they are, I would insist that devs have to use separate repos for public vs private stuff - i.e. monorepo is not an option. But, if there's a way to handle that nicely, then I might have a reason to push people to use monorepos in more places. I have a series of repos that are divided between public/private. There are issues other than the repos config that prevent me from unifying them, but stepping toward mono(ish) repos is a very enticing prospect.
👍 1
And if I do publish a lockfile, the internal index should never be visible.
as far as people outside the org know, no such index exists. 🙂
e
Yeah, the fact multiple indexes are in the lockfile is a Pants problem. It should not be writing its headers in the lockfile anyhow. Today it just does this because Pants provides no way to store state. Ok, yeah. I'm completely on board with you there. This is what I tried very hard to tool for bitd. Pants has not had this as an explicit goal yet.
Detaching the header from the lockfile has been discussed.
p
I believe I recommended the header, since comments were allowed in requirements-style files. comments don't exist in JSON, but updating two files atomically is odd. If the header moves to a separate metadata file, then that would probably have to store a hash of the lock file.
👍 1
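A minimal sketch of the separate-metadata-file idea mentioned above: the sidecar stores a hash of the lockfile, so a mismatch shows the two files were not updated together. The file naming (`<lockfile>.metadata.json`), the `lockfile_sha256` key, and both helper names are assumptions for illustration, not an agreed design.

```python
import hashlib
import json
from pathlib import Path


def _sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_metadata(lockfile: Path, metadata: dict) -> Path:
    """Write a sidecar JSON file holding `metadata` plus the lockfile's hash."""
    sidecar = lockfile.with_name(lockfile.name + ".metadata.json")
    payload = {**metadata, "lockfile_sha256": _sha256(lockfile)}
    sidecar.write_text(json.dumps(payload, indent=2, sort_keys=True))
    return sidecar


def metadata_matches(lockfile: Path, sidecar: Path) -> bool:
    """True if the sidecar was written for the current lockfile contents."""
    recorded = json.loads(sidecar.read_text())["lockfile_sha256"]
    return recorded == _sha256(lockfile)
```

The lockfile itself stays in whatever format the locking tool emits, and the metadata that today lives in the header moves to the sidecar.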
e
Which is fine. We're into details at this point. Pants should straight up not add data to other tools config files except in the format that tool accepts - somewhat obviously when phrased that way.
The headers truly are a hack.
💯 1
h
Alright, so it sounds like I should revive the PR for per-resolve indexes, but this time with `removal_version="3.0"`*?
e
I have no clue how you reached that conclusion! Satisfying the end goal Jacob has does not require it. It probably is the quickest hack though. But as long as the global option is not destroyed I'm happy enough.
h
Sorry, I meant to add a question mark -- I was trying to understand where everyone is at
e
Ok. I called my shot - I classify it as a quick hack. If that's all we have bandwidth for and the global options don't die in 2.x, I'm fine with it.
👍 1
Keep in mind it's a public quick hack, so you can't take it back.
You're stuck with the added feature.
h
Any reasons you can anticipate why we might want to take it back? I'm failing to see the downsides of per-resolve config here beyond extra complexity
e
Exactly that - it's completely un-needed noise except as a quick hack for this use case that we could serve better and handle for 1st party code as well. If you track various threads you'll notice "extra complexity" oscillates between small problem (here) and evil.
Cognitive overhead is bad, a little extra complexity is good, etc.
h
Okay, then given that, let's hold off for now.
• no one yet actively needs this (although Jacob may in some future)
• I've far exceeded my timebox for lockfiles, and landing this will take more time
• dangerous to rush the design
• PR is very disruptive for plugin authors
I can land --path-mappings support without per-resolve indexes, even with the wart about lockfile invalidation
@proud-dentist-22844 would you be willing to open an issue describing your goal with private vs public indexes, please?
p
yeah. I'll have slack remind me to do that later. 🙂
❤️ 1
I finally got around to making an issue for this: https://github.com/pantsbuild/pants/issues/17565
❤️ 1