Is there anyone who loves fixing Python typing iss...
# development
w
Is there anyone who loves fixing Python typing issues? If so, I'd be wildly appreciative of anyone who wants to tackle typing non-bare
@rule
s
This works great
Call-by-name migration with all the types hidden is a nightmare, or I suppose... a daymare
p
Yeah. I've done something like this. The migration makes which rule to use more explicit, but the typing of
@rule
is very complex, so pycharm (and VSCode? It looks like your screenshot is of VSCode) have a hard time inferring the types. So, I end up looking up the rule and importing the rule's result to use in a type hint - at least while I'm developing. Sometimes, I then remove it to make mypy happy because the type checkers don't agree on inference logic.
w
Yeah, trying to avoid that mental/keyboard overhead - as there are a LOT more backends to migrate
I think ParamSpec will be my friend here, as that seems to be what is good for typing decorators. I've never needed to use it, but 🤷 always a first time
p
Ooh. There's hope? Nice!
w
I'll report back in emojis
😄 1
Copy code
F = TypeVar('F', bound=Callable[..., Any])
@overload
def rule(func: F, /) -> F: ...

@overload
def rule(*, desc: str = "", level: LogLevel = LogLevel.DEBUG) -> Callable[[F], F]: ...
Sorta works
Here's something I didn't clue into - how are these rules not async def?
p
I believe there is some magic in
@rule
that allows the function to be either sync or async.
Requiring async might simplify some of the typing there
But that might have backwards compatibility issues
h
Yeah, the
async
is purely informational, since we’re not executing rules in a python event loop
But it would be fine to require
async
if it helps with typing
w
I’ll give it a shot - it feels just wrong to allow either async or no async, and hide that behind complicated typing. I stripped down the typing to whatever the simplest thing was, and then all these async errors popped up (and some others, which I need to fix). But the number of red squiggles I was able to make disappear was pretty impressive. I think once I can handle a generic version too - we’re in a pretty good place.
the
async
is purely informational,
But, we still require
await
for call-by-name right? At the very least due to the trampoline back into rust code and using the tokio executor? Would that imply that not having
async
is kinda technically incorrect? Or does it just get wrapped and fired off as a co-routine regardless of whether it's considered a Python coroutine or not
Okay, yeah, it's basically here:
Copy code
def rule_decorator(func: SyncRuleT | AsyncRuleT, **kwargs) -> AsyncRuleT:
Ah, so lame - PyRight likes the typing I have, but MyPy struggles
🫣 1
p
Vscode uses pyright, right?
So there are 3 type checkers that need to agree: mypy pyright PyCharm
😅
w
Oh man, pycharm - forgot
Im hoping maybe just the mypy we use is old, and an upgrade fixes everything - because I'm literally using what mypy recommends
🙏 1
Nope... Womp womp
Oh good lord. Order matters in mypy with overloads apparently...
Alright, I think the remaining 30 mypy errors are now ACTUALLY errors that got uncovered with proper typing of rules… I guess we also have warnings as errors on, because I’m getting redundant cast warnings and things
@happy-kitchen-89482 @fast-nail-55400 This was early in my Pants time (or before). What should be the intended type for _masked_types?
Copy code
@rule(desc="Find targets from input specs", level=LogLevel.DEBUG, _masked_types=[EnvironmentName])
_masked_types: tuple[type, ...]
fails with
No overload variant of "rule" matches argument type "list[type[EnvironmentName]]"
Do we particularly care that this is a tuple, per se? Or just any iterable (as it suggests downstream)? I can change the callsites to tuples, or just expand the typing to Iterable to fix this - but 🤷
f
I didn't even know
_masked_types
was a thing until now.
Copy code
masked_types: tuple[type, ...] = tuple(kwargs.get("_masked_types", ()))
w
Yeah, usually used for "[EnvironmentNames]
f
It gets converted to tuple, so maybe
Iterable
is appropriate?
w
That's what I have for now, just wasn't sure if I should go through the codebase and tuple-ize
f
Is there a performance benefit from avoiding list -> tuple conversions? Seems like a rainy day refactor more than anything.
i.e., seems fine as is for now
w
I mean, I'm already digging through this - there might be some, but I'm more wondering if using a mutable type vs a type that can't change would make a difference, as it does while running the rules
f
🤷
could be, guess we'd have to measure to truly know
(but not suggesting any measurements, just a thought)
w
👍 Just changing the incoming type then, saves me a hassle
Leaving this in draft a bit, while I do some "before"/"after" pyright checking, but so far, so much better
h
I have no idea what
_masked_types
even is
First I’m seeing it
w
Whelp. I'm glad none of us really know what it is 🙂
Allows callers to prevent the given list of types from being included in the identity of
a @rule
h
And callers would want to do that… why?
w
shrug
I haven't dug into usages, I just think they're mostly
EnvironmentName
- and I'm not in the know about Environment stuff
Copy code
⏺ pants.git/call-by-name-shell % rg _masked_types                                                                                                                                                                                                                                                                                                   ⎇ call-by-name-shell*
src/python/pants/engine/rules.py
193:    "_masked_types",
214:    _masked_types: NotRequired[Iterable[type[Any]]]
252:    masked_types: tuple[type, ...] = tuple(kwargs.get("_masked_types", ()))

src/python/pants/engine/internals/specs_rules.py
130:@rule(_masked_types=[EnvironmentName])
157:@rule(_masked_types=[EnvironmentName])
225:@rule(_masked_types=[EnvironmentName])
246:@rule(_masked_types=[EnvironmentName])
256:@rule(_masked_types=[EnvironmentName])
268:@rule(desc="Find targets from input specs", level=LogLevel.DEBUG, _masked_types=[EnvironmentName])
279:@rule(_masked_types=[EnvironmentName])

src/python/pants/engine/internals/graph.py
114:@rule(_masked_types=[EnvironmentName])
521:@rule(_masked_types=[EnvironmentName])
593:@rule(_masked_types=[EnvironmentName])
638:@rule(desc="Find all targets in the project", level=LogLevel.DEBUG, _masked_types=[EnvironmentName])
653:    _masked_types=[EnvironmentName],
808:@rule(desc="Resolve transitive targets", level=LogLevel.DEBUG, _masked_types=[EnvironmentName])
881:@rule(_masked_types=[EnvironmentName])
886:@rule(desc="Resolve coarsened targets", level=LogLevel.DEBUG, _masked_types=[EnvironmentName])
1077:@rule(desc="Find which targets own certain files", _masked_types=[EnvironmentName])
1494:@rule(desc="Resolve direct dependencies of target", _masked_types=[EnvironmentName])
w
callers would want to do that because they do not want their identities to ever accidentally be dependent on the environment.
for example: the build graph: as it stood, it felt reasonable to never allow the build-graph to be re-computed per-environment, as it could lead to whacky non-determinism of goals like
pants dependencies
, etc depending on which environment you were running them in. obviously the rest of the graph might change, but preventing it in the build graph "felt" like a good idea. who knows!
the build-graph masks any inbound environment, and instead always explicitly runs itself in the local environment.
w
Ah, interesting. I would have thought the environment would trigger a re-build of the graph - but I guess I never gave it more than a cursory thought
w
well: there are (at least?) two graphs: the build-graph is the graph produced by rules in
graph.py
, and rendered using the introspection goals...
dependencies
,
list
, etc. the "graph graph" is the graph of rule invocations
the former thing is masked... the latter thing is definitely mostly re-computed per-environment, depending on which subgraphs actually consume the environment