Skip to content

fix False positive: bad-typed-dict-key with enums and TypedDict #3365#3372

Draft
asukaminato0721 wants to merge 1 commit into
facebook:mainfrom
asukaminato0721:3365
Draft

fix False positive: bad-typed-dict-key with enums and TypedDict #3365#3372
asukaminato0721 wants to merge 1 commit into
facebook:mainfrom
asukaminato0721:3365

Conversation

@asukaminato0721

Copy link
Copy Markdown
Contributor

Summary

Fixes #3365

Test Plan

@github-actions

Copy link
Copy Markdown

Diff from mypy_primer, showing the effect of this PR on open source code:

openlibrary (https://github.com/internetarchive/openlibrary)
+ ERROR openlibrary/core/bookshelves_events.py:14:16-85: `in` is not supported between `int` and `Generator[Literal[1, 2, 3]]` [unsupported-operation]

rotki (https://github.com/rotki/rotki)
+ ERROR rotkehlchen/db/cache.py:169:16-29: Returned type `((x: Unknown) -> Unknown) | ((value: str) -> Timestamp | None) | ((value: str) -> int | None) | ((value: str) -> tuple[SolanaAddress, SolanaAddress] | None) | ((value: str) -> ChecksumAddress)` is not assignable to declared return type `(str) -> ChecksumAddress | Timestamp | int | None` [bad-return]

apprise (https://github.com/caronc/apprise)
+ ERROR apprise/plugins/jira.py:780:22-782:10: No matching overload found for function `typing.MutableMapping.update` called with arguments: (dict[LiteralString, str]) [no-matching-overload]

core (https://github.com/home-assistant/core)
+ ERROR homeassistant/components/homeassistant_sky_connect/const.py:53:16-46: Object of class `tuple` has no attribute `usb_product_name` [missing-attribute]
@github-actions

Copy link
Copy Markdown

Primer Diff Classification

❌ 4 regression(s) | 4 project(s) total | +4 errors

4 regression(s) across openlibrary, rotki, apprise, core. error kinds: unsupported-operation, False positive bad-return from overly precise enum value resolution, no-matching-overload. caused by enum_value_lookup_on_class(), type_of_own_enum_value().

Project Verdict Changes Error Kinds Root Cause
openlibrary ❌ Regression +1 unsupported-operation pyrefly/lib/alt/class/enums.rs
rotki ❌ Regression +1 False positive bad-return from overly precise enum value resolution enum_value_lookup_on_class()
apprise ❌ Regression +1 no-matching-overload enum_value_lookup_on_class()
core ❌ Regression +1 false positive missing-attribute on enum with dataclass mixin type_of_own_enum_value()
Detailed analysis

❌ Regression (4)

openlibrary (+1)

This is a false positive (regression). The code value in (item.value for item in BookshelfEvent.__members__.values()) is perfectly valid Python — the in operator works on any iterable, including generator expressions. Python's membership test operations (https://docs.python.org/3/reference/expressions.html#membership-test-operations) explicitly state that for general iterables, x in y iterates through y. The PR's changes to enum value type resolution caused .value to resolve to Literal[1, 2, 3] instead of int, making the generator type Generator[Literal[1, 2, 3]]. Pyrefly then incorrectly flags the in operation as unsupported for this generator type. Neither mypy nor pyright flag this.
Attribution: The PR changes enum value type resolution in pyrefly/lib/alt/class/enums.rs. The changes to enum_value_lookup_on_class now compute more specific literal types for enum member values. In the expression (item.value for item in BookshelfEvent.__members__.values()), BookshelfEvent is an IntEnum with members START = 1, UPDATE = 2, FINISH = 3. Previously, .value on an instance of BookshelfEvent likely resolved to int. With the PR's changes (particularly the new logic in enum_value_lookup_on_class that computes union of member value types), .value now resolves to Literal[1, 2, 3]. The generator expression type then becomes Generator[Literal[1, 2, 3]] instead of Generator[int]. Pyrefly then incorrectly reports that in is not supported between int and Generator[Literal[1, 2, 3]]. The root issue is that pyrefly doesn't recognize that generators support the in operator (membership testing via iteration). The more specific enum value type just exposed this pre-existing bug in in operator support for generators.

rotki (+1)

False positive bad-return from overly precise enum value resolution: The PR's changes to enum value type resolution now compute a union of all individual member value types for self.value, producing specific callable types. The declared return type annotation Callable[[str], int | Timestamp | ChecksumEvmAddress | None] is actually incomplete — it doesn't cover tuple[SolanaAddress, SolanaAddress] | None (from _deserialize_solana_token_account_from_str) or str (from the lambda x: x callbacks). So there is a genuine annotation inconsistency in the source code. However, this was previously invisible because type checkers didn't compute the precise union of all enum member value types. Neither mypy nor pyright flags this. The PR's new enum value resolution surfaces this pre-existing issue, making it a practical false positive that users cannot easily resolve without restructuring their enum.

Overall: The DBCacheDynamic enum has members whose values are 2-tuples. self.value[1] accesses the second element (the deserializer callback). The declared return type is Callable[[str], int | Timestamp | ChecksumEvmAddress | None]. The PR's changes to enum value resolution now compute the type of self.value as a union of all individual member value types. Since each member has a different callback (with different signatures), self.value[1] resolves to a union of all those specific callable types.

Looking at the error message carefully, the returned type union includes:

  • (x: Unknown) -> Unknown (from lambda x: x entries)
  • (value: str) -> Timestamp | None
  • (value: str) -> int | None
  • (value: str) -> tuple[SolanaAddress, SolanaAddress] | None (from _deserialize_solana_token_account_from_str)
  • (value: str) -> ChecksumAddress (from string_to_evm_address)

The declared return type is Callable[[str], int | Timestamp | ChecksumEvmAddress | None] (which the error message confirms resolves to (str) -> ChecksumAddress | Timestamp | int | None, confirming ChecksumEvmAddress is indeed an alias for ChecksumAddress).

Notably, the declared return type annotation is actually incomplete: _deserialize_solana_token_account_from_str returns tuple[SolanaAddress, SolanaAddress] | None, which is NOT a subtype of int | Timestamp | ChecksumEvmAddress | None. Similarly, the lambda x: x callbacks effectively return str (since the input is str), which is also not in the declared return type. So the return type annotation in the source code is genuinely incorrect/incomplete.

However, this type inconsistency was previously invisible because older pyrefly (and mypy/pyright) did not compute the precise union of all enum member value types for self.value. The PR's changes to how enum .value types are computed on instance access now surface this pre-existing annotation issue. Neither mypy nor pyright flags this because they handle enum .value types differently (typically not computing the full union of member value types in this way).

This is a new error caused by the PR's changes to enum value resolution. While the underlying annotation is technically incomplete, the error is a practical false positive in the sense that the code works correctly at runtime and other type checkers don't flag it. The overly precise enum value type computation creates errors that are impractical for users to resolve without changing their enum patterns.

Attribution: The PR changes how enum .value is resolved in pyrefly/lib/alt/class/enums.rs and pyrefly/lib/alt/attr.rs. Specifically, the changes to enum_value_lookup_on_class() now compute a union of individual member value types instead of using the mixed-in type. For DBCacheDynamic, which stores tuples as values, self.value[1] now resolves to a union of the specific callable types of each member's second tuple element (the various deserializer functions and lambdas), rather than a more general type. The new get_enum_or_instance_attribute path in attr.rs (line 1841-1843) is invoked when accessing .value on an enum self-type, which changes how the type of self.value is computed inside the deserialize_callback property.

apprise (+1)

The PR changed enum value type inference to return more specific literal types. When iterating over self.mapping (whose keys are NotifyType enum members), k.value now gets a LiteralString type. The dict comprehension {':...'.format(k.value): v} produces dict[LiteralString, str]. Calling params.update() on a dict[str, str] with dict[LiteralString, str] should succeed since LiteralString <: str, but pyrefly's overload resolution incorrectly rejects it. This is a false positive — the code is correct and neither mypy nor pyright flag it.
Attribution: The changes to enum_value_lookup_on_class() in pyrefly/lib/alt/class/enums.rs now return more specific literal types for enum .value access on class instances. Previously k.value (where k: NotifyType) would resolve to str, but now it resolves to a LiteralString or union of literals. This more specific type then causes the overload resolution for MutableMapping.update to fail to find a matching overload, producing the false positive no-matching-overload error.

core (+1)

false positive missing-attribute on enum with dataclass mixin: Pyrefly incorrectly resolves HardwareVariant.value as tuple instead of VariantInfo. The enum uses a dataclass mixin pattern where the tuple arguments are passed to the dataclass constructor, making .value a VariantInfo instance. The .usb_product_name attribute access is valid.

Overall: This is a regression. The HardwareVariant class inherits from VariantInfo (a frozen dataclass) and enum.Enum. When you define SKYCONNECT = ("SkyConnect v1.0", "SkyConnect", "Home Assistant SkyConnect"), Python's enum machinery passes these arguments to VariantInfo.__init__, so each member's .value is a VariantInfo instance with usb_product_name, short_name, and full_name attributes. The code variant.value.usb_product_name is perfectly valid at runtime. Pyrefly's new enum value resolution logic incorrectly infers the .value type as tuple (the raw literal tuple) rather than VariantInfo, causing a false positive missing-attribute error. Neither mypy nor pyright flag this.

Attribution: The PR changed how enum .value types are resolved in pyrefly/lib/alt/class/enums.rs and pyrefly/lib/alt/attr.rs. Specifically, the new type_of_own_enum_value() function in enums.rs and the changes to enum_value_lookup_on_class() altered how the value type is computed. The change to AttributeBase1::SelfType in attr.rs now routes enum value/_value_ lookups through get_enum_or_instance_attribute(). For HardwareVariant, which uses a dataclass mixin (VariantInfo), the new logic appears to resolve .value as tuple (the raw tuple literal type of the enum member arguments like ("SkyConnect v1.0", "SkyConnect", "Home Assistant SkyConnect")) instead of recognizing that the mixin's __new__ constructs a VariantInfo instance from those arguments. This causes the subsequent .usb_product_name access to fail with missing-attribute on tuple.

Suggested fixes

Summary: The PR's changes to enum value type resolution in enums.rs cause 4 pyrefly-only regressions: overly precise literal types break generator/dict operations, incorrect tuple type for dataclass mixin enums, and overly specific callable unions cause bad-return errors.

1. In enum_value_lookup_on_class() in pyrefly/lib/alt/class/enums.rs, when there is a mixed_in type and the enum value types don't all match the mixin after promotion, fall back to returning the mixed_in type instead of computing a union. Currently (lines ~295-310), when a promoted value_ty doesn't equal mixed_in, the code substitutes mixed_in into the union — but this still produces overly specific types in some cases (e.g., LiteralString instead of str). The fix: when mixed_in is present, simply return mixed_in as was done before the PR (if let Some(mixed_in) = mixed_in { return mixed_in; }). This restores the previous behavior for mixin enums like IntEnum and str+Enum, where .value on an instance should resolve to the mixin type (int, str) rather than a union of literals.

Files: pyrefly/lib/alt/class/enums.rs
Confidence: high
Affected projects: openlibrary, apprise, rotki
Fixes: bad-return, no-matching-overload, unsupported-operand
The openlibrary regression occurs because IntEnum .value now resolves to Literal[1, 2, 3] instead of int, causing generator membership test to fail. The apprise regression occurs because str+Enum .value resolves to LiteralString instead of str, causing dict type mismatch in update(). The rotki regression occurs because the union of specific callable types from tuple enum values doesn't match the declared return type. Restoring the mixed_in fallback for instance access (not literal member access) would fix all three mixin-based enum regressions. The test changes in the PR show the intentional change: assert_type(foo.value, Literal["x"]) instead of assert_type(foo.value, str) — but this overly precise inference causes downstream false positives.

2. In type_of_own_enum_value() in pyrefly/lib/alt/class/enums.rs (lines ~499-505), add a guard to skip returning the raw field type when the enum has a dataclass or other non-trivial mixin that constructs instances from tuple arguments. Specifically, when the field type is a tuple but the enum has a mixin class (from enum_metadata) that is a dataclass, the .value type should be the mixin type, not the raw tuple. The fix: check if the enum has a data mixin class, and if so, return None from type_of_own_enum_value() so that the mixin type is used instead. For example: if enum_.mixed_in_type.[is_some()](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/alt/class/enums.rs) { return None; } at the top of the function, or more precisely, check if the field type is a tuple and the mixin would construct a different type from those arguments.

Files: pyrefly/lib/alt/class/enums.rs
Confidence: high
Affected projects: core
Fixes: missing-attribute
The core regression occurs because HardwareVariant inherits from VariantInfo (a frozen dataclass) and enum.Enum. The enum members are defined as tuples like SKYCONNECT = ('SkyConnect v1.0', 'SkyConnect', 'Home Assistant SkyConnect'), but Python's enum machinery passes these to VariantInfo.init, so .value is a VariantInfo instance, not a tuple. The new type_of_own_enum_value() function finds the raw value field (which has tuple type from the assignment) and returns it, bypassing the mixin-aware logic. By not returning the raw field type when a mixin is present, the code would fall through to the mixin-based resolution which correctly returns VariantInfo.


Was this helpful? React with 👍 or 👎

Classification by primer-classifier (4 LLM)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

1 participant