
fix special-case NotImplementedType in binop signatures to match runtime semantics #1129 #2677

Open
asukaminato0721 wants to merge 1 commit into facebook:main from asukaminato0721:1129

@asukaminato0721

Contributor

Summary

Fixes #1129

Fixed the binop resolution path so NotImplementedType no longer leaks as a possible operator result.

Successful dunder calls now strip only the exact NotImplementedType branch, keep searching reflected dunders when needed, and union any concrete results that can actually occur at runtime.

Test Plan

A regression test covering both pure fallback and mixed `int | NotImplementedType` behavior.

@meta-cla meta-cla Bot added the cla signed label Mar 5, 2026
@asukaminato0721 asukaminato0721 marked this pull request as ready for review March 5, 2026 23:27
Copilot AI review requested due to automatic review settings March 5, 2026 23:27

Copilot AI left a comment


Pull request overview

This PR fixes issue #1129, where binary operations involving dunder methods that return NotImplementedType would incorrectly leak the NotImplementedType into the inferred result type instead of following Python's runtime semantics (trying the reflected dunder when the forward one signals NotImplemented).
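For reference, the runtime behaviour being modelled can be shown in a few lines. This is a minimal sketch with hypothetical classes (`Meters` and `Feet` are illustrative names, not part of pyrefly or the linked issue): the forward `__add__` returns `NotImplemented`, so Python falls back to the right operand's `__radd__`, and the `NotImplemented` sentinel never becomes the value of the expression.

```python
class Meters:
    """Hypothetical left operand: only knows how to add other Meters."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __add__(self, other: object):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Signal "not handled here" so Python tries the reflected method.
        return NotImplemented


class Feet:
    """Hypothetical right operand: handles Meters + Feet via __radd__."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __radd__(self, other: object):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


# Meters.__add__ returns NotImplemented, so Feet.__radd__ produces the result.
total = Meters(1.0) + Feet(10.0)
```

A checker that reports `NotImplementedType` as a possible type of `total` would describe a value that this expression cannot produce.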

Changes:

  • The try_binop_calls logic in operators.rs is updated to strip NotImplementedType from successful dunder return types, accumulate non-NotImplementedType results, and continue searching for reflected dunders when needed.
  • NotImplementedType is added as a new stdlib entry in Stdlib, gated to Python ≥ 3.10 (consistent with how EllipsisType is handled).
  • A regression test is added covering both the pure-NotImplementedType fallback and the mixed int | NotImplementedType scenario.
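The strip-and-accumulate behaviour summarised above can be sketched with a toy model. This is an illustration under stated assumptions, not pyrefly's implementation: types are modelled as `frozenset`s of member names (the empty set standing in for `Never`), every dunder call is assumed to succeed, and `strip_not_implemented` and `resolve_binop` are hypothetical helper names.

```python
NOT_IMPLEMENTED = "NotImplementedType"


def strip_not_implemented(ret: frozenset[str]) -> frozenset[str]:
    """Drop only the exact NotImplementedType member from a union."""
    return ret - {NOT_IMPLEMENTED}


def resolve_binop(dunder_returns: list[frozenset[str]]) -> frozenset[str]:
    """Combine return types of dunders tried in order (forward, then reflected).

    Toy model: call errors and overloads are ignored entirely.
    """
    successful: frozenset[str] = frozenset()
    for ret in dunder_returns:
        stripped = strip_not_implemented(ret)
        if stripped == ret:
            # No NotImplementedType branch: this dunder always handles the call.
            return successful | ret
        # Keep the concrete part and keep searching for a reflected dunder.
        successful |= stripped
    return successful


# Pure fallback: forward dunder only returns NotImplementedType.
pure = resolve_binop([frozenset({NOT_IMPLEMENTED}), frozenset({"str"})])

# Mixed: forward dunder returns int | NotImplementedType, reflected returns str.
mixed = resolve_binop([frozenset({"int", NOT_IMPLEMENTED}), frozenset({"str"})])
```

In this model the pure-fallback case yields `str` and the mixed case yields `int | str`, which are the two scenarios the regression test is described as covering.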

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pyrefly/lib/alt/operators.rs | Core fix: strips NotImplementedType from successful dunder results, accumulates partial results, and continues to reflected dunders when needed |
| crates/pyrefly_types/src/stdlib.rs | Adds NotImplementedType as an Option<StdlibResult<ClassType>> stdlib entry, guarded by version >= 3.10 |
| pyrefly/lib/test/operators.rs | Regression test covering the bug from issue #1129 for both pure and mixed NotImplementedType return types |


Comment on lines 204 to +207
errors.extend(callee_errors);
return ret;
if ret_without_not_implemented != ret {
successful_ret = self.union(successful_ret, ret_without_not_implemented);
continue;

Copilot AI Mar 5, 2026


When a dunder returns int | NotImplementedType, the code extends errors with callee_errors at line 204 and then continues to look for a reflected dunder. If callee_errors is non-empty (e.g., because the method is not callable as a call target), those errors are emitted unconditionally — even if the reflected dunder later succeeds cleanly. This results in spurious error reporting.

The callee_errors extension should be deferred: accumulate them alongside successful_ret and only emit them at the return site (line 209 or line 215), similar to how first_call defers error emission until it is determined whether the call is ultimately the "best" result.
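One way to read this suggestion is sketched below as a toy model. It is not pyrefly's code: `resolve_with_deferred_errors` is a hypothetical function, errors are plain strings, and the policy for when buffered errors are finally emitted (dropped if the resolving attempt is clean, emitted otherwise) is an assumption made for illustration.

```python
def resolve_with_deferred_errors(
    attempts: list[tuple[list[str], bool]],
    errors: list[str],
) -> bool:
    """Toy model of deferring callee errors until the return site.

    Each attempt is (callee_errors, may_return_not_implemented), in try-order.
    Returns True when an attempt resolved the operator without needing fallback.
    """
    deferred: list[str] = []
    for callee_errors, may_return_not_implemented in attempts:
        # Buffer instead of emitting immediately.
        deferred.extend(callee_errors)
        if may_return_not_implemented:
            # Keep searching reflected dunders; nothing has been emitted yet.
            continue
        if callee_errors:
            # The resolving attempt is itself problematic: surface everything.
            errors.extend(deferred)
        # Assumed policy: a clean resolving attempt drops the buffered errors.
        return True
    # No attempt resolved cleanly: emit what was buffered.
    errors.extend(deferred)
    return False


clean_errors: list[str] = []
clean = resolve_with_deferred_errors([(["e1"], True), ([], False)], clean_errors)

failed_errors: list[str] = []
failed = resolve_with_deferred_errors([(["e1"], True)], failed_errors)
```

With this shape, `e1` is reported only when no later attempt resolves cleanly; whether dropping it is the right policy for the concrete branch of a mixed return is a design question the comment leaves open.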

}),
None => ret.clone(),
};
if ret_without_not_implemented.is_never() {

Copilot AI Mar 5, 2026


When all dunder methods exist and have no call errors, but all return only NotImplementedType (so ret_without_not_implemented.is_never() is true for every iteration), the code falls through to the "Cannot find __add__ or __radd__" error message. This message is misleading: the methods do exist, they just always return NotImplemented. A more accurate message could indicate that all matching operator methods always return NotImplemented.

Note: this scenario only arises in uncommon code where every dunder is annotated to always return NotImplementedType.
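For comparison, this is what Python itself reports in that situation. A minimal sketch with hypothetical classes `Left` and `Right`: both dunders exist, both return `NotImplemented`, and the interpreter raises a `TypeError` about unsupported operand types rather than saying the methods are missing.

```python
class Left:
    def __add__(self, other: object):
        # The method exists but declines every operand.
        return NotImplemented


class Right:
    def __radd__(self, other: object):
        # The reflected method exists but declines as well.
        return NotImplemented


try:
    Left() + Right()
except TypeError as exc:
    message = str(exc)
else:
    message = None
```

Here `message` describes unsupported operand types for `+`, which is closer to "not supported between these operands" than to "cannot find `__add__` or `__radd__`".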

Suggested change
if ret_without_not_implemented.is_never() {
if ret_without_not_implemented.is_never() {
// All branches of this dunder call resolved to NotImplementedType.
// Record this call as the first attempted call (if none recorded yet)
// so that later error handling can distinguish "methods exist but
// always return NotImplemented" from "no dunder methods found".
if first_call.is_none() {
first_call = Some((callee_errors, call_errors, ret));
}
@asukaminato0721 asukaminato0721 marked this pull request as ready for review May 5, 2026 04:35
@github-actions

github-actions Bot commented May 7, 2026


Diff from mypy_primer, showing the effect of this PR on open source code:

freqtrade (https://github.com/freqtrade/freqtrade)
+ ERROR freqtrade/data/converter/orderflow.py:175:21-76: `-` is not supported between `date` and `Timestamp` [unsupported-operation]
+ ERROR freqtrade/data/converter/orderflow.py:175:21-76: `-` is not supported between `date` and `datetime` [unsupported-operation]

pandas-stubs (https://github.com/pandas-dev/pandas-stubs)
+ ERROR tests/indexes/bool/test_mul.py:93:20-37: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/bool/test_mul.py:94:20-37: assert_type(ndarray[tuple[Any, ...], dtype[timedelta64]], Never) failed [assert-type]
+ ERROR tests/indexes/bool/test_sub.py:61:20-37: assert_type(Unknown, Never) failed [assert-type]
+ ERROR tests/indexes/bool/test_sub.py:61:21-29: `-` is not supported between `Index[builtins.bool]` and `ndarray[tuple[Any, ...], dtype[numpy.bool]]` [unsupported-operation]
+ ERROR tests/indexes/bool/test_sub.py:70:20-37: assert_type(Unknown, Never) failed [assert-type]
+ ERROR tests/indexes/bool/test_sub.py:70:21-29: `-` is not supported between `ndarray[tuple[Any, ...], dtype[numpy.bool]]` and `Index[builtins.bool]` [unsupported-operation]
+ ERROR tests/indexes/bool/test_truediv.py:70:20-37: assert_type(ndarray[tuple[Any, ...], dtype[float64]], Never) failed [assert-type]
+ ERROR tests/indexes/complex/test_sub.py:63:26-46: assert_type(Index[complex], NoReturn) failed [assert-type]
+ ERROR tests/indexes/float/test_sub.py:63:26-46: assert_type(Index[float], NoReturn) failed [assert-type]
+ ERROR tests/indexes/float/test_truediv.py:88:20-37: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/float/test_truediv.py:89:20-37: assert_type(ndarray[tuple[Any, ...], dtype[float64]], Never) failed [assert-type]
+ ERROR tests/indexes/int/test_floordiv.py:90:20-38: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/int/test_floordiv.py:91:20-38: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/int/test_floordiv.py:92:20-38: assert_type(ndarray[tuple[Any, ...], dtype[signedinteger[_64Bit]]], Never) failed [assert-type]
+ ERROR tests/indexes/int/test_sub.py:63:26-46: assert_type(Index[int], NoReturn) failed [assert-type]
+ ERROR tests/indexes/int/test_truediv.py:88:20-37: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/int/test_truediv.py:89:20-37: assert_type(ndarray[tuple[Any, ...], dtype[float64]], Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_floordiv.py:86:20-38: assert_type(ndarray[tuple[Any, ...], dtype[signedinteger[_8Bit]]], Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_floordiv.py:90:20-38: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_floordiv.py:91:20-38: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_mul.py:74:20-37: assert_type(ndarray[tuple[Any, ...], dtype[numpy.bool]], Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_mul.py:78:20-37: assert_type(ndarray[tuple[Any, ...], dtype[complex128]], Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_truediv.py:85:20-37: assert_type(ndarray[tuple[Any, ...], dtype[float64]], Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_truediv.py:89:20-37: assert_type(ndarray[tuple[Any, ...], dtype[complex128]], Never) failed [assert-type]
+ ERROR tests/indexes/timedeltaindex/test_truediv.py:90:20-37: assert_type(Any, Never) failed [assert-type]
+ ERROR tests/series/bool/test_sub.py:89:20-37: assert_type(Unknown, Never) failed [assert-type]
+ ERROR tests/series/bool/test_sub.py:89:21-29: `-` is not supported between `Series[builtins.bool]` and `ndarray[tuple[Any, ...], dtype[numpy.bool]]` [unsupported-operation]
+ ERROR tests/series/bool/test_sub.py:98:20-37: assert_type(Unknown, Never) failed [assert-type]
+ ERROR tests/series/bool/test_sub.py:98:21-29: `-` is not supported between `ndarray[tuple[Any, ...], dtype[numpy.bool]]` and `Series[builtins.bool]` [unsupported-operation]
+ ERROR tests/series/complex/test_sub.py:98:26-46: assert_type(Series[complex], NoReturn) failed [assert-type]
+ ERROR tests/series/float/test_sub.py:86:26-46: assert_type(Series[float], NoReturn) failed [assert-type]
+ ERROR tests/series/int/test_sub.py:86:26-46: assert_type(Series[int], NoReturn) failed [assert-type]
+ ERROR tests/series/timedelta/test_sub.py:112:20-37: assert_type(ndarray[tuple[Any, ...], dtype[timedelta64]], Never) failed [assert-type]

pandas (https://github.com/pandas-dev/pandas)
+ ERROR pandas/core/indexes/interval.py:1509:32-43: `-` is not supported between `date` and `Timestamp` [unsupported-operation]
+ ERROR pandas/core/indexes/interval.py:1509:32-43: `-` is not supported between `date` and `datetime` [unsupported-operation]

werkzeug (https://github.com/pallets/werkzeug)
+ ERROR tests/test_datastructures.py:710:13-33: `|` is not supported between `EnvironHeaders` and `dict[str, str]` [unsupported-operation]
+ ERROR tests/test_datastructures.py:716:13-34: `|=` is not supported between `EnvironHeaders` and `dict[str, str]` [unsupported-operation]

core (https://github.com/home-assistant/core)
+ ERROR homeassistant/components/local_calendar/calendar.py:224:13-24: `-` is not supported between `date` and `datetime` [unsupported-operation]
@github-actions

github-actions Bot commented May 7, 2026


Primer Diff Classification

❌ 5 regression(s) | 5 project(s) total | +38 errors

5 regression(s) across freqtrade, pandas-stubs, pandas, werkzeug, and core. Error kinds: unsupported-operation, assert-type failures with Never, unsupported-operation false positives. Caused by is_never(), try_dunder_call_pairs().

| Project | Verdict | Changes | Error Kinds | Root Cause |
| --- | --- | --- | --- | --- |
| freqtrade | ❌ Regression | +1 | unsupported-operation | pyrefly/lib/alt/operators.rs |
| pandas-stubs | ❌ Regression | +33 | assert-type failures with Never | pyrefly/lib/alt/operators.rs |
| pandas | ❌ Regression | +1 | unsupported-operation | try_dunder_call_pairs() |
| werkzeug | ❌ Regression | +2 | NoReturn dunder methods treated as NotImplementedType | pyrefly/lib/alt/operators.rs |
| core | ❌ Regression | +1 | unsupported-operation | is_never() |

Detailed analysis

❌ Regression (5)

freqtrade (+1)

This is a false positive (regression). The values at dataframe.at[index, 'ask'] and dataframe.at[index, 'bid'] were just assigned numeric values (results of np.where(...).sum()) on lines 172-173. The subtraction on line 175 is perfectly valid at runtime. Pyrefly is incorrectly inferring the types as date and Timestamp — likely because DataFrame.at has a very broad return type annotation that includes many possible types, and the new operator resolution logic is now surfacing errors for union members that don't support subtraction, whereas before it found a successful resolution path. Neither mypy nor pyright flag this. The PR's change to operator resolution in pyrefly/lib/alt/operators.rs altered how union types interact with binary operators, causing this spurious error.
Attribution: The change in pyrefly/lib/alt/operators.rs modified the binary operator resolution logic. The new code strips NotImplementedType branches from return types and continues searching reflected dunders. This changed how operator return types are resolved. In this case, the - operator between the results of dataframe.at[index, 'ask'] and dataframe.at[index, 'bid'] is being resolved differently. The DataFrame.at accessor likely returns a union type that includes date and Timestamp among other possibilities. Previously, the operator resolution may have found a successful path early and returned. Now, with the NotImplementedType stripping logic, the resolution path changed — when a dunder returns NotImplementedType | <concrete>, the concrete part is accumulated and the search continues, potentially leading to different error reporting when some union members don't support the operator.

pandas-stubs (+33)

assert-type failures with Never: The PR's NotImplementedType stripping logic changes the inferred return type of binary operations. The stubs tests assert these operations should resolve to Never (indicating invalid usage), and mypy/pyright agree. Pyrefly now infers a different type, causing 29 false assert-type failures. This is a regression.
unsupported-operation false positives: 4 operations between pd.Index[bool] or pd.Series[bool] and NumPy bool arrays (two in tests/indexes/bool/test_sub.py, two in tests/series/bool/test_sub.py) now produce unsupported-operation errors that didn't exist before and aren't flagged by mypy/pyright. The PR's new logic for stripping NotImplementedType and continuing to search reflected dunders is incorrectly concluding these operations are unsupported. This is a regression.

Overall: This is a type stubs project (pandas-stubs) that is extensively tested against mypy and pyright. All 33 new errors are pyrefly-only — neither mypy nor pyright flags them. The errors fall into two categories:

  1. assert-type failures (29 errors): The test code uses assert_type(expr, Never) to verify that certain operations are statically invalid. Previously pyrefly inferred Never for these (matching the assertion), but the PR's new NotImplementedType stripping logic now infers a different type. Since mypy/pyright still agree with the Never assertion (0/29 co-reported), pyrefly's new inference is wrong for these cases.

  2. unsupported-operation errors (4 errors): Operations like left - b (where left is pd.Index[bool] and b is ndarray[..., dtype[bool]]) now produce unsupported-operation errors. These operations were previously accepted. Since mypy/pyright don't flag them (0/4 co-reported), these are false positives.

23/33 errors contain Never types, indicating inference failures. The PR's intent to strip NotImplementedType from binop results is reasonable, but the implementation has side effects that break correct type inference for pandas-stubs' operator overloads.

Attribution: The changes in pyrefly/lib/alt/operators.rs in the try_dunder_calls method (around line 186-240) changed how binary operator resolution handles NotImplementedType. Previously, when a dunder call succeeded but returned NotImplementedType, pyrefly would return that result directly. Now it strips NotImplementedType branches and continues searching reflected dunders. This changes the inferred return type for operations that previously resolved to Never (because the operation was considered unsupported). The new logic now either finds a reflected dunder that works (producing a concrete type instead of Never) or produces a different error. The assert_type(..., Never) calls in the stubs tests expected the old behavior where these operations resolved to Never, and now they resolve to something else. Additionally, the 4 unsupported-operation errors suggest that some operations that previously succeeded (perhaps returning NotImplementedType as a valid type) now fail because NotImplementedType is stripped and no valid fallback is found.

pandas (+1)

This is a false positive. The error claims - is not supported between date and Timestamp at line 1509, but this line is inside the is_number(endpoint) branch where start and end would be numeric values at runtime. The parameters start and end are untyped, so pyrefly infers their types from all possible values. Additionally, endpoint is a separate variable (set at line 1446 to start or end), so is_number(endpoint) being true doesn't allow pyrefly to narrow the types of start and end themselves. The maybe_box_datetimelike calls at lines 1444-1445 can produce Timestamp types, and date could come from the broad inferred input type. Since pyrefly cannot perform this cross-variable narrowing, it considers type combinations like date - Timestamp that would never occur at runtime in this branch. Neither mypy nor pyright flag this, confirming it's a false positive.
Attribution: The change to try_dunder_call_pairs() in pyrefly/lib/alt/operators.rs modified how binary operator results are computed. Previously, when a dunder call succeeded, it immediately returned the result (return ret). Now, it strips NotImplementedType branches from the return type and continues searching reflected dunders, accumulating results via union. This changed how the subtraction operator resolves for untyped parameters — the new logic likely produces a different (broader) union type for end - start that includes a date - Timestamp combination that wasn't considered before, triggering the unsupported-operation error.

werkzeug (+2)

NoReturn dunder methods treated as NotImplementedType: The PR's new logic in operators.rs strips NotImplementedType from binop return types and skips dunders whose filtered return is Never. But this also incorrectly skips dunders that genuinely return NoReturn (always raise). EnvironHeaders.__or__ and ImmutableHeadersMixin.__ior__ both return NoReturn because they raise TypeError. The new code skips these valid dunder implementations and then fails to find any operator, producing false positive 'unsupported-operation' errors. The test code wraps these in pytest.raises(TypeError), confirming the operations are intentionally expected to raise.

Overall: Both errors are false positives (regressions). The test code is intentionally testing that these operations raise TypeError at runtime — the expressions are wrapped in pytest.raises(TypeError). The type checker should recognize that EnvironHeaders.__or__ exists (it's explicitly defined on line 656) and ImmutableHeadersMixin.__ior__ exists (line 212 of mixins.py). Both return NoReturn because they always raise.

The PR's new logic in operators.rs treats NoReturn/Never return types from dunder methods the same as NotImplementedType — it skips them and continues searching for reflected dunders. But NoReturn (meaning 'always raises') is semantically different from NotImplementedType (meaning 'this operation is not supported, try the reflected method'). A dunder that returns NoReturn is a successful resolution — it means the operation will raise, which is valid behavior. The code should not skip NoReturn-returning dunders.

The ret_without_not_implemented filtering converts NotImplementedType to Never, then checks if ret_without_not_implemented.is_never() to skip. But if the original return type was already Never/NoReturn (not NotImplementedType), this check incorrectly skips it too. This is the root cause of both errors.
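The runtime difference between "always raises" and "returns NotImplemented" can be seen in a small sketch. The classes below are hypothetical stand-ins for the werkzeug pattern described above, not werkzeug's code: when the forward dunder raises, Python propagates the exception and never consults the reflected dunder.

```python
from typing import NoReturn

calls: list[str] = []


class ReadOnly:
    """Hypothetical immutable container whose | always raises."""

    def __or__(self, other: object) -> NoReturn:
        calls.append("ReadOnly.__or__")
        raise TypeError("ReadOnly does not support |")


class Other:
    """Hypothetical right operand whose reflected dunder would succeed."""

    def __ror__(self, other: object) -> str:
        calls.append("Other.__ror__")
        return "merged"


try:
    ReadOnly() | Other()
except TypeError:
    raised = True
else:
    raised = False
```

After this runs, `calls` contains only `ReadOnly.__or__`: the forward dunder is the resolution of the operator, so a `NoReturn` return type is not a reason to move on to `__ror__`.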

Error 1 (line 710) is pyrefly-only, confirming it's likely a false positive. Error 2 (line 716) is co-reported by pyright, but pyright may have its own reasons for flagging |= on immutable types — regardless, the pyrefly error message says the operator is 'not supported', which is incorrect since __ior__ is explicitly defined.

Attribution: The changes in pyrefly/lib/alt/operators.rs modified the binary operator resolution logic. The new code strips NotImplementedType branches from dunder return types and continues searching reflected dunders. When EnvironHeaders.__or__ returns NoReturn (since it raises TypeError), the old code would have accepted this as a successful dunder call. The new code's NotImplementedType filtering logic may be interacting with NoReturn returns differently, or the stripping of NotImplementedType from Headers.__or__'s return type (which returns NotImplemented on non-Mapping input) causes the resolution to fail to find a valid operator implementation.

Looking more carefully: EnvironHeaders.__or__ (line 656-657 of headers.py) has return type t.NoReturn and always raises. The parent Headers.__or__ returns NotImplemented for non-Mapping inputs, or te.Self for Mapping inputs. When pyrefly resolves EnvironHeaders | dict[str, str], it should find EnvironHeaders.__or__ which has return type NoReturn. The new code in operators.rs checks if the return (after stripping NotImplementedType) is Never/NoReturn and if so, continues to the next dunder. This means it skips EnvironHeaders.__or__ and tries dict.__ror__, which doesn't accept EnvironHeaders. Since no valid operator is found, pyrefly reports unsupported-operation.

For |= (line 716): ImmutableHeadersMixin.__ior__ returns t.NoReturn. The new logic similarly skips this and can't find a fallback, reporting the error.

core (+1)

This is a regression/false positive. The code at line 224 performs end - start in the else branch of the isinstance check at line 216. The variables start and end are declared as datetime | date (lines 214-215). The isinstance check on line 216 tests isinstance(event.start, datetime) and isinstance(event.end, datetime) — in the else branch, at least one of event.start or event.end is not a datetime, meaning it could be a plain date. After assignment on lines 222-223, start and end retain the type datetime | date (since the negation of the and condition doesn't guarantee both are date).

The error message states - is not supported between date and datetime. This points to a subtle type system issue: when end is date and start is datetime, Python's date.__sub__ may return NotImplemented for a datetime argument, but datetime.__rsub__ handles it correctly. The operation works at runtime because Python's operator dispatch tries __rsub__ on the right operand when __sub__ on the left returns NotImplemented.

The PR's changes to binary operator resolution in pyrefly/lib/alt/operators.rs appear to not properly handle the __rsub__ fallback when __sub__ returns NotImplementedType, or incorrectly handle union types in this context. Neither mypy nor pyright flag this code, confirming it is a false positive introduced by the PR.

Attribution: The change in pyrefly/lib/alt/operators.rs in the try_call_dunder_with_fallbacks function modified how binary operator resolution works. The new logic strips NotImplementedType branches from return types and continues searching reflected dunders. When processing date.__sub__ with a datetime | date argument, the date.__sub__ overload that accepts date returns timedelta, but the overload resolution with datetime as the specific type may be returning NotImplementedType (since date.__sub__ doesn't have a specific datetime overload and relies on datetime being a subclass of date). The new stripping logic may be incorrectly discarding valid results or failing to find a valid reflected dunder (datetime.__rsub__), ultimately producing a Never return type that triggers the unsupported-operation error. Specifically, the ret_without_not_implemented.is_never() check on line ~225 of the new code ([pyrefly/lib/alt/operators.rs](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/alt/operators.rs)) causes the loop to continue past valid results, and if no subsequent dunder call succeeds, the operation is reported as unsupported.

Suggested fixes

Summary: The PR's NotImplementedType stripping logic in try_dunder_call_pairs() incorrectly treats NoReturn/Never return types from dunder methods the same as NotImplementedType, causing false positive unsupported-operation errors across 5 projects.

1. In try_dunder_call_pairs() in [pyrefly/lib/alt/operators.rs](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/alt/operators.rs), the check if ret_without_not_implemented.is_never() on ~line 225 incorrectly skips dunder methods that genuinely return NoReturn/Never (e.g., methods that always raise TypeError). The fix is to skip only when NotImplementedType was actually stripped, i.e., when ret_without_not_implemented != ret AND ret_without_not_implemented.is_never(). If the original ret was already Never (no NotImplementedType was present), the dunder resolved successfully and should be returned. Change the logic to:

if ret_without_not_implemented.is_never() {
    if ret_without_not_implemented != ret {
        // NotImplementedType was stripped and nothing remained: skip to next dunder
        continue;
    } else {
        // Original return was already Never/NoReturn: this is a valid resolution (method raises)
        errors.extend(callee_errors);
        return self.union(successful_ret, ret);
    }
}

This preserves the intended behavior of stripping NotImplementedType while correctly handling NoReturn dunders.
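The distinction this suggestion draws can be checked with a toy predicate. This is a sketch under assumptions, not pyrefly's code: types are modelled as `frozenset`s of member names with the empty set standing in for `Never`, and `should_skip_before` / `should_skip_after` are hypothetical names for the skip condition before and after the suggested change.

```python
NOT_IMPLEMENTED = "NotImplementedType"
NEVER: frozenset[str] = frozenset()  # toy stand-in for Never/NoReturn


def should_skip_before(ret: frozenset[str]) -> bool:
    """Skip whenever nothing is left after stripping NotImplementedType."""
    stripped = ret - {NOT_IMPLEMENTED}
    return stripped == NEVER


def should_skip_after(ret: frozenset[str]) -> bool:
    """Skip only if stripping actually removed something and nothing is left."""
    stripped = ret - {NOT_IMPLEMENTED}
    return stripped == NEVER and stripped != ret


pure_not_implemented = frozenset({NOT_IMPLEMENTED})
no_return = NEVER
concrete = frozenset({"int"})

results = {
    "pure": (should_skip_before(pure_not_implemented), should_skip_after(pure_not_implemented)),
    "noreturn": (should_skip_before(no_return), should_skip_after(no_return)),
    "concrete": (should_skip_before(concrete), should_skip_after(concrete)),
}
```

Only the `noreturn` row differs between the two predicates, which matches the claim that the change affects dunders whose original return type was already `Never`.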

Files: pyrefly/lib/alt/operators.rs
Confidence: high
Affected projects: werkzeug
Fixes: unsupported-operation
The werkzeug errors directly demonstrate this: EnvironHeaders.__or__ and ImmutableHeadersMixin.__ior__ return NoReturn (they raise TypeError). The current code converts NotImplementedType→Never, checks is_never(), and continues. But when the original return type was already Never/NoReturn (no NotImplementedType present), ret_without_not_implemented == ret, and both are Never. The code incorrectly skips these valid dunder resolutions. This fix distinguishes between 'Never because we stripped NotImplementedType' vs 'Never because the method genuinely returns NoReturn'. This directly fixes the 2 werkzeug errors.

2. In try_dunder_call_pairs() in [pyrefly/lib/alt/operators.rs](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/alt/operators.rs), when NotImplementedType is partially stripped from a union return type (ret_without_not_implemented != ret), the code accumulates the stripped result into successful_ret and continues to the next dunder. But when processing union-typed operands (e.g., date | datetime), this accumulation logic changes the resolution order and can cause valid operator paths to be skipped or produce different results than before. Specifically, when a dunder returns SomeType | NotImplementedType, the code strips NotImplementedType, adds SomeType to successful_ret, and continues to try reflected dunders. If the reflected dunder call then FAILS (call_errors is non-empty), it falls through to first_call. The final check if !successful_ret.is_never() returns successful_ret without the reflected dunder's contribution, potentially missing valid type combinations.

The fix: when a dunder returns a partial NotImplementedType union, after stripping NotImplementedType and accumulating the non-NotImplemented part, the code should still try reflected dunders for the NotImplementedType portion. But critically, if the reflected dunder succeeds, its result should also be accumulated into successful_ret. Currently the code does continue which goes to the next (dunder, target, arg) tuple in the loop, but the reflected dunder IS the next tuple, so this part may actually be working. The real issue is that for union operand types like datetime | date, the dunder resolution is called once with the full union, and the overload resolution within the call may return NotImplementedType for some union members but not others. The stripping then loses information about WHICH union members succeeded.

A more targeted fix: before the ret_without_not_implemented.is_never() continue, check if the original ret type is a pure NotImplementedType (not a union containing it). Only skip (continue) if the return is purely NotImplementedType. If it's a union like timedelta | NotImplementedType, accumulate the non-NotImplemented part AND still return it combined with successful_ret (don't continue to try reflected dunders for the whole operand, since part of the operation already succeeded):

if ret_without_not_implemented.is_never() && ret_without_not_implemented != ret {
    // Pure NotImplementedType return: skip to reflected dunder
    continue;
}
if ret_without_not_implemented != ret {
    // Partial NotImplementedType: accumulate successful part and continue
    // to try reflected dunder for the NotImplemented cases
    successful_ret = self.union(successful_ret, ret_without_not_implemented);
    continue;
}
// No NotImplementedType at all: return immediately
errors.extend(callee_errors);
return self.union(successful_ret, ret);

This is essentially what the current code does, but combined with the NoReturn fix above, it should handle the date - datetime case correctly because date.__sub__(datetime) would return timedelta (since datetime is a subclass of date), not NotImplementedType.

Files: pyrefly/lib/alt/operators.rs
Confidence: medium
Affected projects: freqtrade, pandas, core
Fixes: unsupported-operation
The freqtrade, pandas, and core errors all involve date - Timestamp or date - datetime operations on union-typed operands. The changed resolution logic produces different results for these union types. The core issue is that date.__sub__ accepts date and datetime is a subclass of date, so date.__sub__(datetime_instance) should return timedelta. If pyrefly's overload resolution for date.__sub__ doesn't recognize datetime as a valid date argument (perhaps due to strict overload matching), it may return NotImplementedType for that combination. The stripping logic then discards this and tries datetime.__rsub__, which may also have issues. This fix combined with the NoReturn fix should eliminate the 4 unsupported-operation errors across freqtrade, pandas, and core.

3. In try_dunder_call_pairs() in pyrefly/lib/alt/operators.rs, the pandas-stubs regressions (29 assert-type failures + 4 unsupported-operation) stem from the same root cause: the NotImplementedType stripping changes how operator return types are inferred. Previously, operations that resolved to NotImplementedType (or a type containing it) would be returned as-is, and downstream type narrowing would collapse them to Never for truly unsupported operations. Now, the stripping causes different type inference paths.

For the 29 assert-type failures: tests like assert_type(expr, Never) expected that certain invalid operations would infer as Never. The old code would fail to find a valid operator and return Never. The new code's accumulation logic (successful_ret union) may find partial matches that produce a non-Never type, breaking the assertion.

The NoReturn fix (suggestion 1) is the highest priority and most clearly correct. The pandas-stubs issues may partially resolve once the NoReturn/Never distinction is fixed, since some of the 29 failures may involve dunder methods that return NoReturn. For the remaining cases, the accumulation of partial results via successful_ret = self.union(successful_ret, ret_without_not_implemented) followed by continue means that even when later dunders fail, the accumulated partial result is returned instead of falling through to the error path. This is the intended behavior for NotImplementedType handling, but it changes inference for edge cases in pandas stubs.

No additional code change is suggested beyond suggestions 1 and 2; the pandas-stubs issues likely share the same root causes.

Files: pyrefly/lib/alt/operators.rs
Confidence: medium
Affected projects: pandas-stubs
Fixes: assert-type, unsupported-operation
The 33 pandas-stubs errors are all pyrefly-only. The assert-type failures indicate that type inference changed for operations that should resolve to Never. The unsupported-operation errors indicate operations that should succeed now fail. Both are consequences of the same NotImplementedType stripping logic. Fixing the NoReturn distinction (suggestion 1) should address some of these, and the remaining ones may need the accumulation logic to be more careful about when to return accumulated results vs falling through to the error path.



Classification by primer-classifier (5 LLM)


2 participants