
Second revision of the original post: Function overloading / dynamic dispatch for Python


TL;DR:

Improved version of this library[^1], based on previous reviews and criticism. It provides runtime function overloading for Python 3, offering strict type guarantees (unlike typing.overload) by enforcing dispatch at runtime.

Improvements cover debugging (better error messages and an option for debug printing), ergonomics (the need for forward declaration of classes with overloaded methods has been dropped), and performance (a primitive caching strategy reduces dispatch overhead on repeated calls).


Motivation

When I first started using Python, I struggled with some aspects of its dynamically typed nature. Coming from languages with strong type systems, I was accustomed to leveraging type information to define polymorphic (or "overloaded") functions and methods. Initially, I implemented this pattern in Python using ad-hoc type checks and stubs, but this led to excessive boilerplate, brittle function logic, and complications when working with methods.

Overloading in Python has traditionally been handled through various workarounds (using *args/**kwargs, explicit type-checking, or even the typing.overload decorator for stubs). While typing.overload helps with static type checking, it doesn't enforce type guarantees at runtime. This library aims to deliver dynamic dispatch with stricter type checking and enhanced runtime type safety—making it a more robust solution than existing ad-hoc implementations.
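
To make the runtime gap concrete, here is a minimal sketch (the names are illustrative) showing that typing.overload stubs impose nothing at runtime:

from typing import overload

@overload
def double(x: int) -> int: ...
@overload
def double(x: str) -> str: ...

def double(x):
    # The single runtime implementation; the stubs above are erased at runtime.
    return x * 2

# A type checker flags this call, but nothing stops it at runtime:
print(double([1, 2]))  # [1, 2, 1, 2] -- silently "works" on an unsupported type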

A couple of years ago, I developed a library that provides a concise way to define overloaded functions using a decorator. It has been a core part of my Python projects ever since, but until recently, it hadn't been reviewed outside my team. After posting it for review on this site, I received valuable feedback on type inference, performance optimizations, and method overloading ergonomics. This is a revised version that incorporates that feedback, primarily from the two answers by Peilonrayz and Reinderien, and from comments on the question.


Review request

This revision addressed much of the original feedback; nonetheless, I'd like to ask the original, more general questions again:

  1. Does the interface seem ergonomic?
  2. Is the code remotely readable/understandable?
  3. Am I still missing something, i.e., are there any glaring issues with the code?
  4. Could the candidate selection strategy be improved?
  5. Although performance is not a main concern (it is a library for an interpreted language, after all), are there any obvious areas where runtime overhead could be further reduced?
  6. How could I approach a rewrite of __call__ so that overloads return proper types rather than Any, to leverage type-checking tools like MyPy? (A first sketch follows this list.)
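
On question 6, one direction I have considered (a sketch, not part of the library) is parameterizing the dispatcher with ParamSpec. This preserves a single signature end-to-end, but cannot model several signatures at once, so full fidelity would presumably still need typing.overload stubs layered on top:

from typing import Callable, Generic, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")

class TypedDispatch(Generic[P, R]):
    """Hypothetical wrapper: statically typed for exactly one registered signature."""
    def __init__(self, func: Callable[P, R]) -> None:
        self._func = func

    def __call__(self, *args: P.args, **kwargs: P.kwargs) -> R:
        # MyPy now sees the wrapped signature instead of (*Any, **Any) -> Any.
        return self._func(*args, **kwargs)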

While writing this new version and preparing this review request, I've identified some more focused questions:

Type Inference

In the current implementation, unions, container annotations, and numeric compatibility (e.g., int vs. float, complex) have been explicitly addressed; the numeric special case is illustrated after the questions below.

  • Does the logic for handling these cases seem reasonable? Are there additional instances where typed Python diverges from untyped behavior that should be addressed?

  • Could you suggest refinements in the type inference logic (for example, in validate_param_type and validate_container) to improve precision without overly complicating the code?
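
For context, this is the numeric special case in question: per the typing spec, int is accepted where float is expected (and int/float where complex is expected), even though no subclass relationship exists at runtime. A minimal illustration:

def area(scale: float) -> float:
    return scale * 3.0

# Static checkers accept an int argument because the typing spec treats int
# as implicitly compatible with float (and float with complex), even though
# issubclass(int, float) is False at runtime:
print(area(2))  # 6.0 -- the library mirrors this via its numeric_compatible check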

Caching and Runtime Dispatch

The library now includes a caching mechanism to reduce repeated overload resolution overhead.

  • Does the caching strategy (based on parameter type tuples) seem effective for practical scenarios? Do you foresee any pitfalls or potential improvements to further optimize this? (One candidate pitfall is sketched below.)
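
To make the pitfall I have in mind concrete: the cache key records only the outermost type of each argument, so two calls that differ only in their element types collapse to the same key. A minimal sketch mirroring the key construction in __call__:

# The cache key, as built in __call__: outer types only, elements ignored.
def cache_key(*positional, **keywords):
    return (
        tuple(type(p) for p in positional),
        tuple((n, type(v)) for n, v in keywords.items()),
    )

# Both calls produce the identical key ((list,), ()), so a cached resolution
# for list[int] would be reused for list[str] even if a different overload
# should match on a cold (uncached) call:
assert cache_key([1, 2]) == cache_key(["a", "b"])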

Ergonomics

To address the forward declaration shenanigans, this implementation uses deferred initialization via a metaclass.

  • Does the deferred initialization approach for class methods feel intuitive and robust? Are there any improvements or alternative techniques that would enhance usability while preserving the natural interface for overloaded methods?

The Library

"""
==============
sobrecargar.py
==============
"""
__all__ = ['sobrecargar', 'overload']

from inspect import signature as get_signature, Signature, Parameter, currentframe as current_frame, getframeinfo as get_frame_info
from types import MappingProxyType
from typing import Callable, TypeVar, Iterator, ItemsView, Any, List, Tuple, Iterable, Generic, Optional, Union, get_origin, get_args
from collections.abc import Sequence, Mapping
from collections import namedtuple
from functools import partial
from sys import modules, version_info
from itertools import zip_longest
from os.path import abspath as absolute_path

if version_info < (3, 9):
    raise ImportError("Module 'sobrecargar' requires Python 3.9 or higher.")

if version_info < (3, 11):
    # Self and Unpack were only added to typing in 3.11; use the backports before that.
    from typing_extensions import Self, Unpack
else:
    from typing import Self, Unpack
    
class _DeferredOverload(type):
    """Metaclass that handles deferred initialization of overloads, existing only to handle the case of overloading class/instance methods.
    When decorating a function/method with @overload, instead of creating an instance of `overload`, an instance of `_DeferredOverload` is created,
    which behaves *as if* it were `overload` and retains all the state needed to build the real instance later, only when the overloaded
    function or method is called for the first time.
    """
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)

        class _Deferred(object): 
            def __new__(cls_inner, positional, keywords):
                obj = cls.__new__(cls, *positional, **keywords)
                if getattr(obj, "_Deferred__initial_params", None) is None:
                    obj.__initial_params = []
                obj.__initial_params.append((positional, keywords))
                obj.__class__ = cls_inner
                return obj

            def __initialize__(self):
                initial = self.__initial_params
                del self.__dict__['_Deferred__initial_params']
                # Swap the instance back to the real class, then replay every
                # deferred __init__ in registration order.
                super().__setattr__('__class__', cls)
                for positional, keywords in initial:
                    self.__init__(*positional, **keywords)
            def __get__(self, obj, obj_type):
                # After __initialize__ the class swap means these lookups now
                # resolve on the real class, so there is no infinite recursion.
                self.__initialize__()
                return self.__get__(obj, obj_type)
            def __call__(self, *positional, **keywords):
                self.__initialize__()
                return self.__call__(*positional, **keywords)
    
        _Deferred.__name__ = f"{cls.__name__}_Deferred"
        _Deferred.__qualname__ = f"{cls.__qualname__}_Deferred"
        cls._Deferred = _Deferred
        
    def __call__(cls, *positional, **keywords):    
        return cls._Deferred(positional, keywords)
    
    def __instancecheck__(cls, instance):
        return super().__instancecheck__(instance) or isinstance(instance, cls._Deferred)

    def __subclasscheck__(cls, subclass):
        return super().__subclasscheck__(subclass) or (subclass == cls._Deferred)


import __main__

class _sobrecargar(metaclass=_DeferredOverload):
    """
    Class that acts as a decorator for functions, allowing multiple
    versions of a function or method to be defined with different sets of parameters and types.
    This enables function overloading (i.e., dynamic dispatch based on the provided arguments).

    Class Attributes:
        _overloaded (dict): A dictionary that keeps a record of '_sobrecargar' instances created
        for each decorated function or method. The keys are the names of the functions or methods,
        and the values are the corresponding '_sobrecargar' instances.

    Instance Attributes:
        overloads (dict): A dictionary storing the defined overloads for the decorated function or method.
        The keys are Signature objects representing the overload signatures, and the values are the
        corresponding functions or methods.

        __cache (dict): A dictionary that maps parameter type combinations in the call to the underlying
        function object to be called. A simple optimization that reduces the cost for subsequent calls,
        which is very useful in loops.

        __debug (Callable): A lambda that prints diagnostic information if the overload is initialized in debug mode,
        otherwise it does nothing.
    """
    _overloaded : dict[str, '_sobrecargar'] = {}

    def __new__(cls, function: Callable, *positional, **keywords) -> '_sobrecargar':
        """
        Constructor. Creates a unique instance per function name.
        Args:
            function (Callable): The function or method to be decorated.
        Returns:
            _sobrecargar: The instance of the '_sobrecargar' class associated with the provided function name.
        """

        name: str = cls.__full_name(function)
        if name not in cls._overloaded: 
            cls._overloaded[name] = super().__new__(cls)
            cls._overloaded[name].__name = function.__name__
            cls._overloaded[name].__full_name = name

        return cls._overloaded[name]

    def __init__(self, function: Callable, *, cache: bool = True, debug: bool = False) -> None:
        """
        Initializer. Responsible for initializing the overload dictionary (if not already present)
        and registering the current version of the decorated function or method.

        Args:
            function (Callable): The decorated function or method.
            cache (bool): Option indicating whether the overload should use caching.
            debug (bool): Option indicating whether to initialize in debug mode.
        """

        if not hasattr(self, 'overloads'):
            self.overloads : dict[Signature, Callable] = {}

        self.__cache : Optional[dict[tuple[tuple[type[Any], ...], tuple[tuple[str, type[Any]], ...]], Callable[..., Any]]] = (
            self.__cache if getattr(self, "_sobrecargar__cache", None) is not None
            else ({} if cache else None)
        )
        self.__debug = (
            self.__debug if getattr(self, "_sobrecargar__debug", None) is not None
            else ((lambda msg: print(f"[DEBUG] {msg}")) if debug else (lambda msg: None))
        )

        signature_obj: Signature
        underlying_function: Callable
        signature_obj, underlying_function = type(self).__unwrap(function)

        self.__debug(f"Overload registered for: {self.__name}. Signature: {signature_obj}")
        if type(self).__is_method(function):
            cls: type = type(self).__get_class(function)
            self.__debug(f"{self.__name} is a method of {cls}.")
            self.__debug(f"{self.__name} is a method of {cls}.")
            for ancestor in cls.__mro__:
                for base in ancestor.__bases__:
                    if base is object: break
                    full_method_name: str = f"{base.__module__}.{base.__name__}.{function.__name__}"
                    if full_method_name in type(self)._overloaded:
                        base_overload: '_sobrecargar' = type(self)._overloaded[full_method_name]
                        self.overloads.update(base_overload.overloads)

        self.overloads[signature_obj] = underlying_function
        if not self.__doc__: self.__doc__ = ""
        self.__doc__ += f"\n{function.__doc__ or ''}"
            
    def __call__(self, *positional, **keywords) -> Any:
        """
        Method that allows the decorator instance to be called as a function.
        The core engine of the module. It validates the provided parameters and builds a tuple
        of 'candidates' from functions that match the provided parameters. It prioritizes the overload
        that best fits the types and number of arguments. If several candidates match, it propagates the result
        of the most specific one.

        If caching is enabled, the selected function is stored for later calls.

        Args:
            *positional: Positional arguments passed to the function or method.
            **keywords: Keyword arguments passed to the function or method.

        Returns:
            Any: The result of the selected version of the decorated function or method.

        Raises:
            TypeError: If no compatible overload exists for the provided parameters.
        """

        if self.__cache is not None:
            parameters = (
                tuple(type(p) for p in positional), 
                tuple((n, type(v)) for n, v in keywords.items()),
            )
            if parameters in self.__cache:
                func = self.__cache[parameters]
                self.__debug(
                        f"Cached call for {self.__name}"
                        f"\n\tProvided positional parameters: {', '.join(f'{type(p).__name__} [{repr(p)}]' for p in positional)}"
                        f"\n\tProvided keyword parameters: {', '.join(f'{k}: {type(v).__name__}  [{v}]' for k, v in keywords.items())}"
                        f"\n\tCached signature: {get_signature(func)}"
                    )

                return func(*positional, **keywords)
            
        
        self.__debug(
                f"Starting candidate selection for {self.__name}"
                f"\n\tProvided positional parameters: {', '.join(f'{type(p).__name__} [{repr(p)}]' for p in positional)}"
                f"\n\tProvided keyword parameters: {', '.join(f'{k}: {type(v).__name__} [{v}]' for k, v in keywords.items())}"
                f"\n\tSupported overloads:"
                f"\n" + "\n".join(
                    f"\t- {', '.join(f'{v}' for v in dict(sig.parameters).values())}"
                    for sig in self.overloads.keys()
                )
            )

        _C = TypeVar("_C", bound=Sequence)
        _T = TypeVar("_T", bound=Any)
        Candidate = namedtuple('Candidate', ['score', 'function_object', "function_signature"])
        candidates: List[Candidate] = []

        def validate_container(value: _C, container_param: Parameter) -> int | bool:
            type_score: int = 0

            container_annotation = container_param.annotation

            if not hasattr(container_annotation, "__origin__") or not hasattr(container_annotation, "__args__"):
                type_score += 1
                return type_score

            if get_origin(container_annotation) is Union:
                if not issubclass(type(value), get_args(container_annotation)):
                    return False
            elif not issubclass(type(value), container_annotation.__origin__): 
                return False
            container_args: Tuple[type, ...] = container_annotation.__args__
            has_ellipsis: bool = Ellipsis in container_args
            has_single_type: bool = len(container_args) == 1 or has_ellipsis

            if has_ellipsis:
                # tuple[X, ...]: treat the Ellipsis slot as a repetition of X.
                aux_list: list = list(container_args)
                aux_list[1] = aux_list[0]
                container_args = tuple(aux_list)

            type_iterator: Iterator
            if has_single_type:
                type_iterator = zip_longest((type(t) for t in value), container_args, fillvalue=container_args[0])
            else:
                type_iterator = zip_longest((type(t) for t in value), container_args)

            if value and not issubclass(type(value[0]), container_args[0]):
                return False

            for received_type, expected_type in type_iterator:
                if expected_type is None: 
                    return False
                if received_type == expected_type:
                    type_score += 2               
                elif issubclass(received_type, expected_type):
                    type_score += 1
                else:
                    return False
            return type_score

        def validate_param_type(value: _T, func_param: Parameter) -> int | bool:
            type_score: int = 0

            expected_type = func_param.annotation 
            received_type: type[_T] = type(value)

            is_untyped: bool = (expected_type == Any)
            default_value: _T = func_param.default
            is_null: bool = value is None and default_value is None

            is_default: bool = value is None and default_value is not func_param.empty
            param_is_self: bool = func_param.name == 'self' or func_param.name == 'cls'
            
            param_is_var_pos: bool = func_param.kind == func_param.VAR_POSITIONAL 
            param_is_var_kw: bool = func_param.kind == func_param.VAR_KEYWORD  
            param_is_variable: bool = param_is_var_pos or param_is_var_kw
            param_is_union: bool = hasattr(expected_type, "__origin__") and get_origin(expected_type) is Union
            param_is_container: bool = (hasattr(expected_type, "__origin__") or (issubclass(expected_type, Sequence) and not issubclass(expected_type, str)) or issubclass(expected_type, Mapping)) and not param_is_union
            
            numeric_compatible: bool = (issubclass(expected_type, complex) and issubclass(received_type, (float, int))
                                          or issubclass(expected_type, float) and issubclass(received_type, int))
            """Check the special case where typed Python diverges from untyped Python.
                See: https://typing.python.org/en/latest/spec/special-types.html#special-cases-for-float-and-complex
            """

            is_different_type: bool
            if param_is_variable and param_is_container and param_is_var_pos:
                expected_type = expected_type.__args__[0] if get_origin(type(expected_type)) is Unpack else expected_type
                is_different_type = not issubclass(received_type, expected_type.__args__[0])
            elif param_is_variable and param_is_container and param_is_var_kw:
                expected_type = expected_type.__args__[0] if get_origin(type(expected_type)) is Unpack else expected_type
                is_different_type = not issubclass(received_type, expected_type.__args__[1])
            elif param_is_union:
                is_different_type = not issubclass(received_type, get_args(expected_type))
            elif param_is_container:
                is_different_type = not validate_container(value, func_param)
            else:
                is_different_type = not (
                    issubclass(received_type, expected_type)
                    or numeric_compatible
                )
            
            if not is_untyped and not is_null and not param_is_self and not is_default and is_different_type:
                return False
            elif param_is_variable and not param_is_container: 
                type_score += 1
            else:
                if param_is_variable and param_is_container and param_is_var_pos:
                    if received_type == expected_type.__args__[0]:
                        type_score += 3
                    elif issubclass(received_type, expected_type.__args__[0]):
                        type_score += 1  
                elif param_is_variable and param_is_container and param_is_var_kw:
                    if received_type == expected_type.__args__[1]:
                        type_score += 3
                    elif issubclass(received_type, expected_type.__args__[1]):
                        type_score += 1  
                elif param_is_container:
                    type_score += validate_container(value, func_param)
                elif received_type == expected_type:
                    type_score += 5
                elif issubclass(received_type, expected_type):
                    type_score += 4
                elif numeric_compatible:
                    type_score += 3
                elif is_default:  
                    type_score += 2
                elif is_null or param_is_self or is_untyped:
                    type_score += 1

            return type_score

        def validate_signature(func_params: MappingProxyType[str, Parameter], positional_count: int, positional_iterator: Iterator[tuple], keyword_view: ItemsView) -> int | bool:
            signature_score: int = 0

            this_score: int | bool
            for pos_value, pos_name in positional_iterator:
                this_score = validate_param_type(pos_value, func_params[pos_name])
                if this_score:
                    signature_score += this_score 
                else:
                    return False
            
            for key_name, key_value in keyword_view:
                if key_name not in func_params and type(self).__has_var_kw(func_params):
                    var_kw: Optional[Parameter] = next((p for p in func_params.values() if p.kind == p.VAR_KEYWORD), None)
                    if var_kw is not None:
                        this_score = validate_param_type(key_value, var_kw)
                    else:
                        return False
                elif key_name not in func_params:
                    return False
                else:
                    this_score = validate_param_type(key_value, func_params[key_name])
                if this_score:
                    signature_score += this_score 
                else:
                    return False

            
            return signature_score

        for sig, function in self.overloads.items():

            length_score: int = 0
            
            func_params: MappingProxyType[str, Parameter] = sig.parameters
            
            positional_count: int = len(func_params) if type(self).__has_var_pos(func_params) else len(positional) 
            keyword_count: int = len({key: keywords[key] for key in func_params if key in keywords}) if (type(self).__has_var_kw(func_params) or type(self).__has_only_kw(func_params)) else len(keywords)
            default_count: int = type(self).__has_default(func_params) or 0
            positional_iterator: Iterator[tuple[Any, str]] = zip(positional, list(func_params)[:positional_count]) 
            keyword_view: ItemsView[str, Any] = keywords.items()
            
            if (len(func_params) == 0 or not (type(self).__has_variables(func_params) or type(self).__has_default(func_params))) and len(func_params) != (len(positional) + len(keywords)):
                continue             
            if len(func_params) - (positional_count + keyword_count) == 0 and not (type(self).__has_variables(func_params) or type(self).__has_default(func_params)):
                length_score += 3
            elif len(func_params) - (positional_count + keyword_count) == 0:
                length_score += 2
            elif (0 <= len(func_params) - (positional_count + keyword_count) <= default_count) or (type(self).__has_variables(func_params)):
                length_score += 1
            else:
                continue

            signature_validation_score: int | bool = validate_signature(func_params, positional_count, positional_iterator, keyword_view)
            if signature_validation_score:
                candidate: Candidate = Candidate(score=(length_score + 2 * signature_validation_score), function_object=function, function_signature=sig)
                candidates.append(candidate)
            else:
                continue
        if candidates:
            if len(candidates) > 1:
                candidates.sort(key=lambda c: c.score, reverse=True)
            self.__debug(f"Candidates: \n\t- " + "\n\t- ".join(' | '.join([str(i) for i in c if not callable(i)]) for c in candidates))
            best_function = candidates[0].function_object
            if self.__cache is not None:
                parameters = (
                    tuple(type(p) for p in positional),
                    tuple((n, type(v)) for n, v in keywords.items()),
                )
                self.__cache.update({
                    parameters: best_function
                })
            return best_function(*positional, **keywords)
        else:
            call_frame = current_frame().f_back
            frame_info = get_frame_info(call_frame)
            if "return self.__call__(*positional,**keywords)" in frame_info.code_context and frame_info.function == "__call__":
                frame_info = call_frame.f_back
            raise TypeError(
                f"[ERROR] Could not call {function.__name__} in {absolute_path(frame_info.filename)}:{frame_info.lineno} " 
                f"\n\tProvided parameters" 
                f"\n\t- Positional: {', '.join(p.__name__ for p in map(type, positional))}"
                f"\n\t- Keywords: {', '.join(f'{k}: {type(v).__name__}' for k, v in keywords.items())}"
                f"\n"
                f"\n\tSupported overloads:\n" +
                "\n".join(
                    f"\t- {', '.join(f'{v}' for v in dict(sig.parameters).values())}"
                    for sig in self.overloads.keys()
                )
            )
    
    def __get__(self, obj, obj_type):
        class OverloadedMethod:
            __doc__ = self.__doc__
            __call__ = partial(self.__call__, obj) if obj is not None else partial(self.__call__, obj_type)

        return OverloadedMethod()

    @staticmethod
    def __unwrap(function: Callable) -> Tuple[Signature, Callable]:
        while hasattr(function, '__func__'):
            function = function.__func__
        while hasattr(function, '__wrapped__'):
            function = function.__wrapped__

        sig: Signature = get_signature(function)
        return (sig, function)

    @staticmethod
    def __full_name(function: Callable) -> str:
        return f"{function.__module__}.{function.__qualname__}"

    @staticmethod
    def __is_method(function: Callable) -> bool:
        return function.__name__ != function.__qualname__ and "<locals>" not in function.__qualname__.split(".")

    @staticmethod
    def __is_nested(function: Callable) -> bool:
        return function.__name__ != function.__qualname__ and "<locals>" in function.__qualname__.split(".")

    @staticmethod
    def __get_class(method: Callable) -> type:
        return getattr(modules[method.__module__], method.__qualname__.split(".")[0])

    @staticmethod
    def __has_variables(func_params: MappingProxyType[str, Parameter]) -> bool:
        return _sobrecargar.__has_var_pos(func_params) or _sobrecargar.__has_var_kw(func_params)

    @staticmethod
    def __has_var_pos(func_params: MappingProxyType[str, Parameter]) -> bool:
        for param in func_params.values():
            if param.kind == Parameter.VAR_POSITIONAL: 
                return True
        return False

    @staticmethod
    def __has_var_kw(func_params: MappingProxyType[str, Parameter]) -> bool:
        for param in func_params.values():
            if param.kind == Parameter.VAR_KEYWORD: 
                return True
        return False

    @staticmethod
    def __has_default(func_params: MappingProxyType[str, Parameter]) -> int | bool:
        default_count: int = 0
        for param in func_params.values():
            if param.default is not param.empty: 
                default_count += 1
        return default_count if default_count else False 
    
    @staticmethod
    def __has_only_kw(func_params: MappingProxyType[str, Parameter]) -> bool:
        for param in func_params.values():
            if param.kind == Parameter.KEYWORD_ONLY: 
                return True
        return False


def sobrecargar(*args, cache: bool = True, debug: bool = False) -> Callable:
    """Function decorator that transforms functions into overloads.
    **Parameters:** 
        :param Callable f: the function to be overloaded.
        :param bool cache: indicates whether to cache the dispatch result. Default: True.
        :param bool debug: indicates whether to print diagnostic information. Default: False.
    
    **Returns:**  
        :return Callable: the decorator.
    ---  
    """

    if args and callable(args[0]):
        return _sobrecargar(args[0], cache=cache, debug=debug)
    def decorator(f):
        if debug:
            frame_info = get_frame_info(current_frame().f_back)
            print(
                f"[DEBUG] Function overload."
                f"\n\t{f.__name__} in {absolute_path(frame_info.filename)}:{frame_info.lineno}"
                f"\n\t- cache = {cache}"
                f"\n\t- debug = {debug}"
            )
        return _sobrecargar(f, cache=cache, debug=debug)
    return decorator

# Alias
overload = sobrecargar


if __name__ == '__main__': 
    print(__doc__)

English documentation

Overview

Sobrecargar is a Python library that enables function overloading using a decorator. It allows defining multiple versions of the same function or method, dynamically selecting the most appropriate one based on the provided arguments. This facilitates a structured and type-safe approach to function polymorphism.

Python does not natively support function overloading, requiring alternative approaches such as:

  • *args and **kwargs with manual type checking
  • Writing separate functions for different cases
  • Using typing.overload, which only provides static validation but does not affect runtime behavior

Sobrecargar eliminates repetitive code and improves safety by enforcing strict type checks at runtime.
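
For comparison, the first workaround in its usual hand-rolled form (illustrative only):

def process(*args):
    # Manual type checking: every new case grows this if/elif ladder.
    if len(args) == 1 and isinstance(args[0], int):
        print(f"Processing an integer: {args[0]}")
    elif len(args) == 1 and isinstance(args[0], str):
        print(f"Processing a string: {args[0]}")
    else:
        raise TypeError("Unsupported argument types")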

Overloading a Function

You can use @sobrecargar or its alias @overload to define multiple versions of a function with different signatures.

from sobrecargar import sobrecargar  

@sobrecargar  
def process(value: int):  
    print(f"Processing an integer: {value}")  

@sobrecargar  
def process(value: str):  
    print(f"Processing a string: {value}")  

process(42)   # Processing an integer: 42  
process("Hello")  # Processing a string: Hello  

Overloading a Class Method

Note: Since version 3.0.2, method overloading is handled the same way as function overloading.

from sobrecargar import sobrecargar  

class MyClass:  
    @sobrecargar  
    def show(self, value: int):  
        print(f"Received integer: {value}")  

    @sobrecargar  
    def show(self, value: str):  
        print(f"Received string: {value}")  

obj = MyClass()  
obj.show(10)    # Received integer: 10  
obj.show("Hello")  # Received string: Hello  

Example with Caching and Debugging

Since version 3.1.0, the decorator allows enabling caching and debug mode:

@sobrecargar(cache=True, debug=True)  
def calculate(a: float, *args: int):  
    return a * sum(args)  

@sobrecargar  # cache=True and debug=True inherited from the first overload  
def calculate(a: float, b: float):  
    return a * b  

floats: Iterable[tuple[float,float]] = ...
for a,b in floats: 
    calculate(a,b)  # In this scenario the overload resolution logic
                    # only runs on the first iteration of the loop;
                    # subsequent calls only incur the cost of
                    # looking up the overload cache.

Configuration Options

The @sobrecargar decorator accepts the following configuration parameters:

| Parameter | Description                               | Default Value | Since Version |
|-----------|-------------------------------------------|---------------|---------------|
| cache     | Enables caching for overload resolution.  | True          | 3.1.X         |
| debug     | Prints debugging messages to the console. | False         | 3.1.X         |

Note: If one overload declares a configuration parameter, it applies to all overloads of the same function.


Example use case

Revised version of the business example from the original post.

By far the most frequent use case I personally have for function overloading is class constructor overloading.

Consider some rudimentary database record model. Given a table Products:

| id  | SKU        | Title                     | Artist (FK) | Description | Format | Price | ... |
|-----|------------|---------------------------|-------------|-------------|--------|-------|-----|
| 1   | A-123-C-77 | Jazz in Ba                | 899         | ...         | CD     | 5.99  | ... |
| 2   | A-705-V-5  | We'll be together at last | 7566        | ...         | Vynil  | 8.99  | ... |
| 3   | B-905-C-5  | Ad Cordis                 | 123         | ...         | CD     | 3.99  | ... |
| 4   | B-101-C-77 | Brain Damage              | 1222        | ...         | CD     | 3.99  | ... |
| ... | ...        | ...                       | ...         | ...         | ...    | ...   | ... |

One could define a class Product that represents that table:

class SomeDbAbstraction: 
    ...
    def run_query(self, query: str, ...) -> dict[str, Any]: ...
    def get_insert_id(self) -> int: ...
    ...

class Format(Enum):
    _invalid = 0
    CD = 1
    Vynil = 2

class Artist: ...

class Product:
    __slots__ = (
        "__id",
        "sku",
        "title",
        "artist",
        "description",
        "format",
        "price",
        ...
    )

    def __init__(
        self,
        id : int,
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        self.__id = id
        self.sku = sku
        self.title = title
        self.artist = artist
        self.description = description
        self.format = format
        self.price = price
        ...

Now, imagine this model is being consumed by the controller, and let's say that the values for each column can come from varied sources, e.g., a read from the database, a JSON API endpoint, an HTTP form, some other Python code, &c.

So a need arises for a way to construct valid records from whatever data is available. An overly complicated solution with factory patterns and the like is possible, but without deep diving into an ocean of abstraction layers we essentially have two pythonic approaches:

1. Using Distinct Functions

# Approach 1: Distinct functions for each use case

def product_new(db: SomeDbAbstraction, sku: str, title: str, artist: Artist,
                description: str, format: Format, price: float) -> Product:
    """Creates a new product record with a complete dataset (artist provided as an instance)."""
    new_id = db.run_query(
        "INSERT INTO Product (sku, title, artist, description, format, price) VALUES (%s, %s, %s, %s, %s, %s);",
        sku, title, artist.id, description, format.name, price
    ).get_insert_id()
    return Product(new_id, sku, title, artist, description, format, price)


def product_from_id(db: SomeDbAbstraction, id: int) -> Product:
    """Retrieves a product record by primary key."""
    record = db.run_query("SELECT * FROM Product WHERE id = %s;", id)
    return Product(
        record.get("id"),
        record.get("sku"),
        record.get("title"),
        Artist.from_id(db, record.get("artist")),
        record.get("description"),
        Format(record.get("format")),
        record.get("price")
    )


def product_from_sku(db: SomeDbAbstraction, sku: str) -> Product:
    """Retrieves a product record by SKU."""
    record = db.run_query("SELECT * FROM Product WHERE sku = %s;", sku)
    return Product(
        record.get("id"),
        record.get("sku"),
        record.get("title"),
        Artist.from_id(db, record.get("artist")),
        record.get("description"),
        Format(record.get("format")),
        record.get("price")
    )

In each of those cases the user of the model needs to explicitly choose the right function for their case, adding a new case requires a new function, and, most importantly: **any change to the Product class API (particularly in the initialization method) requires modifying each utility function** (and potentially breaking the model's overall API). It adds cognitive overhead: in my opinion it kneecaps one of the defining characteristics of dynamically-typed languages, because it imposes the requirement that the caller know with great certainty which types it's passing to the callee.

2. Using typing.overload with Stubs and a Unified Implementation

Now, this is a common enough problem domain that the typing standard library introduced an attempt at a solution: a half-implementation of the overload pattern. It is used both by standard library modules and by a whole bunch of widely-used libraries. A solution using typing.overload for Product initialization could look like:

import decimal
from typing import Optional, overload

class Product:
    @overload
    def __init__(self, db: SomeDbAbstraction, sku: str, title: str, artist: Artist,
                 description: str, format: Format, price: float) -> None: ...
    @overload
    def __init__(self, db: SomeDbAbstraction, id: int) -> None: ...
    @overload
    def __init__(self, db: SomeDbAbstraction, sku: str) -> None: ...

    def __init__(self, db: SomeDbAbstraction, *args) -> None:
        """
        Unified implementation for creating or retrieving a Product.

        The overloads above provide distinct type signatures, while this single implementation
        determines the operation based on the number and type of arguments.
        """
        id: int = 0
        sku: str = ""
        title: str = ""
        artist: Optional[Artist] = None
        description: str = ""
        fmt: Format = Format(0)
        price: decimal.Decimal = decimal.Decimal(0)
        match len(args):
            case 1:
                # Single argument: could be an int (id) or str (sku)
                if isinstance(args[0], int):
                    record_info = db.run_query("SELECT * FROM Product WHERE id = %s;", args[0])
                elif isinstance(args[0], str):
                    record_info = db.run_query("SELECT * FROM Product WHERE sku = %s;", args[0])
                else:
                    raise TypeError("Invalid type for single argument; expected int or str.")

                id = record_info.get("id")
                sku = record_info.get("sku")
                title = record_info.get("title")
                artist = Artist.from_id(db, record_info.get("artist"))
                description = record_info.get("description")
                fmt = Format(record_info.get("format"))
                price = decimal.Decimal(record_info.get("price"))

            case 6:
                # Full dataset provided: assume order is sku, title, artist, description, format, price.
                sku, title, artist, description, fmt, price = args
                id = db.run_query(
                    "INSERT INTO Product (sku, title, artist, description, format, price) VALUES (%s, %s, %s, %s, %s, %s);",
                    sku, title, artist.id, description, fmt.name, price
                ).get_insert_id()
            case _:
                raise TypeError("Invalid number of arguments for creating or retrieving a Product.")
        self.id = id
        self.sku = sku
        self.title = title
        self.description = description
        self.artist = artist
        self.format = fmt
        self.price = price

This example is far from perfect. Stricter **kwargs handling (instead of implicitly positioned *args) would improve it, at the cost of added runtime overhead. But the core problem remains. What typing.overload offers by itself amounts to a standard template for LSPs and type checkers. It gives no type assurances (those are deferred to the actual unified implementation); it parasitizes duck typing in such a way that the actual implementation often gets muddied with ad-hoc type checking; it requires extra boilerplate; and, as it's only a hint, whenever the exact same set of instructions can't handle the diverse inputs and the function can't just "plow ahead regardless" (as in the example), branches need to be introduced, imposing a runtime cost and potentially leading to brittle, coupled code.

I do agree that, between those two options, option 1 seems better for most scenarios. If the wide-API cognitive overhead is manageable, the explicit approach is cleaner, produces no runtime overhead, and favours easier, direct documentation of the API. I feel the costs of the typing.overload approach generally outweigh its gains.

With that in mind, consider the solution this library proposes:

3. sobrecargar

from sobrecargar import overload  # 'overload' is an alias for the library decorator

class Product:
    @overload
    def __init__(self, db: SomeDbAbstraction, sku: str, title: str, artist: Artist,
                 description: str, format: Format, price: float) -> None:
        """
        Overload for creating a new product with a full dataset (artist instance).
        
        Inserts a new record and initializes the Product instance.
        """
        new_id = db.run_query(
            "INSERT INTO Product (sku, title, artist, description, format, price) VALUES (%s, %s, %s, %s, %s, %s);",
            sku, title, artist.id, description, format.name, price
        ).get_insert_id()
        self.__id = new_id
        self.sku = sku
        self.title = title
        self.artist = artist
        self.description = description
        self.format = format
        self.price = price

    @overload
    def __init__(self, db: SomeDbAbstraction, id: int) -> None:
        """
        Overload for retrieving a product by its primary key.
        
        Fetches the record from the database and initializes the Product instance.
        """
        record = db.run_query("SELECT * FROM Product WHERE id = %s;", id)
        self.__id = record.get("id")
        self.sku = record.get("sku")
        self.title = record.get("title")
        self.artist = Artist.from_id(db, record.get("artist"))
        self.description = record.get("description")
        self.format = Format(record.get("format"))
        self.price = record.get("price")

    @overload
    def __init__(self, db: SomeDbAbstraction, sku: str) -> None:
        """
        Overload for retrieving a product by SKU.
        
        Fetches the record from the database and initializes the Product instance.
        """
        record = db.run_query("SELECT * FROM Product WHERE sku = %s;", sku)
        self.__id = record.get("id")
        self.sku = record.get("sku")
        self.title = record.get("title")
        self.artist = Artist.from_id(db, record.get("artist"))
        self.description = record.get("description")
        self.format = Format(record.get("format"))
        self.price = record.get("price")


Some thoughts on typing, the "overload pattern", and the current state of dynamic dispatch in Python.

The previous version of this library also received this review, which raised a lot of valid points against the pattern and, in consequence, against the existence of this library.

Amongst other arguments, the review pointed out:

  1. The library imposes a runtime cost;
  2. The pattern imposes debugging cost;
  3. The library goes against some pythonic principles, in particular: explicit over implicit, plow-ahead duck typing.

The primary indication was that this shouldn't be done. The review proposed two alternatives:

  1. Without using this library, plenty of existing code still uses signature overloading via *args, **kwargs. This can be annotated with typing.overload. However, this still lends itself to more complexity than is necessary.
  2. Just... write different functions with different names. No run-time type checking, no metaclasses, no decorators, no overloads. Add type hints for each signature, and if appropriate make those thin wrapper functions around a core logic function.

I'd like to state the case for the pattern:

Typed Python
I take Python's dynamic typing, and particularly the duck-typing philosophy, to be one of the language's most important features. The capacity to rely on the interpreter's ability to just plow ahead makes Python development a fast, dynamic and rich experience that favours rapid iteration. However, dynamic typing comes at a cost. For starters, statically analyzing the correctness of a program becomes orders of magnitude more difficult, and it is very hard for a programmer to hold in mind all the potential runtime types a given codebase is managing, at each step of the branching control flow, for all possible states of the program. Both the Python Software Foundation and the broader community have taken notice of this trade-off for years, and have introduced a plethora of language features, libraries and tooling to address it. Python codebases do in fact benefit from the existence of type hints, type checkers, runtime reflection libraries, &c., and a lot of typed Python projects sooner or later encounter some use case(s) for polymorphic functions and methods. Overloading is used. A lot.

Current state of Python overloading
I argue the overloading pattern is a bridge between Python's duck-typing philosophy and type hints: it re-introduces the ability to "not care that much about types" at the call site and facilitates the "plow ahead" approach, while still letting the developer reap the benefits of types. Nonetheless, support for the pattern is poor. It is true that a lot of libraries expose APIs that are "overloaded beyond recognition", so that it's often impossible to determine what they do and don't support without guessing. But this is a problem with the pattern's implementation, not the pattern itself. That is: typing.overload is broken, but that shouldn't mean a better implementation of dynamic function dispatch couldn't be tried.

As I see it, the greatest problem with typing.overload is that it lies. It offers hints for multiple signatures but makes no guarantees about the implementation that actually has to handle them. Indeed, calling typing.overloaded code often means spending a lot of effort trying to understand heavily branched, ad-hoc type checks, with little context, inside catch-all implementations that can, and often do, simply fail to handle the set of cases their signatures say they should. singledispatch tries to address this, but it ultimately works for a very narrow set of use cases, and introduces a typing syntax that differs from the already established type-hint syntax.
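
To illustrate the narrowness: functools.singledispatch resolves solely on the type of the first positional argument, so multi-argument dispatch is out of reach:

from functools import singledispatch

@singledispatch
def describe(value):
    raise NotImplementedError(f"Unsupported type: {type(value).__name__}")

@describe.register
def _(value: int):
    print(f"An integer: {value}")

@describe.register
def _(value: str):
    print(f"A string: {value}")

describe(3)        # dispatches on the first (and only) argument's type
describe("three")  # a second argument's type could never influence dispatch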

So...
Typed Python isn't going anywhere, and within typed codebases the pattern offers a lot. Evidence of that can be found in the fact that, even in the current state of support, overloading is widely used both in the standard library and in popular third-party libraries.

The pattern would benefit from type-correctness assurances, simplified overload definition, a consistent set of rules for dispatch resolution (as opposed to ad-hoc type checks), and a better debugging experience.


[^1]: Note: both the library and its documentation are written in Spanish; this post presents only a translation.
[^2]: Note: overload is an alias for sobrecargar baked into the library.

Comments

  • J_H: I appreciate the motivation, e.g. "Overloading is used. A lot.", I just don't agree with it. // Thank you for the process(42) example. Of course I would write value: int | str in the signature, which is a 3rd type, one that str() already accepts. That's what the f-string ultimately calls. It's common to e.g. accept file: Path | str and unconditionally assign filepath = Path(file) to collapse the type. Similarly with forcing list of list of float to a numpy NDArray. New code, linted with mypy --strict, does this cleanly. // @beartype solves the dynamic "enforce at runtime" problem. (Mar 23 at 18:24)

  • HernanATN: @J_H I agree that using Union or | is better whenever "plow ahead" duck typing and type coercion are the only requirements. It is, definitely, better. The overloading pattern exists for whenever some portion of the program's behaviour needs to switch on some type, or when type casting isn't as straightforward as "treat this as T" but processing is needed. Sometimes you don't have the exact data you need, but all the information relevant to construct it. Polymorphism is a useful tool. (Mar 23 at 23:20)

  • HernanATN: @J_H Just to drive the point home: even for that very simple toy example, a Union implementation equivalent to @sobrecargar would look like: def process(value: str | int): if not isinstance(value, (str, int)): raise TypeError("..."); print(f"Processing a {'string' if isinstance(value, str) else 'integer'}: {value}"). I haven't looked into @beartype; I'll research it. Thanks! (Mar 23 at 23:22)

  • HernanATN: Also note that pathlib.PurePath (the base for pathlib.Path) has two explicit runtime type checks and one implicit type check in its __init__ method. It first checks whether it is behaving as a "copy constructor" (isinstance(arg, PurePath)), then implicitly checks for an os.PathLike (by trying to call fspath on arg), and then explicitly checks for str. Thus: 1. type-checking overhead is present in that "collapse" (filepath = Path(file)), just further down the call stack; 2. that "collapse" can throw, so the function can throw in the middle; 3. that "collapse" is more permissive than the function signature Path | str. (Mar 24 at 0:50)

  • J_H: I routinely rely on mypy --strict plus pyright ., which largely do the same thing but each offers a few non-overlapping advantages over the other. Often it works out just fine, linting clean. Sometimes it does not, typically when calling into some crazy pypi library. In which case I occasionally find @beartype annotations worthwhile, as the runtime diagnostic messages are very helpful. Elapsed time overhead is typically pretty low, so having done that I might leave the decorators in, or might choose to remove them. On the whole: very helpful. (Mar 24 at 4:47)
