248

What must I do to use my objects of a custom type as keys in a Python dictionary (where I don't want the "object id" to act as the key) , e.g.

class MyThing:
    def __init__(self,name,location,length):
            self.name = name
            self.location = location
            self.length = length

I'd want to use MyThing's as keys that are considered the same if name and location are the same. From C#/Java I'm used to having to override and provide an equals and hashcode method, and promise not to mutate anything the hashcode depends on.

What must I do in Python to accomplish this ? Should I even ?

(In a simple case, like here, perhaps it'd be better to just place a (name,location) tuple as key - but consider I'd want the key to be an object)

3
  • What's wrong with using the hash? Commented Feb 4, 2011 at 18:55
  • 8
    Probably because he wants two MyThing, if they have the same name and location, to index the dictionary to return the same value, even if they were created separately as two different "objects".
    – Santa
    Commented Feb 4, 2011 at 19:00
  • 2
    "perhaps it'd be better to just place a (name,location) tuple as key - but consider I'd want the key to be an object)" You mean: a NON-COMPOSITE object ?
    – eyquem
    Commented Feb 4, 2011 at 21:20

6 Answers 6

293

You need to add 2 methods, note __hash__ and __eq__:

class MyThing:
    def __init__(self,name,location,length):
        self.name = name
        self.location = location
        self.length = length

    def __hash__(self):
        return hash((self.name, self.location))

    def __eq__(self, other):
        return (self.name, self.location) == (other.name, other.location)

    def __ne__(self, other):
        # Not strictly necessary, but to avoid having both x==y and x!=y
        # True at the same time
        return not(self == other)

The Python dict documentation defines these requirements on key objects, i.e. they must be hashable.

17
  • 22
    hash(self.name) looks nicer than self.name.__hash__(), and if you do and you can do hash((x, y)) to avoid XORing yourself. Commented Feb 4, 2011 at 19:02
  • 5
    As an additional note, I just discovered that calling x.__hash__() like that is also wrong, because it can produce incorrect results: pastebin.com/C9fSH7eF Commented Feb 4, 2011 at 20:32
  • @Rosh Oxymoron: thank you for the comment. When writing I was using explicit and for __eq__ but then I thought "why not using tuples?" because I often do that anyway (I think it's more readable). For some strange reason my eyes didn't go back to question about __hash__ however.
    – 6502
    Commented Feb 4, 2011 at 20:39
  • 2
    @user877329: are you trying to use some blender data structure as keys? Apparently from some repos certain objects require you to "freeze" them first to avoid mutability (mutating a value-based object that has been used as a key in a python dictionary is not permitted)
    – 6502
    Commented Apr 24, 2016 at 7:22
  • 1
    @kawing-chiu pythonfiddle.com/eq-method-needs-ne-method <-- this shows the "bug" in Python 2. Python 3 does not have this problem: the default __ne__() has been "fixed".
    – Bob Stein
    Commented Jun 8, 2016 at 12:39
39

An alternative in Python 2.6 or above is to use collections.namedtuple() -- it saves you writing any special methods:

from collections import namedtuple
MyThingBase = namedtuple("MyThingBase", ["name", "location"])
class MyThing(MyThingBase):
    def __new__(cls, name, location, length):
        obj = MyThingBase.__new__(cls, name, location)
        obj.length = length
        return obj

a = MyThing("a", "here", 10)
b = MyThing("a", "here", 20)
c = MyThing("c", "there", 10)
a == b
# True
hash(a) == hash(b)
# True
a == c
# False
23

You override __hash__ if you want special hash-semantics, and __cmp__ or __eq__ in order to make your class usable as a key. Objects who compare equal need to have the same hash value.

Python expects __hash__ to return an integer, returning Banana() is not recommended :)

User defined classes have __hash__ by default that calls id(self), as you noted.

There is some extra tips from the documentation.:

Classes which inherit a __hash__() method from a parent class but change the meaning of __cmp__() or __eq__() such that the hash value returned is no longer appropriate (e.g. by switching to a value-based concept of equality instead of the default identity based equality) can explicitly flag themselves as being unhashable by setting __hash__ = None in the class definition. Doing so means that not only will instances of the class raise an appropriate TypeError when a program attempts to retrieve their hash value, but they will also be correctly identified as unhashable when checking isinstance(obj, collections.Hashable) (unlike classes which define their own __hash__() to explicitly raise TypeError).

3
  • 3
    The hash alone is not enough, additionally you either need to override __eq__ or __cmp__.
    – Oben Sonne
    Commented Feb 4, 2011 at 18:59
  • @Oben Sonne: __cmp__ is given to you by Python if it is a user defined class, but you probably want to override them anyway to accommodate for new semantics.
    – Skurmedel
    Commented Feb 4, 2011 at 19:00
  • 2
    @Skurmedel: Yes, but although you can call cmp and use = on user classes which do not override these methods, one of them must be implemented to meet the questioner's requirement that instances with similar name and location have the same dictionary key.
    – Oben Sonne
    Commented Feb 4, 2011 at 19:08
15

I noticed in python 3.8.8 (maybe ever earlier) you don't need anymore explicitly declare __eq__() and __hash__() to have to opportunity to use your own class as a key in dict.

class Apple:
    def __init__(self, weight):
        self.weight = weight
        
    def __repr__(self):
        return f'Apple({self.weight})'

apple_a = Apple(1)
apple_b = Apple(1)
apple_c = Apple(2)

apple_dictionary = {apple_a : 3, apple_b : 4, apple_c : 5}

print(apple_dictionary[apple_a])  # 3
print(apple_dictionary)  # {Apple(1): 3, Apple(1): 4, Apple(2): 5}

I assume from some time Python manages it on its own, however I can be wrong.

1
  • 2
    I found this to be true also. Commented Nov 25, 2021 at 19:39
2

The answer for today as I know other people may end up here like me, is to use dataclasses in python >3.7. It has both hash and eq functions.

1
  • Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.
    – Community Bot
    Commented Sep 24, 2021 at 10:54
2

@dataclass(frozen=True) example (Python 3.7)

@dataclass had been previously mentioned at: https://stackoverflow.com/a/69313714/895245 but here's an example.

This awesome new feature, among other good things, automatically defines a __hash__ and __eq__ method for you, making it just work as usually expected in dicts and sets:

dataclass_cheat.py

from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class MyClass1:
    n: int
    s: str

@dataclass(frozen=True)
class MyClass2:
    n: int
    my_class_1: MyClass1

d = {}
d[MyClass1(n=1, s='a')] = 1
d[MyClass1(n=2, s='a')] = 2
d[MyClass1(n=2, s='b')] = 3
d[MyClass2(n=1, my_class_1=MyClass1(n=1, s='a'))] = 4
d[MyClass2(n=2, my_class_1=MyClass1(n=1, s='a'))] = 5
d[MyClass2(n=2, my_class_1=MyClass1(n=2, s='a'))] = 6

assert d[MyClass1(n=1, s='a')] == 1
assert d[MyClass1(n=2, s='a')] == 2
assert d[MyClass1(n=2, s='b')] == 3
assert d[MyClass2(n=1, my_class_1=MyClass1(n=1, s='a'))] == 4
assert d[MyClass2(n=2, my_class_1=MyClass1(n=1, s='a'))] == 5
assert d[MyClass2(n=2, my_class_1=MyClass1(n=2, s='a'))] == 6

# Due to `frozen=True`
o = MyClass1(n=1, s='a')
try:
    o.n = 2
except FrozenInstanceError as e:
    pass
else:
    raise 'error'

As we can see in this example, the hashes are being calculated based on the contents of the objects, and not simply on the addresses of instances. This is why something like:

d = {}
d[MyClass1(n=1, s='a')] = 1
assert d[MyClass1(n=1, s='a')] == 1

works even though the second MyClass1(n=1, s='a') is a completely different instance from the first with a different address.

frozen=True is mandatory, otherwise the class is not hashable, otherwise it would make it possible for users to inadvertently make containers inconsistent by modifying objects after they are used as keys. Further documentation: https://docs.python.org/3/library/dataclasses.html

Tested on Python 3.10.7, Ubuntu 22.10.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.