Using lambda and defaultdict

Question

I was reading about the collection defaultdict and came across these lines of code:

import collections
tree = lambda: collections.defaultdict(tree)
some_dict = tree()
some_dict['colours']['favourite'] = "yellow"

I understand that lambda takes a variable and performs some function on it. I've seen lambda being used like this: lambda x: x + 3. In the second line of code above, what variable is lambda taking and what function is it carrying out?

I also understand that defaultdict can take parameters such as int or list. In the second line, defaultdict takes the parameter tree which is a variable. What is the significance of that?

It does not take a parameter; it is simply a function taking no parameters.
– willeM_ Van Onsem, Commented Jul 11, 2018 at 13:15

chepner · Accepted Answer · 2018-07-11 13:32:35Z

The code is roughly equivalent (ignoring metadata introduced by the def statement) to

import collections
def tree():
    return collections.defaultdict(tree)
some_dict = tree()
some_dict['colours']['favourite'] = "yellow"

The lambda expression simply defines a function of zero parameters, and the function is bound to the name tree.

Typically, you only use lambda expressions when you actually want an anonymous function, for example when passing it as an argument to another function, as in

sorted_list = sorted(some_list_of_tuples, key=lambda x: x[0])

It is considered better practice to use a def statement when you really want a named function.
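As a minimal sketch of the "metadata" difference mentioned above: a lambda bound to a name still reports '<lambda>' as its __name__, while a def function reports its own name. The names tree_lambda and tree_def are chosen here only for illustration and do not appear in the question.

```python
import collections

# Lambda bound to a name: the function object itself stays anonymous.
tree_lambda = lambda: collections.defaultdict(tree_lambda)

# def statement: the function object records the name it was defined with.
def tree_def():
    return collections.defaultdict(tree_def)

lambda_name = tree_lambda.__name__
def_name = tree_def.__name__
print(lambda_name)  # <lambda>
print(def_name)     # tree_def
```

This mostly matters in tracebacks and debugging output, where '<lambda>' is less helpful than a real name.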

defaultdict takes a callable to be used to produce a default value for a new key. int() returns 0, list() returns an empty list, and tree() returns a new defaultdict; all of them can be used as arguments to defaultdict. The recursive nature of defining tree to return a defaultdict using itself as the default-value generator means you can generate nested dicts to an arbitrary depth; each "leaf" dict is itself another defaultdict.
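The three factories mentioned above can be compared side by side. This is only an illustrative sketch; the variable names counts, groups, and nested are made up for the example.

```python
import collections

def tree():
    # Each missing key produces another defaultdict that uses tree again.
    return collections.defaultdict(tree)

counts = collections.defaultdict(int)
counts['a'] += 1            # missing key starts from int(), i.e. 0

groups = collections.defaultdict(list)
groups['x'].append(1)       # missing key starts from list(), i.e. []

nested = tree()
nested['a']['b']['c'] = 1   # intermediate levels are created on demand

print(counts['a'], groups['x'], nested['a']['b']['c'])
```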

willeM_ Van Onsem · Accepted Answer · 2018-07-11 13:31:23Z

In the second line of code above, what variable is lambda taking and what function is it carrying out?

A lambda function is an anonymous (without name) function. So a lambda expression like:

tree = lambda: collections.defaultdict(tree)

is, except for some details (with def, the function's __name__ attribute contains the name of the function, and not '<lambda>'), equivalent to:

def tree():
    return collections.defaultdict(tree)

The difference from a simple expression is thus that here we encode the computation in a function. We can never call it, call it once, or call it multiple times.

It also allows us to tie a knot. Notice that we pass a reference to the function (the lambda expression) in the result. We thus have a function that constructs a defaultdict with the function itself as its factory. We can thus recursively construct subtrees.

I also understand that defaultdict can take parameters such as int or list. In the second line, defaultdict takes the parameter tree which is a variable. What is the significance of that?

The tree that we pass to the defaultdict is thus a reference to the lambda expression we construct. This means that whenever the defaultdict invokes the "factory", we get another defaultdict, again with tree as its factory.

If we thus evaluate some_dict['foo']['bar']['qux'], we have a defaultdict in a defaultdict in a defaultdict. All these defaultdicts have the tree function as their factory. If we later construct extra children, these will again be defaultdicts with tree as factory.
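One way to check this is to inspect the default_factory attribute at each level. The sketch below reuses the tree and some_dict names from the question; leaf is a name introduced here for illustration.

```python
import collections

tree = lambda: collections.defaultdict(tree)
some_dict = tree()

# Merely evaluating the chained lookup builds the nested levels.
leaf = some_dict['foo']['bar']['qux']

# Every level carries the same factory: the tree function itself.
print(some_dict.default_factory is tree)
print(some_dict['foo'].default_factory is tree)
print(leaf.default_factory is tree)
```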

The list or int case is not special. If you invoke list (like list()), then you construct a new empty list. The same happens with int: if you call int(), you will obtain 0. The fact that this is a reference to a class object is irrelevant: the defaultdict does not take this into account (it does not know what the factory is, it only invokes it with no parameters).
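To illustrate that the factory can be any callable taking no arguments, here is a small sketch with a hypothetical factory make_default that records each time it is invoked. Both make_default and the calls list are invented for this example.

```python
import collections

calls = []

def make_default():
    # Hypothetical factory: records the call, then returns a placeholder.
    calls.append('called')
    return 'n/a'

d = collections.defaultdict(make_default)

value = d['missing']   # key absent: defaultdict calls make_default()
again = d['missing']   # key now present: the factory is not called again

print(value, again, len(calls))
```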

Thank you very much for your detailed explanation. So the code above allows you to create an infinitely nested dictionary by creating a new defaultdict each time you add a layer?
@Anya: yes, but you do not create that immediately of course. It is more that each time you get an item, like some_dict['foo'], and foo is missing, you invoke tree, which will thus construct another defaultdict with tree as factory. But you can indeed walk arbitrarily deep in this some_dict.
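A short sketch of that lazy behaviour, using the tree and some_dict names from the question (the other variable names are illustrative): nothing exists until a key is looked up with square brackets, and a membership test with in does not trigger the factory.

```python
import collections

def tree():
    return collections.defaultdict(tree)

some_dict = tree()
keys_before = list(some_dict)       # nothing has been created yet

_ = some_dict['foo']['bar']         # lookups alone create 'foo' and 'bar'
keys_after = list(some_dict)

has_qux = 'qux' in some_dict        # membership test does not create 'qux'

print(keys_before, keys_after, has_qux)
```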
