
Here is a Python representation of a neural network that I'm trying to understand:

import numpy as np

class Network(object):

    def __init__(self, sizes):
        self.num_layers = len(sizes)   # total number of layers, input included
        self.sizes = sizes             # neurons per layer, e.g. [784, 30, 10]
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

Here is my current understanding:

  • self.num_layers = len(sizes): store the number of items in sizes
  • self.sizes = sizes: assign the function parameter sizes to the instance attribute self.sizes
  • self.biases = [np.random.randn(y, 1) for y in sizes[1:]]: generate, for each layer after the first, a column vector of samples from the standard normal distribution (indicated by np.random.randn(y, 1))

What is the following line computing?

self.weights = [np.random.randn(y, x)
    for x, y in zip(sizes[:-1], sizes[1:])]

I'm new to Python. Can this code be used within a Python shell, so I can gain a better understanding by invoking each line separately?

    Wow, I came looking for the same answer while looking at exactly the same code, what are the odds! Am guessing this is a pretty standard python function
    – frankelot
    Commented Oct 22, 2017 at 20:27

5 Answers


The zip() function pairs up elements from each iterable; zip('foo', 'bar'), for example, produces [('f', 'b'), ('o', 'a'), ('o', 'r')]: each element in the two strings has been paired up into three new tuples. (In Python 3, zip() returns a lazy iterator, so wrap it in list() to see the pairs.)

zip(sizes[:-1], sizes[1:]) then creates pairs of each element in the sequence sizes with the next element, because you pair up all elements except the last (sizes[:-1]) with all elements except the first (sizes[1:]). This pairs up the first and second element together, then the second and third, etc., all the way to the last two elements.

For each such pair a random sample is produced, using a list comprehension. So for each x, y pair, a new 2-dimensional numpy matrix is produced with random values divided over y rows and x columns.

Note that the biases value only uses sizes[1:], all but the first, to produce y-by-1 matrices for each such size.

Quick demo of these concepts:

>>> zip('foo', 'bar')
[('f', 'b'), ('o', 'a'), ('o', 'r')]
>>> zip('foo', 'bar', 'baz')  # you can add more sequences
[('f', 'b', 'b'), ('o', 'a', 'a'), ('o', 'r', 'z')]
>>> sizes = [5, 12, 18, 23, 42]
>>> zip(sizes[:-1], sizes[1:])  # a sliding window of pairs
[(5, 12), (12, 18), (18, 23), (23, 42)]
# 0, 1 ..  1,  2 ..  2,  3 ..  3,  4   element indices into sizes
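Applied to the Network class itself, the shapes work out like this (a quick sketch; sizes = [3, 4, 2] is just an example network, not from the original code):

```python
import numpy as np

sizes = [3, 4, 2]  # example: 3 inputs, 4 hidden neurons, 2 outputs

# one (y, 1) bias column per non-input layer
biases = [np.random.randn(y, 1) for y in sizes[1:]]

# one (y, x) weight matrix per pair of adjacent layers
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

print([b.shape for b in biases])   # [(4, 1), (2, 1)]
print([w.shape for w in weights])  # [(4, 3), (2, 4)]
```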

self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] calls the randn function once for each (x, y) pair produced by zip(sizes[:-1], sizes[1:]) (note the argument order: randn(y, x)).

If we consider a list l = [1, 2, 3, 4], then l[:-1] returns [1, 2, 3] and l[1:] returns [2, 3, 4]. The zip operation on l[:-1] and l[1:] makes the pairs [(1, 2), (2, 3), (3, 4)]. Then the pairs are passed to the randn function.

Of course, you can always type the code into a Python shell; it will give you a better understanding ;)
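For instance, the slicing and zipping from above can be tried directly in the shell (Python 3 syntax, hence the list() call):

```python
l = [1, 2, 3, 4]

head = l[:-1]                  # every element except the last
tail = l[1:]                   # every element except the first
pairs = list(zip(head, tail))  # adjacent pairs

print(head)   # [1, 2, 3]
print(tail)   # [2, 3, 4]
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```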


That is what is called a list comprehension. You can create the same effect with a normal for loop:

self.weights = []
for x, y in zip(sizes[:-1], sizes[1:]):
    self.weights.append(np.random.randn(y, x))

Now with that loop, you can see that self.weights is really just a bunch of np.random.randn(y, x)'s, where y and x are defined for each x and y in zip(sizes[:-1], sizes[1:]). You can just say that to yourself as you read the list comprehension: self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]. The word order finally makes sense.

In case you didn't know, zip is a function that returns a list of tuples of each corresponding element in its arguments. For example, zip([1, 2, 3, 4], [4, 3, 2, 1]) would return [(1, 4), (2, 3), (3, 2), (4, 1)]. (In Python 3, it's actually a generator of tuples.)
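A quick way to convince yourself the loop and the comprehension agree (the random values differ between the two runs, but comparing shapes is enough; sizes = [2, 3, 1] is just an illustrative choice):

```python
import numpy as np

sizes = [2, 3, 1]

# comprehension form
weights_comp = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

# equivalent explicit loop
weights_loop = []
for x, y in zip(sizes[:-1], sizes[1:]):
    weights_loop.append(np.random.randn(y, x))

# identical structure either way
print([w.shape for w in weights_comp])  # [(3, 2), (1, 3)]
print([w.shape for w in weights_loop])  # [(3, 2), (1, 3)]
```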


If you know C++, here is a conversion of self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] to C++ that I made. It uses the Eigen C++ library instead of the NumPy Python library. You call it by typing Weights(weights, sizes); in main(). The parameters of the function Weights are a pass-by-reference list of matrices (weights) and a vector (sizes). Pass by reference, marked by the '&', means that the value of weights changes both inside the function and in the caller; with pass by value, the change would only be visible inside the function. (One caveat: MatrixXd::Random fills the matrix with uniform values in [-1, 1], whereas np.random.randn draws from the standard normal distribution, so this is only an approximation of the original line.) If you are trying to completely replicate this you will need #include <list>, #include <Eigen/Dense>, using namespace std; and using namespace Eigen;.

void Weights(list<MatrixXd> &weights, VectorXi sizes) {
    int x, y;
    for (int i = 0; i < sizes.rows() - 1; i++) {
        y = sizes[i + 1];  // sizes[1:]
        x = sizes[i];      // sizes[:-1]
        weights.push_back(MatrixXd::Random(y, x));  // np.random.randn(y, x)
    }
}

This actually creates one weight matrix per pair of adjacent layers. For a three-layer network that means two matrices: one for the connections from the input layer to the hidden layer, and another for the connections from the hidden layer to the output layer. sizes[:-1] means all the elements in the list except the last one; these are the source layers (the x values). sizes[1:] is all the elements except the first one; these are the destination layers (the y values).

Note: the (x, y) pairs are formed with the zip function.
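As a concrete sketch (sizes = [784, 30, 10] is a typical example for this kind of Network class, not from the question itself):

```python
import numpy as np

sizes = [784, 30, 10]  # input, hidden, and output layer widths

weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

print(len(weights))      # 2 matrices: input->hidden and hidden->output
print(weights[0].shape)  # (30, 784)
print(weights[1].shape)  # (10, 30)
```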
