
Here is a Python representation of a neural network that I'm trying to understand:

import numpy as np

class Network(object):

    def __init__(self, sizes):
        self.num_layers = len(sizes)   # total number of layers, input included
        self.sizes = sizes             # neurons per layer, e.g. [784, 30, 10]
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

Here is my current understanding:

  • self.num_layers = len(sizes): store the number of items in sizes
  • self.sizes = sizes: assign the function parameter sizes to the instance attribute self.sizes
  • self.biases = [np.random.randn(y, 1) for y in sizes[1:]]: generate, for each layer after the first, a column vector of samples from the standard normal distribution (indicated by np.random.randn(y, 1))

What is the following line computing?

self.weights = [np.random.randn(y, x)
    for x, y in zip(sizes[:-1], sizes[1:])]

I'm new to Python. Can this code be used within a Python shell, so I can gain a better understanding by invoking each line separately?

    Wow, I came looking for the same answer while looking at exactly the same code, what are the odds! Am guessing this is a pretty standard python function
    – frankelot
    Commented Oct 22, 2017 at 20:27

5 Answers


The zip() function pairs up elements from each iterable; zip('foo', 'bar'), for example, produces [('f', 'b'), ('o', 'a'), ('o', 'r')]: each element in the two strings has been paired up into three new tuples. (In Python 3, zip() returns a lazy iterator, so wrap it in list() to see the pairs.)

zip(sizes[:-1], sizes[1:]) then creates pairs of each element in the sequence sizes with the next element, because you pair up all elements except the last (sizes[:-1]) with all elements except the first (sizes[1:]). This pairs up the first and second element together, then the second and third, etc., all the way to the last two elements.

For each such pair a random sample is produced, using a list comprehension. So for each x, y pair, a new 2-dimensional numpy matrix is produced with random values divided over y rows and x columns.

Note that the biases value only uses sizes[1:], all but the first, to produce y-by-1 matrices for each such size.

Quick demo of these concepts:

>>> zip('foo', 'bar')
[('f', 'b'), ('o', 'a'), ('o', 'r')]
>>> zip('foo', 'bar', 'baz')  # you can add more sequences
[('f', 'b', 'b'), ('o', 'a', 'a'), ('o', 'r', 'z')]
>>> sizes = [5, 12, 18, 23, 42]
>>> zip(sizes[:-1], sizes[1:])  # a sliding window of pairs
[(5, 12), (12, 18), (18, 23), (23, 42)]
# 0, 1 ..  1,  2 ..  2,  3 ..  3,  4   element indices into sizes
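Applied to the Network class itself, the shapes work out like this (a quick sketch; sizes = [3, 4, 2] is just an example network, not from the original code):

```python
import numpy as np

sizes = [3, 4, 2]  # example: 3 inputs, 4 hidden neurons, 2 outputs

# one (y, 1) bias column per non-input layer
biases = [np.random.randn(y, 1) for y in sizes[1:]]

# one (y, x) weight matrix per pair of adjacent layers
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

print([b.shape for b in biases])   # [(4, 1), (2, 1)]
print([w.shape for w in weights])  # [(4, 3), (2, 4)]
```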

self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] calls the randn function once for each (x, y) pair produced by zip(sizes[:-1], sizes[1:]) (note the argument order: randn(y, x)).

If we consider a list l = [1, 2, 3, 4], then l[:-1] returns [1, 2, 3] and l[1:] returns [2, 3, 4]. The zip operation on l[:-1] and l[1:] makes the pairs [(1, 2), (2, 3), (3, 4)]. Then the pairs are passed to the randn function.

Of course, you can always type the code into a Python shell; it will give you a better understanding ;)
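For instance, the slicing and zipping from above can be tried directly in the shell (Python 3 syntax, hence the list() call):

```python
l = [1, 2, 3, 4]

head = l[:-1]                  # every element except the last
tail = l[1:]                   # every element except the first
pairs = list(zip(head, tail))  # adjacent pairs

print(head)   # [1, 2, 3]
print(tail)   # [2, 3, 4]
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```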


That is what is called a list comprehension. You can create the same effect with a normal for loop:

self.weights = []
for x, y in zip(sizes[:-1], sizes[1:]):
    self.weights.append(np.random.randn(y, x))

Now with that loop, you can see that self.weights is really just a bunch of np.random.randn(y, x)'s, where y and x are defined for each x and y in zip(sizes[:-1], sizes[1:]). You can just say that to yourself as you read the list comprehension: self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]. The word order finally makes sense.

In case you didn't know, zip is a function that returns a list of tuples of each corresponding element in its arguments. For example, zip([1, 2, 3, 4], [4, 3, 2, 1]) would return [(1, 4), (2, 3), (3, 2), (4, 1)]. (In Python 3, it's actually a generator of tuples.)
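A quick way to convince yourself the loop and the comprehension agree (the random values differ between the two runs, but comparing shapes is enough; sizes = [2, 3, 1] is just an illustrative choice):

```python
import numpy as np

sizes = [2, 3, 1]

# comprehension form
weights_comp = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

# equivalent explicit loop
weights_loop = []
for x, y in zip(sizes[:-1], sizes[1:]):
    weights_loop.append(np.random.randn(y, x))

# identical structure either way
print([w.shape for w in weights_comp])  # [(3, 2), (1, 3)]
print([w.shape for w in weights_loop])  # [(3, 2), (1, 3)]
```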


If you know C++, here is a conversion of self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] to C++ that I made. It uses the Eigen C++ library instead of the NumPy Python library. You call it by typing Weights(weights, sizes); in main(). The parameters of the function Weights are a pass-by-reference list of matrices (weights) and a vector (sizes). Pass by reference, marked by the '&', means that the value of weights changes both inside the function and in the caller; with pass by value, the change would only be visible inside the function. (One caveat: MatrixXd::Random fills the matrix with uniform values in [-1, 1], whereas np.random.randn draws from the standard normal distribution, so this is only an approximation of the original line.) If you are trying to completely replicate this you will need #include <list>, #include <Eigen/Dense>, using namespace std; and using namespace Eigen;.

void Weights(list<MatrixXd> &weights, VectorXi sizes) {
    int x, y;
    for (int i = 0; i < sizes.rows() - 1; i++) {
        y = sizes[i + 1];  // sizes[1:]
        x = sizes[i];      // sizes[:-1]
        weights.push_back(MatrixXd::Random(y, x));  // np.random.randn(y, x)
    }
}

This actually creates one weight matrix per pair of adjacent layers. For a three-layer network that means two matrices: one for the connections from the input layer to the hidden layer, and another for the connections from the hidden layer to the output layer. sizes[:-1] means all the elements in the list except the last one; these are the source layers (the x values). sizes[1:] is all the elements except the first one; these are the destination layers (the y values).

Note: the (x, y) pairs are formed with the zip function.
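As a concrete sketch (sizes = [784, 30, 10] is a typical example for this kind of Network class, not from the question itself):

```python
import numpy as np

sizes = [784, 30, 10]  # input, hidden, and output layer widths

weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]

print(len(weights))      # 2 matrices: input->hidden and hidden->output
print(weights[0].shape)  # (30, 784)
print(weights[1].shape)  # (10, 30)
```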
