I'm trying to implement a perceptron in Python using NumPy. With the notation z = XW + b everything works fine, but while studying ML I see that z = WX + b is also common, especially in the context of neural networks. The problem is that with that notation the dimensions of the matrices don't add up. I tried following some answers on the web, but the output doesn't have the right dimensions, and I also tried asking ChatGPT, but it only implemented the code following the z = XW + b notation. This is the code I used for z = XW + b:
import numpy as np
n_inpts = 10    # number of input samples
in_feats = 5    # input features per sample
n_hidden = 8    # hidden units
out_feats = 1   # output features
X = np.random.randn(n_inpts, in_feats)     # one sample per row
W_x = np.random.randn(in_feats, n_hidden)
bias_h = np.random.randn(1, n_hidden)      # broadcast over the rows of X @ W_x
H = np.dot(X, W_x) + bias_h                # H is (n_inpts, n_hidden)
H = np.maximum(0, H)                       # element-wise ReLU
W_h = np.random.randn(n_hidden, out_feats)
bias_o = np.random.randn(1, out_feats)
output = np.dot(H, W_h) + bias_o           # output is (n_inpts, out_feats)
Can anybody give me an implementation that gives the same result while using z = WX + b? Every single implementation I have found follows the z = XW + b notation. I guess it comes down to how you define the X and W matrices, but so far I have had no luck finding a solution.
in_feats is the dimension shared between W and X, so it must be the columns of W and the rows of X, i.e. WX = (n_hidden, in_feats)(in_feats, n_inpts) = (n_hidden, n_inpts). In other words, in the z = WX + b convention each sample is a column of X rather than a row, and the biases become column vectors so that they broadcast across samples.
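For completeness, here is a minimal sketch of your forward pass written in the z = WX + b convention (same variable names as your code; the only assumption is that samples now sit in the columns of X, so every matrix is the transpose of its XW + b counterpart):

import numpy as np

n_inpts = 10
in_feats = 5
n_hidden = 8
out_feats = 1

X = np.random.randn(in_feats, n_inpts)      # one sample per column
W_x = np.random.randn(n_hidden, in_feats)
bias_h = np.random.randn(n_hidden, 1)       # column vector, broadcast over samples
H = np.dot(W_x, X) + bias_h                 # H is (n_hidden, n_inpts)
H = np.maximum(0, H)                        # element-wise ReLU
W_h = np.random.randn(out_feats, n_hidden)
bias_o = np.random.randn(out_feats, 1)
output = np.dot(W_h, H) + bias_o            # output is (out_feats, n_inpts)

To convince yourself it really computes the same thing, reuse the matrices from your XW + b code and compare: the two outputs agree up to a transpose, since (XW)^T = W^T X^T.

# X, W_x, bias_h, W_h, bias_o taken from the XW + b code above
H2 = np.maximum(0, np.dot(W_x.T, X.T) + bias_h.T)   # equals H.T
out2 = np.dot(W_h.T, H2) + bias_o.T                 # equals output.T
print(np.allclose(out2, output.T))                  # True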