I'm trying to implement a perceptron in Python using NumPy. With the notation z = XW + b everything works fine, but while studying ML I see that z = WX + b is also common, especially in the context of neural networks. The problem is that with that notation the dimensions of the matrices don't add up. I tried following some answers on the web, but the output doesn't have the right dimensions, and I also tried asking ChatGPT, but it only implemented the code following the z = XW + b notation. This is the code I used for z = XW + b:
import numpy as np
n_inpts = 10    # number of input samples
in_feats = 5    # input features per sample
n_hidden = 8    # hidden units
out_feats = 1   # output features
X = np.random.randn(n_inpts, in_feats)     # one sample per row
W_x = np.random.randn(in_feats, n_hidden)
bias_h = np.random.randn(1, n_hidden)      # broadcast over the rows of X @ W_x
H = np.dot(X, W_x) + bias_h                # H is (n_inpts, n_hidden)
H = np.maximum(0, H)                       # element-wise ReLU
W_h = np.random.randn(n_hidden, out_feats)
bias_o = np.random.randn(1, out_feats)
output = np.dot(H, W_h) + bias_o           # output is (n_inpts, out_feats)
Can anybody give me an implementation that gives the same result while using z = WX + b? Every single implementation I have found follows the z = XW + b notation. I guess it comes down to how you define the X and W matrices, but so far I have had no luck finding a solution.
in_feats is the dimension shared between W and X, so it must be the columns of W and the rows of X, i.e. WX = (n_hidden, in_feats)(in_feats, n_inpts) = (n_hidden, n_inpts). In other words, in the z = WX + b convention each sample is a column of X rather than a row, and the biases become column vectors so that they broadcast across samples.
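For completeness, here is a minimal sketch of your forward pass written in the z = WX + b convention (same variable names as your code; the only assumption is that samples now sit in the columns of X, so every matrix is the transpose of its XW + b counterpart):

import numpy as np

n_inpts = 10
in_feats = 5
n_hidden = 8
out_feats = 1

X = np.random.randn(in_feats, n_inpts)      # one sample per column
W_x = np.random.randn(n_hidden, in_feats)
bias_h = np.random.randn(n_hidden, 1)       # column vector, broadcast over samples
H = np.dot(W_x, X) + bias_h                 # H is (n_hidden, n_inpts)
H = np.maximum(0, H)                        # element-wise ReLU
W_h = np.random.randn(out_feats, n_hidden)
bias_o = np.random.randn(out_feats, 1)
output = np.dot(W_h, H) + bias_o            # output is (out_feats, n_inpts)

To convince yourself it really computes the same thing, reuse the matrices from your XW + b code and compare: the two outputs agree up to a transpose, since (XW)^T = W^T X^T.

# X, W_x, bias_h, W_h, bias_o taken from the XW + b code above
H2 = np.maximum(0, np.dot(W_x.T, X.T) + bias_h.T)   # equals H.T
out2 = np.dot(W_h.T, H2) + bias_o.T                 # equals output.T
print(np.allclose(out2, output.T))                  # True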