While working on a simple machine learning project, I decided to rework a piece of list-based logic using numpy arrays: I just changed the numerical increments in the loops to numpy arithmetic. However, the numpy implementation gives decision boundaries far from those of the loop implementation, behaves chaotically, and doesn't converge.
A complete working example can be found on Google Colab here: https://colab.research.google.com/drive/1zLy0oTidhm2lrwkASgT1lpsI9yfrBPvx?usp=sharing
It really bothers me. If it's a numpy precision error, it might be far less obvious on larger projects and silently skew the results. If it's a conceptual error, I can't see it. This question probably looks naive, but I genuinely want to learn what went wrong.
Consider the following list implementation of Perceptron training. The logic behind it is explained further down, but most importantly, note that the in-loop numerical increments of W and b are nothing special.
def perceptronStep(X, y, W, b, learn_rate=0.01):
    diff = []
    for i in range(len(X)):
        y_hat = prediction(X[i], W, b)[0]
        # diff = 0 when the prediction is correct;
        # diff = 1 or diff = -1 gives the direction in which W and b change
        diff.append(y[i] - y_hat)
        dif = diff[i]
        W[0] += dif * X[i][0] * learn_rate
        W[1] += dif * X[i][1] * learn_rate
        b += dif * learn_rate
    return W, b, diff
As I see it, the following numpy implementation should perfectly recreate how perceptronStep behaves.
def np_perceptronStep(X, y, W, b, learn_rate=0.01):
    Y_hat = np.squeeze(prediction(X, W, b))
    diff = y - Y_hat
    sumX = np.sum(X * diff[..., np.newaxis], axis=0, keepdims=True)
    W += learn_rate * sumX.T        # W_np
    b += learn_rate * np.sum(diff)  # b_np
    return W, b, diff
Yet it doesn't. W_np and b_np differ from W and b: by about 1e-16 from epoch 0, and by values on the order of 10 by the end of training. b_np jumps all over the place, while b quickly settles down. Chaotic.
The prediction is obtained via a step function.
def prediction(X, W, b):
    Y_hat = np.matmul(X, W) + b
    # step function
    Y_hat[Y_hat >= 0] = 1
    Y_hat[Y_hat < 0] = 0
    return Y_hat
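For clarity, the shapes in the two call patterns look like this (a sketch based on my reading of the code above, assuming X has shape (n, 2), W has shape (2, 1) and b has shape (1,)):

import numpy as np

X = np.random.rand(20, 2)        # n = 20 points, 2 features (made-up data)
W = np.array([[1.0], [1.0]])     # column vector, hence W[0] and W[1] in the loop
b = np.array([0.0])

single = prediction(X[0], W, b)  # shape (1,): the loop version takes [0] to get a scalar
batch = prediction(X, W, b)      # shape (20, 1): np_perceptronStep squeezes it to (20,)
print(single.shape, batch.shape)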
Context
I'm going through Intro to ML with PyTorch on Udacity. A crude trick is introduced there before gradient descent is taught. Consider a Perceptron holding a linear classifier Wx + b, in this case w1x1 + w2x2 + b, where the two possible classes are 0 and 1. Taking the difference between the true and the predicted class for a point x' to be c, the Perceptron is updated like this: W + acx' and b + ac, where a is the learning rate. As simple as it gets.
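To make that concrete, here is one update of the rule written out with made-up numbers (a toy sketch, not from the course):

a = 0.01          # learning rate
x = (1.0, 2.0)    # a misclassified point (hypothetical values)
c = 1 - 0         # true class 1, predicted class 0
W = [3.0, 4.0]
b = -10.0

W[0] += a * c * x[0]   # W + a*c*x'
W[1] += a * c * x[1]
b += a * c             # b + a*c
print(W, b)            # [3.01, 4.02] -9.99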
I've tried a manual, piece-by-piece (ship-of-Theseus) transition from loops to numpy inside np_perceptronStep, hoping that a single change was causing the problem. I've tried different combinations of computing diff, W and b, either in a loop or with numpy. I've also tracked how the differences between the W and b of the two implementations change over training. They changed, but np_perceptronStep never came close to the perceptronStep results.
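Roughly, the comparison looked like this (a minimal sketch, not the exact code from the Colab notebook; the toy data, seed and epoch count are made up, and it assumes the three functions above are defined):

import numpy as np

np.random.seed(0)
X = np.random.rand(20, 2)                  # toy data: 20 points, 2 features
y = (X[:, 0] + X[:, 1] > 1).astype(float)  # arbitrary linearly separable labels

# identical starting parameters for both versions
W_loop, b_loop = np.array([[1.0], [1.0]]), np.array([0.0])
W_np, b_np = W_loop.copy(), b_loop.copy()

for epoch in range(25):
    W_loop, b_loop, _ = perceptronStep(X, y, W_loop, b_loop)
    W_np, b_np, _ = np_perceptronStep(X, y, W_np, b_np)
    # how far the two versions have drifted apart after this epoch
    print(epoch, np.max(np.abs(W_loop - W_np)), np.abs(b_loop - b_np)[0])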
In np_perceptronStep, Y_hat appears to be computed for all points using the W the step started with, whereas in the loop W[0] and W[1] are rewritten on each iteration, so later points are predicted with an already-updated W.