I'm trying to use a linear regression program to predict handwritten numbers using the MNIST dataset. Whenever I run it, the gradient descent function takes a long time to converge, and the weights approach the correct values only very slowly. In eight hours it has gone through the loop 550 times and there is still a lot of error. Can someone tell me if it normally takes this long, or if I am doing something wrong?

import numpy as np
import pandas as pd

mnist = pd.read_csv('mnist_train.csv')[:4200]
x = np.array(mnist)[:4200,1:]
y = np.array(mnist)[:4200,0].reshape(4200,1)

#How many numbers in dataset
n = len(x)
#How many values in each number
n1 = len(x[0])

#sets all weights equal to 1
coef = np.array([1 for i in range(n1)])

epochs = 1000000000000
learning_rate = .000000000008999
for i in range(epochs):
    cur_y = sum(x*coef)
    error = y-cur_y
    #Calculates Gradient
    grad = (np.array([sum(sum([-2/n  * (error)* x[j,i] for j in range(n)])) for i in range(n1)]))
    #Updates Weights
    coef = (-learning_rate * grad) + coef
    print(i)
    print(sum(y-(x*coef)))
  • "Can someone tell me if it normally takes this long" - How long does it take?
    – Blorgbeard
    Commented Oct 6, 2018 at 18:48
  • I set the epochs to 1000000000000 and had it run for eight hours to see how many it would go through and it went through about 550 iterations. The error got smaller, but it still was not accurate at all.
    – john
    Commented Oct 6, 2018 at 19:07
  • at a guess, you probably want to arrange your code such that loops with more than a ~hundred iterations are vectorised inside numpy
    – Sam Mason
    Commented Oct 6, 2018 at 19:40
  • I think you set learning rate too small so that you had large error after reasonable number of loops. Commented Oct 6, 2018 at 20:22

1 Answer

Your learning rate is extremely tiny. Also, 784 is a lot of dimensions for linear regression to tackle, especially if you're using all 60,000 samples. An SVM would work better, and obviously a CNN would be best.
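For reference, here's a minimal sketch of the SVM route using scikit-learn's LinearSVC on the small built-in digits dataset (8x8 images) as a stand-in for MNIST — the dataset, split, and hyperparameters are illustrative assumptions, not from the question:

```python
# Sketch: linear SVM on a small digits dataset as a MNIST stand-in.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values into [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# dual=False is preferred when n_samples > n_features
clf = LinearSVC(dual=False)
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```

The same pattern applies to the full MNIST arrays; scaling the pixel values first matters for any of these models.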

Given that your error is getting smaller, I would recommend increasing your learning rate and training with stochastic gradient descent (grabbing a random batch from your training set for each step instead of using the whole training set).
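Combining both suggestions — a larger learning rate and random mini-batches — with fully vectorised NumPy (no Python loops over the 784 features) might look like the sketch below. The synthetic data, shapes, learning rate, and batch size are illustrative assumptions, not taken from the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the arrays in the question:
# 4200 samples, 784 features.
n, n1 = 4200, 784
x = rng.normal(size=(n, n1))
true_coef = rng.normal(size=n1)
y = x @ true_coef

coef = np.zeros(n1)
learning_rate = 0.05
batch_size = 256

mse0 = np.mean((y - x @ coef) ** 2)
for step in range(500):
    idx = rng.choice(n, size=batch_size, replace=False)  # random mini-batch
    xb, yb = x[idx], y[idx]
    error = yb - xb @ coef                 # all 784 dims at once, no inner loop
    grad = (-2.0 / batch_size) * (xb.T @ error)
    coef -= learning_rate * grad

mse = np.mean((y - x @ coef) ** 2)
print(f"MSE: {mse0:.2f} -> {mse:.6f}")
```

The key change versus the question's loop is that the gradient is a single matrix product, `xb.T @ error`, instead of a nested Python list comprehension over every feature and sample, so each step runs in milliseconds rather than minutes.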

  • Thanks! I'll try using that.
    – john
    Commented Oct 6, 2018 at 23:53