I am supposed to implement gradient descent for linear regression. Here is my implementation:
import numpy as np


class SimpleLinearRegressionModel:
    def __init__(self, x, y, theta, alpha):
        self.x = x
        self.y = y
        self.theta = theta
        self.alpha = alpha

    def h(self, x_i):
        '''
        Equation for the regression line.
        @param x_i (float) - single input feature
        @return corresponding model output (float)
        '''
        return self.theta[0] + x_i * self.theta[1]

    def J(self):
        '''
        Loss function measuring the mean squared error of the regression line
        for the training set and the current model parameters.
        @return MSE based on the current parameters (float)
        '''
        m = len(self.y)
        return (1 / m) * np.sum((self.h(self.x) - self.y) ** 2)

    def get_gradient(self):
        '''
        Gradient of the loss with respect to theta[0] and theta[1].
        @return the two partial derivatives as a length-2 array (numpy array)
        '''
        m = len(self.y)
        return np.array([
            (1 / m) * np.sum(self.h(self.x) - self.y),
            (1 / m) * np.sum((self.h(self.x) - self.y) * self.x),
        ])

    def gradient_descent_step(self):
        '''
        Update the model parameters (i.e. the two theta values) for one
        gradient descent step.
        '''
        return self.theta - self.alpha * self.get_gradient()

    def run_gradient_descent(self, threshold=.01):
        '''
        Run gradient descent to optimize the model parameters.
        @param threshold (float) - run gradient descent until the magnitude of
            the gradient is below this value.
        @return a list storing the value of the cost function after every step
            of gradient descent (float list)
        '''
        cost_values = []
        while np.linalg.norm(self.get_gradient()) > threshold:
            self.theta = self.gradient_descent_step()
            cost_values.append(self.J())
        return cost_values
This works for a small dataset (25 elements), but with a large one (20000 elements) it becomes impossibly slow. How can I optimize it? I've tried to vectorize all of the functions, but J() and get_gradient() still seem especially slow. I've also noticed while debugging on the large dataset that the error increases as the algorithm runs, which definitely should not be happening.