1,480 questions
0 votes · 0 answers · 40 views
Torch gradient estimates disagreeing with analytic and perturbation-approximated gradients
I'm faced with a problem where, as the title says, I'm having trouble with the torch package's built-in automatic differentiation algorithms (or my usage of them?). I think it was meant to be used on mini-...
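A standard way to debug this kind of disagreement is a finite-difference gradient check in double precision. A minimal sketch, assuming a scalar objective f standing in for the actual loss:

import torch

def f(x):
    # Illustrative scalar objective; substitute the real loss here.
    return (x ** 3).sum()

x = torch.randn(5, dtype=torch.float64, requires_grad=True)
f(x).backward()
autograd_grad = x.grad.clone()

# Central-difference approximation, one coordinate at a time.
eps = 1e-6
numeric_grad = torch.zeros_like(x)
with torch.no_grad():
    for i in range(x.numel()):
        e = torch.zeros_like(x)
        e.view(-1)[i] = eps
        numeric_grad.view(-1)[i] = (f(x + e) - f(x - e)) / (2 * eps)

print(torch.allclose(autograd_grad, numeric_grad, atol=1e-4))

PyTorch also ships torch.autograd.gradcheck, which automates exactly this comparison for double-precision inputs.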
1 vote · 0 answers · 14 views
Matlab Reinforcement Learning: issue obtaining the gradient from a Q-value critic using dlfeval, dlgradient, dlarrays
I'm trying to implement a custom agent, and inside my agent I'm running into issues with obtaining the gradient of the Q value with respect to my actor network parameters. I have my code below, main ...
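The MATLAB specifics aside, the same pattern in PyTorch (used elsewhere on this page) can serve as a sanity check: the gradient of the critic's Q value reaches the actor's parameters only through the action. All names and shapes below are illustrative assumptions:

import torch
import torch.nn as nn

actor = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 2))
critic = nn.Sequential(nn.Linear(6, 16), nn.ReLU(), nn.Linear(16, 1))

obs = torch.randn(1, 4)
action = actor(obs)                              # depends on actor params
q = critic(torch.cat([obs, action], dim=1))      # Q(s, a)

# Gradient of Q w.r.t. the actor's parameters, flowing through `action`.
grads = torch.autograd.grad(q.sum(), list(actor.parameters()))
print([g.shape for g in grads])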
0 votes · 0 answers · 18 views
Deterministic minimization of a stochastic function with subgradient method
Problem: I have implemented several step-size strategies (classic, Polyak, and Adagrad), but my subgradient algorithm either diverges or fails to converge.
Initially, I focused on the problem:
Initial ...
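For reference, a bare-bones subgradient method with the Polyak step size looks like the following; the objective, its subgradient, and the known optimal value f_star are illustrative placeholders:

import numpy as np

def f(x):
    return np.abs(x).sum()            # nonsmooth convex test objective

def subgrad(x):
    return np.sign(x)                 # a valid subgradient of the L1 norm

x = np.random.randn(10)
f_star = 0.0                          # Polyak's rule needs the optimal value
for k in range(1000):
    g = subgrad(x)
    t = (f(x) - f_star) / (g @ g + 1e-12)   # Polyak step length
    x = x - t * g
print(f(x))

With the classic diminishing rule t_k = c / sqrt(k + 1) instead, divergence usually means c is too large for the scale of the subgradients.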
2 votes · 2 answers · 73 views
SIR parameter estimation with gradient descent and autograd
I am trying to apply a very simple parameter estimation of a SIR model using a gradient descent algorithm. I am using the package autograd since the audience (this is for a sort of workshop for ...
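A minimal sketch of that setup, assuming a forward-Euler SIR integrator and a mean-squared-error loss; the synthetic data, initial guess, and learning rate are all placeholders:

import autograd.numpy as np
from autograd import grad

def infected_curve(beta, gamma, days=50):
    # Forward-Euler SIR; returns the infected fraction over time.
    S, I = 0.99, 0.01
    out = []
    for _ in range(days):
        dS = -beta * S * I
        dI = beta * S * I - gamma * I
        S, I = S + dS, I + dI
        out.append(I)
    return out

observed = np.array(infected_curve(0.3, 0.1))    # synthetic "data"

def loss(params):
    beta, gamma = params[0], params[1]
    curve = infected_curve(beta, gamma)
    return sum((c - o) ** 2 for c, o in zip(curve, observed)) / len(observed)

params = np.array([0.5, 0.05])
loss_grad = grad(loss)
for _ in range(2000):
    params = params - 0.5 * loss_grad(params)    # plain gradient descent
print(params)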
1 vote · 1 answer · 35 views
Theta values for gradient descent not coherent
I made a gradient descent code, but it doesn't seem to work well.
import numpy as np
from random import randint, random
import matplotlib.pyplot as plt

def calculh(theta, X):
    h = 0
    h += theta[0] * X ...
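For contrast, a complete minimal version of batch gradient descent for a one-feature linear hypothesis might look like this; the synthetic data and learning rate are illustrative:

import numpy as np

# Toy data: y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 100)
y = 2 * X + 1 + rng.normal(0, 0.5, 100)

theta = np.zeros(2)                   # [intercept, slope]
lr = 0.01
for _ in range(5000):
    h = theta[0] + theta[1] * X       # hypothesis h(theta, X)
    error = h - y
    # Mean gradients of the squared-error cost for each component.
    theta -= lr * np.array([error.mean(), (error * X).mean()])
print(theta)                          # approaches [1, 2]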
0 votes · 0 answers · 59 views
Gradient descent 3D visualization Python
I've recently implemented a neural network from scratch and am now focusing on visualizing the optimization process. Specifically, I'm interested in creating a 3D visualization of the loss landscape ...
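The usual recipe is to evaluate the loss on a 2-D grid (in practice, a plane spanned by two directions in weight space) and hand the grid to plot_surface. A minimal sketch with a placeholder loss:

import numpy as np
import matplotlib.pyplot as plt

def loss(a, b):
    # Placeholder surface; replace with the network loss evaluated at
    # (w_trained + a * d1 + b * d2) for two chosen directions d1, d2.
    return a ** 2 + 0.5 * b ** 2 + 0.3 * np.sin(3 * a)

A, B = np.meshgrid(np.linspace(-2, 2, 100), np.linspace(-2, 2, 100))
Z = loss(A, B)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(A, B, Z, cmap="viridis")
ax.set_xlabel("direction 1")
ax.set_ylabel("direction 2")
ax.set_zlabel("loss")
plt.show()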
0 votes · 0 answers · 35 views
How to specify gradient computation path in a neural network in pytorch
I want to implement a neural network in pytorch where gradients are not computed over all the weights. Let's say, for example, I have an MLP with three layers and I want half of the nodes in the last ...
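One way to do this is a backward hook on the last layer's weight that zeroes the gradient rows for the nodes you want left out; autograd still traverses them, but their parameters never receive updates. A minimal sketch (architecture and mask are illustrative):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))

last = model[2]
mask = torch.ones_like(last.weight)
mask[:2, :] = 0                 # block gradients to the first two output nodes

# The hook rewrites the gradient on every backward pass.
last.weight.register_hook(lambda g: g * mask)

model(torch.randn(3, 8)).sum().backward()
print(last.weight.grad[:2].abs().sum())   # tensor(0.)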
0 votes · 1 answer · 15 views
Global minimum as a starting point of Gradient Descent
If I already have the global minimum value of the cost function for any model (including large language models), would it facilitate the gradient descent calculation?
(Suppose I have a quick way to ...
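Knowing the optimal value f* does not reveal the minimizing parameters, but it does buy two concrete things: an exact stopping test and the Polyak step size. A sketch of the stopping test on a toy quadratic (everything here is illustrative):

import numpy as np

def f(x):
    return ((x - 3) ** 2).sum()

f_star = 0.0                          # the known global minimum value
x = np.zeros(4)
while f(x) - f_star > 1e-10:          # exact optimality gap as a stop test
    x -= 0.1 * 2 * (x - 3)            # gradient step
print(x)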
0 votes · 1 answer · 46 views
Problem in Backpropagation through a sample in Beta distribution in pytorch
Say I have obtained some alphas and betas as parameters from a neural network, which will be parameters of the Beta distribution. Now, I sample from the Beta distribution and then calculate some loss ...
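In PyTorch the usual fix is rsample() rather than sample(): rsample draws a reparameterized sample, so gradients flow back to alpha and beta, while sample() cuts the graph. A minimal sketch with a placeholder loss:

import torch

alphas = torch.tensor([2.0, 3.0], requires_grad=True)  # stand-ins for network outputs
betas = torch.tensor([1.5, 4.0], requires_grad=True)

dist = torch.distributions.Beta(alphas, betas)
sample = dist.rsample()         # differentiable w.r.t. alphas and betas

loss = (sample ** 2).sum()      # placeholder loss on the sample
loss.backward()
print(alphas.grad, betas.grad)  # both non-None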
0 votes · 0 answers · 27 views
Issues when minimizing cost function in a simple linear regression
I'm quite new to ML and I'm trying to do a linear regression with quite a simple dataset.
I did two different regressions, one by hand and the other using scikit-learn, where in the latter I ...
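When a hand-rolled fit disagrees with scikit-learn, comparing both against synthetic data with a known line usually isolates the bug. A minimal sketch (the data stands in for the question's dataset):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 80)
y = 1.7 * x + 0.5 + rng.normal(0, 0.2, 80)

# By hand: gradient descent on the mean squared error.
w, b = 0.0, 0.0
for _ in range(5000):
    err = w * x + b - y
    w -= 0.01 * 2 * (err * x).mean()
    b -= 0.01 * 2 * err.mean()

ref = LinearRegression().fit(x.reshape(-1, 1), y)
print((w, b), (ref.coef_[0], ref.intercept_))   # should closely agree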
0 votes · 1 answer · 37 views
Linear regression model barely optimizes the intercept b
I've programmed a linear regression model from scratch. I use the sum of squared residuals as the loss function for gradient descent. For testing, I use linear data (y = x).
When running the ...
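A frequent cause: when the inputs sit far from zero, the loss surface is ill-conditioned and the gradient with respect to b is tiny compared to the slope's, so b crawls. Centering the inputs decouples the two. A sketch of that fix (data and rates are illustrative):

import numpy as np

x = np.linspace(50, 100, 200)    # inputs far from zero: b learns slowly
y = x.copy()                     # target line y = x

x_mean = x.mean()
xc = x - x_mean                  # centering decouples slope and intercept

w, b = 0.0, 0.0
for _ in range(2000):
    err = w * xc + b - y
    w -= 1e-4 * 2 * (err * xc).mean()
    b -= 1e-1 * 2 * err.mean()

# Undo the centering to recover the intercept in original coordinates.
print(w, b - w * x_mean)         # approaches (1, 0)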
0 votes · 0 answers · 31 views
Autograd on a specific layer’s parameters
I am trying to get the Jacobian matrix of a specific layer's parameters. Below is my network model, and I apply functional_call on it.
def fm(params, input):
    return functional_call(self.model, ...
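With torch.func, one way is to pass only the target layer's parameters as the differentiated argument and close over the rest. A minimal sketch (the model and layer prefix are illustrative assumptions):

import torch
import torch.nn as nn
from torch.func import functional_call, jacrev

model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))
x = torch.randn(1, 4)

# Differentiate only the last layer's parameters ("2." in this Sequential).
target = {k: v for k, v in model.named_parameters() if k.startswith("2.")}
frozen = {k: v.detach() for k, v in model.named_parameters() if k not in target}

def fm(params, inp):
    return functional_call(model, {**params, **frozen}, (inp,)).squeeze(0)

jac = jacrev(fm)(target, x)      # dict mapping parameter name -> Jacobian
print({k: v.shape for k, v in jac.items()})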
0 votes · 0 answers · 19 views
Do we plug in the old values or the new values during the gradient descent update?
I have a scenario where I am trying to optimize a vector of D dimensions. Every component of the vector depends on the other components according to a function such as:
summation over (i,j):
(1-e(x_i)(...
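In standard gradient descent the answer is the old values: the full gradient is evaluated at the current point and every component is updated simultaneously (a Jacobi-style sweep). Plugging freshly updated components into later updates within the same sweep gives a Gauss-Seidel/coordinate-descent variant, which is a different algorithm. A sketch of the standard form with a placeholder gradient:

import numpy as np

def grad_f(x):
    return 2 * x                  # placeholder for the coupled gradient

x = np.array([1.0, 2.0, 3.0])
for _ in range(100):
    g = grad_f(x)                 # evaluated entirely at the OLD x
    x = x - 0.1 * g               # all components updated at once
print(x)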
0 votes · 0 answers · 27 views
ValueError: One or more gradients are None, meaning no gradients are flowing
I'm trying to train a model that I wrote based on this paper: A lightweight model using frequency, trend and temporal attention for long sequence time-series prediction
Now, during the training I ...
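A quick diagnostic is to run one backward pass and list every parameter whose grad stayed None; the usual culprits are detach(), round-trips through numpy, or rebuilding tensors with torch.tensor(...) mid-graph. A minimal sketch with a placeholder model:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1))
x = torch.randn(2, 4)

out = model(x).sum()
# Bug pattern that reproduces the error: .detach() cuts the graph,
# e.g. out = model(x).detach().sum()
out.backward()

for name, p in model.named_parameters():
    if p.grad is None:
        print(f"no gradient flowing to {name}")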
-1 votes · 1 answer · 61 views
What is wrong with my gradient descent implementation (SVM classifier with hinge loss)
I am trying to implement and train an SVM multi-class classifier from scratch using Python and NumPy in Jupyter notebooks.
I have been using the CS231n course as my base of knowledge, especially this ...
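For reference, the vectorized multiclass hinge loss and its gradient in the CS231n style look like this; a missing count on the correct-class column is the usual bug. Shapes and data below are illustrative:

import numpy as np

def svm_loss_grad(W, X, y, reg=1e-3):
    # X: (N, D) data, W: (D, C) weights, y: (N,) integer labels.
    N = X.shape[0]
    scores = X @ W                                  # (N, C)
    correct = scores[np.arange(N), y][:, None]      # (N, 1)
    margins = np.maximum(0, scores - correct + 1)   # delta = 1
    margins[np.arange(N), y] = 0
    loss = margins.sum() / N + reg * np.sum(W * W)

    # +X for each violated margin; -(violation count) * X for the true class.
    mask = (margins > 0).astype(float)
    mask[np.arange(N), y] = -mask.sum(axis=1)
    dW = X.T @ mask / N + 2 * reg * W
    return loss, dW

rng = np.random.default_rng(0)
X, y = rng.normal(size=(10, 5)), rng.integers(0, 3, size=10)
W = 0.01 * rng.normal(size=(5, 3))
loss, dW = svm_loss_grad(W, X, y)
print(loss, dW.shape)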