0 votes · 0 answers · 40 views

Torch gradient estimates disagreeing with analytic and perturbation-approximated gradients

I'm faced with a problem where, as the title says, I'm having trouble with the torch package's built-in automatic differentiation algorithms (or my usage of them?). I think it was meant to be used on mini-...
asked by Nomi Mino
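
A common way to localize this kind of disagreement is to compare the autograd gradient against a central-difference approximation in double precision (float32 alone can explain sizable mismatches). A minimal sketch, where f is a stand-in for the actual loss; torch.autograd.gradcheck automates the same comparison:

```python
import torch

def f(x):
    # Stand-in scalar function; substitute the actual loss here.
    return (x ** 2).sum()

x = torch.randn(5, dtype=torch.float64, requires_grad=True)

# Autograd gradient.
loss = f(x)
loss.backward()
grad_autograd = x.grad.clone()

# Central-difference approximation, one coordinate at a time.
eps = 1e-6
grad_fd = torch.zeros_like(x)
with torch.no_grad():
    for i in range(x.numel()):
        e = torch.zeros_like(x)
        e[i] = eps
        grad_fd[i] = (f(x + e) - f(x - e)) / (2 * eps)

print(torch.allclose(grad_autograd, grad_fd, atol=1e-4))
```

If the two agree on small inputs but not on the full model, the mismatch usually comes from non-differentiable ops, in-place mutation, or accumulation order rather than from autograd itself.
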
1 vote · 0 answers · 14 views

MATLAB Reinforcement Learning: issue with obtaining gradient from Q-value critic using dlfeval, dlgradient, dlarray

I'm trying to implement a custom agent, and inside my agent I'm running into issues with obtaining the gradient of the Q value with respect to my actor network parameters. I have my code below, main ...
asked by Sliferslacker
0 votes · 0 answers · 18 views

Deterministic minimization of a stochastic function with subgradient method

Problem: I have implemented several step-size strategies (classic, Polyak, and Adagrad), but my subgradient algorithm either diverges or fails to converge. Initially, I focused on the problem: Initial ...
asked by Titouan Brochard
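
For reference, the Polyak rule needs the optimal value f* and sets the step to (f(x) - f*) / ||g||^2. A minimal sketch on an illustrative nonsmooth objective (the l1 norm, where f* = 0); the asker's objective and subgradient would go in place of f and subgrad:

```python
import numpy as np

def f(x):
    # Illustrative nonsmooth objective: f(x) = ||x||_1.
    return np.abs(x).sum()

def subgrad(x):
    # A valid subgradient of the l1 norm (sign, with 0 at the kinks).
    return np.sign(x)

x = np.random.randn(10)
f_star = 0.0              # optimal value, required by the Polyak rule
f_best = f(x)

for k in range(1000):
    g = subgrad(x)
    denom = np.dot(g, g)
    if denom == 0:
        break
    t = (f(x) - f_star) / denom   # Polyak step size
    x = x - t * g
    f_best = min(f_best, f(x))    # track the best iterate: subgradient
                                  # methods are not descent methods
print(f_best)
```

Tracking f_best matters because individual subgradient steps can increase the objective even when the overall scheme converges.
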
2 votes · 2 answers · 73 views

SIR parameter estimation with gradient descent and autograd

I am trying to apply a very simple parameter estimation of a SIR model using a gradient descent algorithm. I am using the package autograd since the audience (this is for a sort of workshop for ...
asked by Alonso Ogueda Oliva
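
A minimal sketch of the general recipe, assuming forward-Euler integration and the HIPS autograd package; the data here are synthetic, and the learning rate and step count would need tuning for real data:

```python
import autograd.numpy as np
from autograd import grad

N_STEPS, DT = 100, 0.1
S0, I0 = 0.99, 0.01       # initial fractions of the population

def infected_trajectory(beta, gamma):
    # Forward-Euler integration of the SIR ODEs; R is omitted because
    # it does not feed back into the S and I dynamics.
    S, I = S0, I0
    traj = []
    for _ in range(N_STEPS):
        dS = -beta * S * I
        dI = beta * S * I - gamma * I
        S, I = S + DT * dS, I + DT * dI
        traj.append(I)
    return traj

# Synthetic "observed" infected curve from known parameters.
data = np.array(infected_trajectory(0.5, 0.1))

def loss(params):
    traj = infected_trajectory(params[0], params[1])
    return sum((I - d) ** 2 for I, d in zip(traj, data)) / N_STEPS

loss_grad = grad(loss)
params = np.array([0.3, 0.2])        # initial guess for (beta, gamma)
for _ in range(500):
    params = params - 0.5 * loss_grad(params)
print(params)                        # should drift toward (0.5, 0.1)
```

Building the trajectory as a plain Python list keeps autograd happy; in-place array assignment inside the loop would break its tracing.
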
1 vote · 1 answer · 35 views

Theta values for gradient descent not coherent

I made a gradient descent code but it doesn't seem to work well. import numpy as np from random import randint, random import matplotlib.pyplot as plt def calculh(theta, X): h = 0 h += theta[0]*X ...
asked by ismail rachid
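
For comparison, a compact, vectorized batch-gradient-descent loop with a bias column and simultaneous updates of both theta components (the data and values below are made up for illustration):

```python
import numpy as np

# Synthetic data: y = 4 + 3x + noise.
rng = np.random.default_rng(0)
X = 2 * rng.random(100)
y = 4 + 3 * X + rng.normal(0, 0.5, 100)

# Design matrix with a bias column, so theta = [intercept, slope].
Xb = np.column_stack([np.ones_like(X), X])

theta = np.zeros(2)
lr = 0.1
for _ in range(1000):
    residual = Xb @ theta - y             # h(theta, X) - y for every sample
    gradient = Xb.T @ residual / len(y)   # gradient for both thetas at once
    theta -= lr * gradient

print(theta)   # should approach [4, 3]
```

The usual bug in hand-rolled versions is updating theta[0] and then computing theta[1]'s gradient from the already-updated value; the vectorized form avoids that by construction.
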
0 votes · 0 answers · 59 views

Gradient descent 3D visualization Python

I've recently implemented a neural network from scratch and am now focusing on visualizing the optimization process. Specifically, I'm interested in creating a 3D visualization of the loss landscape ...
asked by Kris
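
One common approach is to evaluate the loss over a grid of two parameters (or two random directions in weight space, as in Li et al.'s loss-landscape visualization work) and draw it with matplotlib's plot_surface. A minimal sketch with an illustrative two-parameter surface standing in for the network's loss:

```python
import numpy as np
import matplotlib.pyplot as plt

def loss(w0, w1):
    # Illustrative two-parameter loss surface (stand-in for a network's
    # loss restricted to two directions in weight space).
    return w0 ** 2 + 3 * w1 ** 2 + 0.5 * np.sin(3 * w0)

# Grid over the two parameters.
w0, w1 = np.meshgrid(np.linspace(-2, 2, 100), np.linspace(-2, 2, 100))
z = loss(w0, w1)

# A gradient descent path on the same surface.
path = [np.array([1.8, 1.8])]
lr = 0.05
for _ in range(60):
    p0, p1 = path[-1]
    g = np.array([2 * p0 + 1.5 * np.cos(3 * p0), 6 * p1])  # analytic grad
    path.append(path[-1] - lr * g)
path = np.array(path)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(w0, w1, z, cmap="viridis", alpha=0.7)
ax.plot(path[:, 0], path[:, 1], loss(path[:, 0], path[:, 1]), "r.-")
ax.set_xlabel("w0")
ax.set_ylabel("w1")
ax.set_zlabel("loss")
plt.show()
```

For a real network the same plot works once the loss is evaluated on a grid of coefficients along two chosen directions in parameter space.
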
0 votes · 0 answers · 35 views

How to specify gradient computation path in a neural network in pytorch

I want to implement a neural network on pytorch where gradients are not computed over all the weights. Let's say for example I have an MLP with three layers and I want half of the nodes in the last ...
asked by danix
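
requires_grad is per-tensor in PyTorch, so freezing individual units is typically done by zeroing slices of the gradient with a tensor hook. A minimal sketch, assuming the goal is to freeze half of the last layer's output units:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                      nn.Linear(8, 6), nn.ReLU(),
                      nn.Linear(6, 2))

last = model[4]
half = last.out_features // 2

def mask_rows(grad):
    # Zero the gradient rows of the units we want frozen; the hook fires
    # on every backward pass for this parameter.
    grad = grad.clone()
    grad[half:] = 0
    return grad

last.weight.register_hook(mask_rows)
last.bias.register_hook(mask_rows)

x = torch.randn(3, 4)
loss = model(x).sum()
loss.backward()
print(last.weight.grad[half:].abs().sum())  # 0: those rows get no update
```

Gradients still flow *through* the masked units to earlier layers; the hook only stops those particular weights from being updated.
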
0 votes · 1 answer · 15 views

Global minimum as a starting point of Gradient Descent

If I already have the Global Minimum value for the Cost function of any model (including large language models) - would it facilitate Gradient Descent calculation? (suppose I have a quick way to ...
asked by Drout
0 votes · 1 answer · 46 views

Problem in Backpropagation through a sample in Beta distribution in pytorch

Say I have obtained some alphas and betas as parameters from a neural network, which will be parameters of the Beta distribution. Now, I sample from the Beta distribution and then calculate some loss ...
asked by Jimut123
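
The usual resolution: torch's Beta distribution supports the reparameterization trick, so drawing with rsample() keeps the sample attached to alpha and beta in the graph, while sample() detaches it. A minimal sketch:

```python
import torch

alpha = torch.tensor([2.0], requires_grad=True)
beta = torch.tensor([3.0], requires_grad=True)

dist = torch.distributions.Beta(alpha, beta)

# rsample() uses a pathwise (reparameterized) gradient; sample() would
# return a tensor with no connection back to alpha and beta.
s = dist.rsample()
loss = (s - 0.5) ** 2
loss.backward()
print(alpha.grad, beta.grad)   # both non-None
```

If the sample must come from sample() for some reason, the alternative is a score-function (REINFORCE) estimator built from dist.log_prob(s).
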
0 votes · 0 answers · 27 views

Issues when minimizing cost function in a simple linear regression

I'm quite new to ML and I'm trying to do a linear regression with quite a simple dataset. I did two different regressions, one by hand and the other one using scikit-learn, where in the latter I ...
asked by MIKEL LASS
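
A minimal sketch of the by-hand fit next to the scikit-learn reference, on made-up data; the hand-rolled loop should land on essentially the same coefficients provided the learning rate is below the stability threshold:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
x = rng.random(50) * 10
y = 2.5 * x + 1.0 + rng.normal(0, 1, 50)

# Hand-rolled gradient descent on the mean squared error.
w, b = 0.0, 0.0
lr = 0.01
for _ in range(5000):
    err = w * x + b - y
    w -= lr * np.mean(err * x)
    b -= lr * np.mean(err)

# scikit-learn reference fit.
ref = LinearRegression().fit(x.reshape(-1, 1), y)
print(w, b, "vs", ref.coef_[0], ref.intercept_)
```

If the two disagree, the usual culprits are a too-large learning rate (divergence), too few iterations, or a gradient missing the 1/N averaging.
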
0 votes · 1 answer · 37 views

Linear regression model barely optimizes the intercept b

I've programmed a linear regression model from scratch. I use the "Sum of squared residuals" as the loss function for gradient descent. For testing I use linear data (y = x). When running the ...
asked by Blacklight
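
One likely explanation: the gradient with respect to b is just the mean residual, so with uncentered inputs the slope direction dominates and the intercept converges far more slowly (the problem is ill-conditioned). A minimal sketch of the diagnosis on illustrative data with a nonzero true intercept:

```python
import numpy as np

x = np.linspace(0, 10, 100)
y = x + 5                        # true fit: w = 1, b = 5

w, b, lr = 0.0, 0.0, 0.01
for step in range(1001):
    err = w * x + b - y
    grad_w = np.mean(err * x)    # scales with x, so w takes big steps
    grad_b = np.mean(err)        # just the mean residual: b moves slowly
    w, b = w - lr * grad_w, b - lr * grad_b
    if step % 250 == 0:
        print(step, round(w, 3), round(b, 3))
# w reaches ~1 quickly while b is still short of 5; centering the inputs
# (x - x.mean()) decouples the two and lets b converge at the same rate.
```
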
0 votes · 0 answers · 31 views

Autograd on a specific layer’s parameters

I am trying to get the Jacobian matrix of a specific layer's parameters. Below is my network model; I apply functional_call on it. def fm(params, input): return functional_call(self.model, ...
asked by Klae zhou
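
Assuming a recent PyTorch with torch.func available, one way is to make the target layer's parameter the only differentiable argument and apply jacrev. A minimal sketch on a small stand-in model (the layer name "2.weight" is illustrative):

```python
import torch
import torch.nn as nn
from torch.func import functional_call, jacrev

model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))
x = torch.randn(4)

params = dict(model.named_parameters())
target = "2.weight"   # the parameter whose Jacobian we want

def fm(p_target, inp):
    # Rebuild the full parameter dict, swapping in p_target, so only
    # the target tensor is treated as the differentiable input.
    p = {k: (p_target if k == target else v) for k, v in params.items()}
    return functional_call(model, p, (inp,))

J = jacrev(fm)(params[target], x)
print(J.shape)   # (2, 2, 8): output dims x target-parameter dims
```

Because jacrev differentiates only its first argument, all other parameters are held fixed without any requires_grad bookkeeping.
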
0 votes · 0 answers · 19 views

Do we plug in the old values or the new values during the gradient descent update?

I have a scenario where I am trying to optimize a vector of D dimensions. Every component of the vector is dependent on other components according to a function such as: summation over (i,j): (1-e(x_i)(...
asked by Darkmoon Chief
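
The standard answer: plain gradient descent evaluates every partial derivative at the old iterate and updates all components at once; feeding already-updated components back in mid-step turns it into a Gauss-Seidel / coordinate-descent variant. A minimal sketch with an illustrative coupled objective:

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = x0^2 + (x1 - x0)^2, whose components are coupled.
    return np.array([2 * x[0] - 2 * (x[1] - x[0]),
                     2 * (x[1] - x[0])])

x = np.array([3.0, -2.0])
lr = 0.1
for _ in range(200):
    g = grad_f(x)        # every partial derivative uses the OLD x ...
    x = x - lr * g       # ... then all components are updated at once
print(x)                 # approaches [0, 0], the minimizer
```

Both variants can converge, but they are different algorithms with different convergence behavior, so mixing them unintentionally makes results hard to reproduce.
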
0 votes · 0 answers · 27 views

ValueError: One or more gradients are None, meaning no gradients are flowing

I'm trying to train a model I wrote after reading this paper: A lightweight model using frequency, trend and temporal attention for long sequence time-series prediction. Now, during the training I ...
asked by Giacomo Golino
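
A quick way to narrow this down is to run one backward pass and list the parameters whose .grad is still None: those are the ones cut off from the loss graph. A minimal sketch that deliberately reproduces the problem with detach():

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 1))
x = torch.randn(2, 4)

# Simulate the bug: detach() severs the graph between the two layers.
hidden = model[0](x).detach()
loss = model[1](hidden).sum()
loss.backward()

# Any parameter whose .grad is still None is disconnected from the loss.
for name, p in model.named_parameters():
    if p.grad is None:
        print("no gradient flowing to:", name)   # reports 0.weight, 0.bias

# Typical graph-breakers to look for near the reported parameters:
# tensor.detach(), tensor.item(), tensor.numpy(), rebuilding a value with
# torch.tensor(existing_tensor), or writing results into a fresh tensor
# created outside the graph.
```
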
-1 votes · 1 answer · 61 views

What is wrong with my gradient descent implementation (SVM classifier with hinge loss)

I am trying to implement and train an SVM multi-class classifier from scratch using python and numpy in jupyter notebooks. I have been using the CS231n course as my base of knowledge, especially this ...
asked by ho88it
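
For reference, the CS231n-style vectorized multiclass hinge loss and its gradient look like the sketch below (not the asker's code): each positive margin adds +x to its class column and -x to the correct-class column.

```python
import numpy as np

def svm_loss_and_grad(W, X, y, reg=1e-3):
    # W: (D, C) weights, X: (N, D) samples, y: (N,) integer class labels.
    N = X.shape[0]
    scores = X @ W                                   # (N, C)
    correct = scores[np.arange(N), y][:, None]
    margins = np.maximum(0, scores - correct + 1.0)  # delta = 1
    margins[np.arange(N), y] = 0
    loss = margins.sum() / N + reg * np.sum(W * W)

    # Each positive margin contributes +x to its class column and -x to
    # the correct-class column.
    binary = (margins > 0).astype(float)
    binary[np.arange(N), y] = -binary.sum(axis=1)
    dW = X.T @ binary / N + 2 * reg * W
    return loss, dW
```

Comparing a hand-written gradient against this form (or against a numerical gradient check) usually pinpoints whether the bug is in the loss, the gradient, or the update loop.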
