0 votes
1 answer
69 views

If you use cosine decay, for example, and you have a starting learning rate and a final learning rate, can you tune those hyperparameters so that the final learning rate is some ratio of the starting learning ...
asked by ict
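One way to get this behaviour is to set the scheduler's floor from the ratio. A minimal sketch, assuming PyTorch's CosineAnnealingLR (the question does not name a framework) and a hypothetical 1% final-to-start ratio:

import torch

model = torch.nn.Linear(10, 2)
start_lr = 1e-3
final_ratio = 0.01                      # final LR = 1% of the starting LR (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=start_lr)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer,
    T_max=100,                          # number of epochs over which to decay
    eta_min=start_lr * final_ratio,     # ties the final LR to the starting LR
)

for epoch in range(100):
    # ... forward/backward for the epoch would go here ...
    optimizer.step()
    scheduler.step()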
1 vote
1 answer
200 views

I'm trying to update the learning rate of my Keras model dynamically during training. I'm using the following code: import tensorflow as tf from tensorflow.keras import backend as K model = keras....
asked by codebysumit
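For context, a callback is one common way to change a Keras learning rate mid-training; a minimal sketch assuming tf.keras.callbacks.LearningRateScheduler with a hypothetical halving schedule (directly assigning model.optimizer.learning_rate between epochs is another frequently used route):

import tensorflow as tf

def schedule(epoch, lr):
    # Halve the learning rate every 10 epochs (illustrative choice).
    return lr * 0.5 if epoch > 0 and epoch % 10 == 0 else lr

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")

model.fit(
    tf.random.normal((64, 4)), tf.random.normal((64, 1)),
    epochs=30,
    callbacks=[tf.keras.callbacks.LearningRateScheduler(schedule, verbose=1)],
)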
1 vote
0 answers
23 views

I have been trying to implement a ReduceLROnPlateau scheduler that allows tracking of the loss history and automatically updates the learning rate whenever a loss plateau is detected. I have made ...
asked by ririririri
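A minimal, framework-agnostic sketch of such a scheduler; the class and its parameter names mirror the usual ReduceLROnPlateau arguments but are otherwise hypothetical:

class PlateauScheduler:
    def __init__(self, lr, factor=0.1, patience=5, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.history = []          # full loss history, kept for later inspection
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        self.history.append(loss)
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Plateau detected: shrink the LR, but never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr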
2 votes
1 answer
2k views

I am working on fine-tuning BLIP-2 on the RSICD dataset using LoRA. I am working on colab, using an A100. I am strangely finding that when I set the learning rate in the code below, it has no effect. ...
asked by Paul
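When a configured learning rate appears to have no effect, one quick sanity check is to read the value the optimizer actually holds. A minimal sketch, assuming PyTorch, with a placeholder model standing in for the PEFT-wrapped BLIP-2:

import torch

model = torch.nn.Linear(8, 2)                     # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# The value each parameter group actually uses:
for i, group in enumerate(optimizer.param_groups):
    print(f"param group {i}: lr = {group['lr']}")
# If a scheduler is attached, scheduler.get_last_lr() shows the value after
# warmup/decay, which is what each step really applies.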
1 vote
0 answers
576 views

Could I ask you for help? I am fine-tuning the LLM Llama3 8b (with LoRA) for text classification, using the Trainer from Hugging Face. I am looking for the optimal ...
asked by Roman Frič
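For reference, the learning-rate knobs exposed by the Hugging Face Trainer live in TrainingArguments. A minimal sketch; the concrete values are hypothetical starting points, not tuned recommendations:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-4,            # LoRA fine-tunes often use a larger LR than full fine-tuning
    lr_scheduler_type="cosine",    # any scheduler name supported by transformers
    warmup_ratio=0.03,             # fraction of training spent warming up
    logging_steps=10,              # log loss and LR so different settings can be compared
)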
1 vote
0 answers
260 views

What are the best practices for optimizing batch size and learning rate in training Large Language Models (LLMs)? How should these hyperparameters be adjusted relative to each other for efficient ...
asked by Arman Asgharpoor
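A commonly cited heuristic (not a guarantee) is the linear scaling rule: grow the learning rate in proportion to the batch size, usually together with a warmup. A worked sketch with hypothetical base values:

base_lr = 1e-4
base_batch_size = 32

def scaled_lr(batch_size, base_lr=base_lr, base_bs=base_batch_size):
    # Linear scaling rule: a k-fold larger batch gets a k-fold larger LR.
    return base_lr * batch_size / base_bs

for bs in (32, 64, 128, 256):
    print(bs, scaled_lr(bs))
# 32 -> 1e-4, 64 -> 2e-4, 128 -> 4e-4, 256 -> 8e-4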
3 votes
1 answer
927 views

I am having a problem with printing (logging) the learning rate per epoch in PyTorch Lightning (PL). TensorFlow logs the learning rate by default. As the PL guide suggested, I wrote the following code: class ...
asked by Tae-Sung Shin
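For reference, Lightning ships a callback for exactly this. A minimal sketch, assuming the pytorch_lightning package layout (newer releases import from lightning.pytorch instead):

import pytorch_lightning as pl
from pytorch_lightning.callbacks import LearningRateMonitor

trainer = pl.Trainer(
    max_epochs=10,
    callbacks=[LearningRateMonitor(logging_interval="epoch")],  # logs the LR once per epoch
)
# Alternatively, inside the LightningModule:
#   def on_train_epoch_start(self):
#       self.log("lr", self.trainer.optimizers[0].param_groups[0]["lr"])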
2 votes
0 answers
76 views

How do I train a TensorFlow model using the Adam optimizer with a learning rate that decays during training, in TensorFlow.js (not Python)? I cannot find that the library provides an exponential decay ...
asked by Oleg K
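A sketch of the manual pattern (shown here in Python/Keras, though the question is about TensorFlow.js): recompute the exponential decay each epoch and reassign the optimizer's learning rate, so no built-in schedule object is needed:

import tensorflow as tf

initial_lr = 1e-3
decay_rate = 0.96        # hypothetical per-epoch decay factor

optimizer = tf.keras.optimizers.Adam(learning_rate=initial_lr)

for epoch in range(20):
    new_lr = initial_lr * decay_rate ** epoch
    optimizer.learning_rate.assign(new_lr)   # reassign before this epoch's updates
    # ... run the epoch's training steps with this optimizer ...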
1 vote
1 answer
999 views

I have been trying to write a Lightning module using both a warmup and the annealing scheduler ReduceLROnPlateau, and something really odd is happening. If the program reduces the learning rate, the ...
asked by GZinn
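A minimal sketch of combining the two, assuming PyTorch Lightning: ReduceLROnPlateau is returned with a monitored metric, and warmup scales the param-group LR directly during the first steps. This is an illustrative skeleton (training_step and data omitted), not the asker's code:

import torch
import pytorch_lightning as pl

class Model(pl.LightningModule):
    def __init__(self, lr=1e-3, warmup_steps=500):
        super().__init__()
        self.save_hyperparameters()
        self.net = torch.nn.Linear(16, 1)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
            optimizer, mode="min", factor=0.5, patience=3
        )
        # Lightning needs to know which logged metric the plateau scheduler watches.
        return {
            "optimizer": optimizer,
            "lr_scheduler": {"scheduler": scheduler, "monitor": "val_loss"},
        }

    def on_train_batch_start(self, batch, batch_idx):
        # Linear warmup: scale the LR up to its target over the first steps only,
        # so later plateau reductions are left untouched.
        if self.global_step < self.hparams.warmup_steps:
            scale = (self.global_step + 1) / self.hparams.warmup_steps
            for group in self.trainer.optimizers[0].param_groups:
                group["lr"] = self.hparams.lr * scale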
1 vote
3 answers
6k views

I'm training a model with the following parameters: Seq2SeqTrainingArguments( output_dir = "./out", overwrite_output_dir = True, do_train ...
asked by user3668129
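For reference, the learning-rate-related fields of Seq2SeqTrainingArguments and their Trainer defaults; apart from the fields quoted in the question, the values shown are only illustrative:

from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="./out",
    overwrite_output_dir=True,
    do_train=True,
    learning_rate=5e-5,            # 5e-5 is also the Trainer default
    lr_scheduler_type="linear",    # linear decay to 0 is the Trainer default
    warmup_steps=0,
    logging_steps=50,              # the Trainer logs the current LR alongside the loss
)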
2 votes
1 answer
551 views

I'm using PyTorch Lightning's LR Finder but am getting an atypical curve. The loss starts at its lowest point when the learning rate is at its smallest, increases until it plateaus, and then exhibits ...
asked by keving
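When the curve looks atypical, it can help to plot it over an explicit range rather than relying on the suggestion alone. A minimal sketch, assuming pytorch_lightning's Tuner; model stands for the asker's LightningModule and the bounds are hypothetical:

import pytorch_lightning as pl
from pytorch_lightning.tuner import Tuner

trainer = pl.Trainer(max_epochs=1)
tuner = Tuner(trainer)

lr_finder = tuner.lr_find(model, min_lr=1e-7, max_lr=1.0)  # model: your LightningModule
fig = lr_finder.plot(suggest=True)     # loss vs. LR, with the suggested point marked
fig.savefig("lr_curve.png")
print("suggested lr:", lr_finder.suggestion())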
0 votes
2 answers
1k views

I'm trying to find an optimal learning rate using PyTorch Lightning's pl.tuner.Tuner, but the results aren't as expected. The model I am running is a linear classifier on top of a BertForSequenceClassification AutoModel ...
asked by Toby
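For the Tuner to tune anything, the LightningModule has to expose the learning rate as a hyperparameter that configure_optimizers reads back. A minimal sketch, assuming a Hugging Face BertForSequenceClassification wrapped in Lightning; the extra linear head and data are omitted:

import torch
import pytorch_lightning as pl
from transformers import BertForSequenceClassification

class BertClassifier(pl.LightningModule):
    def __init__(self, learning_rate=2e-5):
        super().__init__()
        self.save_hyperparameters()        # makes hparams.learning_rate visible to the Tuner
        self.model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

    def training_step(self, batch, batch_idx):
        out = self.model(**batch)          # batch must contain labels for out.loss
        self.log("train_loss", out.loss)
        return out.loss

    def configure_optimizers(self):
        # The Tuner rewrites hparams.learning_rate, so read it here, not a constant.
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.learning_rate)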
5 votes
1 answer
5k views

Loss functions in PyTorch use "mean" reduction by default, which means that the model gradient will have roughly the same magnitude for any batch size. It makes sense that you want to scale the ...
asked by offchan
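A small check of that claim: with "mean" reduction the gradient norm stays roughly flat across batch sizes, while with "sum" it grows with the batch. The linear model and random data are arbitrary:

import torch

torch.manual_seed(0)
w = torch.randn(10, 1)

for batch_size in (8, 64, 512):
    x = torch.randn(batch_size, 10)
    y = torch.randn(batch_size, 1)
    for reduction in ("mean", "sum"):
        weight = w.clone().requires_grad_(True)
        loss = torch.nn.functional.mse_loss(x @ weight, y, reduction=reduction)
        loss.backward()
        print(f"bs={batch_size:4d} reduction={reduction:4s} "
              f"grad norm={weight.grad.norm():.3f}")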
1 vote
1 answer
476 views

When using Lightning's built-in LR finder: # Create a Tuner tuner = Tuner(trainer) # finds learning rate automatically # sets hparams.lr or hparams.learning_rate to that learning rate tuner....
asked by Gabi Gubu
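A minimal sketch of the complete call the truncated comment block describes, assuming pytorch_lightning's Tuner; trainer and model stand for the asker's objects:

from pytorch_lightning.tuner import Tuner

tuner = Tuner(trainer)
lr_finder = tuner.lr_find(model)                 # runs the short LR sweep
print("suggestion:", lr_finder.suggestion())
print("attribute set to:", model.hparams.lr)     # or hparams.learning_rate, whichever exists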
2 votes
1 answer
3k views

I'm new to PyTorch and am working on a toy example to understand how weight decay interacts with the learning rate passed into the optimizer. When I use MultiStepLR, I was expecting it to decrease the learning ...
asked by whitepanda
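For reference, MultiStepLR multiplies the learning rate by gamma each time the step count crosses a milestone (it is separate from the optimizer's weight_decay argument). A minimal sketch with illustrative milestones:

import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 10], gamma=0.1)

for epoch in range(15):
    optimizer.step()                       # a real loop would do forward/backward first
    scheduler.step()
    print(epoch, scheduler.get_last_lr())  # 0.1, then 0.01 after milestone 5, 0.001 after 10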
