0 votes
1 answer
69 views

If you use cosine decay, for example, and you have a starting learning rate and a final learning rate, can you tune those hyperparameters so that the final learning rate is some ratio of the starting learning ...
asked by ict
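One way to get this behaviour is to set the scheduler's floor from the ratio. A minimal sketch, assuming PyTorch's CosineAnnealingLR (the question does not name a framework) and a hypothetical 1% final-to-start ratio:

import torch

model = torch.nn.Linear(10, 2)
start_lr = 1e-3
final_ratio = 0.01                      # final LR = 1% of the starting LR (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=start_lr)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer,
    T_max=100,                          # number of epochs over which to decay
    eta_min=start_lr * final_ratio,     # ties the final LR to the starting LR
)

for epoch in range(100):
    # ... forward/backward for the epoch would go here ...
    optimizer.step()
    scheduler.step()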
1 vote
1 answer
200 views

I'm trying to update the learning rate of my Keras model dynamically during training. I'm using the following code: import tensorflow as tf from tensorflow.keras import backend as K model = keras....
asked by codebysumit
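For context, a callback is one common way to change a Keras learning rate mid-training; a minimal sketch assuming tf.keras.callbacks.LearningRateScheduler with a hypothetical halving schedule (directly assigning model.optimizer.learning_rate between epochs is another frequently used route):

import tensorflow as tf

def schedule(epoch, lr):
    # Halve the learning rate every 10 epochs (illustrative choice).
    return lr * 0.5 if epoch > 0 and epoch % 10 == 0 else lr

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")

model.fit(
    tf.random.normal((64, 4)), tf.random.normal((64, 1)),
    epochs=30,
    callbacks=[tf.keras.callbacks.LearningRateScheduler(schedule, verbose=1)],
)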
1 vote
0 answers
23 views

I have been trying to implement a ReduceLROnPlateau scheduler that allows tracking of the loss history and automatically updates the learning rate whenever a loss plateau is detected. I have made ...
asked by ririririri
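A minimal, framework-agnostic sketch of such a scheduler; the class and its parameter names mirror the usual ReduceLROnPlateau arguments but are otherwise hypothetical:

class PlateauScheduler:
    def __init__(self, lr, factor=0.1, patience=5, min_lr=1e-6):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.history = []          # full loss history, kept for later inspection
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        self.history.append(loss)
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Plateau detected: shrink the LR, but never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr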
2 votes
1 answer
2k views

I am working on fine-tuning BLIP-2 on the RSICD dataset using LoRA. I am working on colab, using an A100. I am strangely finding that when I set the learning rate in the code below, it has no effect. ...
asked by Paul
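When a configured learning rate appears to have no effect, one quick sanity check is to read the value the optimizer actually holds. A minimal sketch, assuming PyTorch, with a placeholder model standing in for the PEFT-wrapped BLIP-2:

import torch

model = torch.nn.Linear(8, 2)                     # placeholder for the real model
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# The value each parameter group actually uses:
for i, group in enumerate(optimizer.param_groups):
    print(f"param group {i}: lr = {group['lr']}")
# If a scheduler is attached, scheduler.get_last_lr() shows the value after
# warmup/decay, which is what each step really applies.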
1 vote
0 answers
576 views

Could I ask you for help? I am fine-tuning the LLM Llama3 8b (with LoRA) for text classification, using the Trainer from Hugging Face. I am looking for the optimal ...
asked by Roman Frič
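For reference, the learning-rate knobs exposed by the Hugging Face Trainer live in TrainingArguments. A minimal sketch; the concrete values are hypothetical starting points, not tuned recommendations:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./out",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-4,            # LoRA fine-tunes often use a larger LR than full fine-tuning
    lr_scheduler_type="cosine",    # any scheduler name supported by transformers
    warmup_ratio=0.03,             # fraction of training spent warming up
    logging_steps=10,              # log loss and LR so different settings can be compared
)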
1 vote
0 answers
260 views

What are the best practices for optimizing batch size and learning rate in training Large Language Models (LLMs)? How should these hyperparameters be adjusted relative to each other for efficient ...
asked by Arman Asgharpoor
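A commonly cited heuristic (not a guarantee) is the linear scaling rule: grow the learning rate in proportion to the batch size, usually together with a warmup. A worked sketch with hypothetical base values:

base_lr = 1e-4
base_batch_size = 32

def scaled_lr(batch_size, base_lr=base_lr, base_bs=base_batch_size):
    # Linear scaling rule: a k-fold larger batch gets a k-fold larger LR.
    return base_lr * batch_size / base_bs

for bs in (32, 64, 128, 256):
    print(bs, scaled_lr(bs))
# 32 -> 1e-4, 64 -> 2e-4, 128 -> 4e-4, 256 -> 8e-4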
3 votes
1 answer
927 views

I am having a problem with printing (logging) the learning rate per epoch in PyTorch Lightning (PL). TensorFlow logs the learning rate by default. As the PL guide suggested, I wrote the following code: class ...
asked by Tae-Sung Shin
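For reference, Lightning ships a callback for exactly this. A minimal sketch, assuming the pytorch_lightning package layout (newer releases import from lightning.pytorch instead):

import pytorch_lightning as pl
from pytorch_lightning.callbacks import LearningRateMonitor

trainer = pl.Trainer(
    max_epochs=10,
    callbacks=[LearningRateMonitor(logging_interval="epoch")],  # logs the LR once per epoch
)
# Alternatively, inside the LightningModule:
#   def on_train_epoch_start(self):
#       self.log("lr", self.trainer.optimizers[0].param_groups[0]["lr"])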
2 votes
0 answers
76 views

How do I train a TensorFlow model using the Adam optimizer with a learning rate that decays during training, in TensorFlow.js (not Python)? I cannot find that the library provides an exponential decay ...
asked by Oleg K
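A sketch of the manual pattern (shown here in Python/Keras, though the question is about TensorFlow.js): recompute the exponential decay each epoch and reassign the optimizer's learning rate, so no built-in schedule object is needed:

import tensorflow as tf

initial_lr = 1e-3
decay_rate = 0.96        # hypothetical per-epoch decay factor

optimizer = tf.keras.optimizers.Adam(learning_rate=initial_lr)

for epoch in range(20):
    new_lr = initial_lr * decay_rate ** epoch
    optimizer.learning_rate.assign(new_lr)   # reassign before this epoch's updates
    # ... run the epoch's training steps with this optimizer ...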
1 vote
1 answer
999 views

I have been trying to write a Lightning module using both a warmup and the annealing scheduler ReduceLROnPlateau, and something really odd is happening. If the program reduces the learning rate, the ...
asked by GZinn
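A minimal sketch of combining the two, assuming PyTorch Lightning: ReduceLROnPlateau is returned with a monitored metric, and warmup scales the param-group LR directly during the first steps. This is an illustrative skeleton (training_step and data omitted), not the asker's code:

import torch
import pytorch_lightning as pl

class Model(pl.LightningModule):
    def __init__(self, lr=1e-3, warmup_steps=500):
        super().__init__()
        self.save_hyperparameters()
        self.net = torch.nn.Linear(16, 1)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
            optimizer, mode="min", factor=0.5, patience=3
        )
        # Lightning needs to know which logged metric the plateau scheduler watches.
        return {
            "optimizer": optimizer,
            "lr_scheduler": {"scheduler": scheduler, "monitor": "val_loss"},
        }

    def on_train_batch_start(self, batch, batch_idx):
        # Linear warmup: scale the LR up to its target over the first steps only,
        # so later plateau reductions are left untouched.
        if self.global_step < self.hparams.warmup_steps:
            scale = (self.global_step + 1) / self.hparams.warmup_steps
            for group in self.trainer.optimizers[0].param_groups:
                group["lr"] = self.hparams.lr * scale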
1 vote
3 answers
6k views

I'm training a model with the following parameters: Seq2SeqTrainingArguments( output_dir = "./out", overwrite_output_dir = True, do_train ...
asked by user3668129
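For reference, the learning-rate-related fields of Seq2SeqTrainingArguments and their Trainer defaults; apart from the fields quoted in the question, the values shown are only illustrative:

from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="./out",
    overwrite_output_dir=True,
    do_train=True,
    learning_rate=5e-5,            # 5e-5 is also the Trainer default
    lr_scheduler_type="linear",    # linear decay to 0 is the Trainer default
    warmup_steps=0,
    logging_steps=50,              # the Trainer logs the current LR alongside the loss
)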
2 votes
1 answer
551 views

I'm using PyTorch Lightning's LR Finder but am getting an atypical curve. The loss starts at its lowest point when the learning rate is at its smallest, increases until it plateaus, and then exhibits ...
asked by keving
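When the curve looks atypical, it can help to plot it over an explicit range rather than relying on the suggestion alone. A minimal sketch, assuming pytorch_lightning's Tuner; model stands for the asker's LightningModule and the bounds are hypothetical:

import pytorch_lightning as pl
from pytorch_lightning.tuner import Tuner

trainer = pl.Trainer(max_epochs=1)
tuner = Tuner(trainer)

lr_finder = tuner.lr_find(model, min_lr=1e-7, max_lr=1.0)  # model: your LightningModule
fig = lr_finder.plot(suggest=True)     # loss vs. LR, with the suggested point marked
fig.savefig("lr_curve.png")
print("suggested lr:", lr_finder.suggestion())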
0 votes
2 answers
1k views

I'm trying to find an optimal learning rate using PyTorch Lightning's pl.tuner.Tuner, but the results aren't as expected. The model I am running is a linear classifier on top of a BertForSequenceClassification AutoModel ...
asked by Toby
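For the Tuner to tune anything, the LightningModule has to expose the learning rate as a hyperparameter that configure_optimizers reads back. A minimal sketch, assuming a Hugging Face BertForSequenceClassification wrapped in Lightning; the extra linear head and data are omitted:

import torch
import pytorch_lightning as pl
from transformers import BertForSequenceClassification

class BertClassifier(pl.LightningModule):
    def __init__(self, learning_rate=2e-5):
        super().__init__()
        self.save_hyperparameters()        # makes hparams.learning_rate visible to the Tuner
        self.model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

    def training_step(self, batch, batch_idx):
        out = self.model(**batch)          # batch must contain labels for out.loss
        self.log("train_loss", out.loss)
        return out.loss

    def configure_optimizers(self):
        # The Tuner rewrites hparams.learning_rate, so read it here, not a constant.
        return torch.optim.AdamW(self.parameters(), lr=self.hparams.learning_rate)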
5 votes
1 answer
5k views

Loss functions in PyTorch use "mean" reduction by default, which means that the model gradient will have roughly the same magnitude for any batch size. It makes sense that you want to scale the ...
asked by offchan
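A small check of that claim: with "mean" reduction the gradient norm stays roughly flat across batch sizes, while with "sum" it grows with the batch. The linear model and random data are arbitrary:

import torch

torch.manual_seed(0)
w = torch.randn(10, 1)

for batch_size in (8, 64, 512):
    x = torch.randn(batch_size, 10)
    y = torch.randn(batch_size, 1)
    for reduction in ("mean", "sum"):
        weight = w.clone().requires_grad_(True)
        loss = torch.nn.functional.mse_loss(x @ weight, y, reduction=reduction)
        loss.backward()
        print(f"bs={batch_size:4d} reduction={reduction:4s} "
              f"grad norm={weight.grad.norm():.3f}")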
1 vote
1 answer
476 views

When using Lightning's built-in LR finder: # Create a Tuner tuner = Tuner(trainer) # finds learning rate automatically # sets hparams.lr or hparams.learning_rate to that learning rate tuner....
asked by Gabi Gubu
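A minimal sketch of the complete call the truncated comment block describes, assuming pytorch_lightning's Tuner; trainer and model stand for the asker's objects:

from pytorch_lightning.tuner import Tuner

tuner = Tuner(trainer)
lr_finder = tuner.lr_find(model)                 # runs the short LR sweep
print("suggestion:", lr_finder.suggestion())
print("attribute set to:", model.hparams.lr)     # or hparams.learning_rate, whichever exists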
2 votes
1 answer
3k views

I'm new to PyTorch and am working on a toy example to understand how weight decay interacts with the learning rate passed into the optimizer. When I use MultiStepLR, I was expecting it to decrease the learning ...
asked by whitepanda
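For reference, MultiStepLR multiplies the learning rate by gamma each time the step count crosses a milestone (it is separate from the optimizer's weight_decay argument). A minimal sketch with illustrative milestones:

import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 10], gamma=0.1)

for epoch in range(15):
    optimizer.step()                       # a real loop would do forward/backward first
    scheduler.step()
    print(epoch, scheduler.get_last_lr())  # 0.1, then 0.01 after milestone 5, 0.001 after 10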
