Timeline for answer to How to do gradient clipping in pytorch? by Rahul
Current License: CC BY-SA 4.0
Post Revisions
8 events
| when | what | by | license | comment |
|---|---|---|---|---|
| May 27, 2022 at 20:49 | comment added | russian_spy | | For "args.clip" you can use 0.01; e.g., `torch.nn.utils.clip_grad_norm_(model.parameters(), 0.01)` |
| Mar 29, 2022 at 2:49 | history edited | Mateen Ulhaq | CC BY-SA 4.0 | Move link inline. |
| Jan 28, 2022 at 6:45 | comment added | vdi | | @FarhangAmaji the max_norm (clipping threshold) value from the args (perhaps from the argparse module) |
| Jan 21, 2022 at 20:02 | comment added | Charlie Parker | | Does it matter if you call opt.zero_grad() before the forward pass or not? My guess is that the sooner it's zeroed out, perhaps the sooner memory freeing happens? |
| Dec 3, 2021 at 11:45 | comment added | Farhang Amaji | | What is args.clip? |
| Oct 29, 2020 at 15:33 | comment added | Rahul | | This simply follows a popular pattern, where one can insert `torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)` between the loss.backward() and optimizer.step() |
| Oct 28, 2020 at 11:26 | comment added | Gulzar | | Why is this more complete? I see the more votes, but don't really understand why this is better. Can you explain please? |
| May 10, 2019 at 1:12 | history answered | Rahul | CC BY-SA 4.0 | |