I was watching a tutorial on weight initialization in neural networks, and I'm not able to understand this statement:
"In the case of tanh or sigmoid activations, if we initialize the weights with large values (range [0,1)), then training becomes slow and the vanishing gradient problem may arise."
But how is that possible? I thought the vanishing gradient problem (VGP) is caused by small gradient values, which come from small weights or small outputs from the activation function.
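To make my confusion concrete, here is a quick numerical sketch I put together (my own toy example with a single tanh unit and made-up weight values, not from the tutorial):

```python
import numpy as np

# Toy single tanh unit: pre-activation z = w * x.
# The weight values below are hypothetical, just to compare magnitudes.
x = 1.0
for w in [0.01, 0.1, 1.0, 5.0]:
    z = w * x                      # pre-activation
    grad = 1.0 - np.tanh(z) ** 2   # d/dz tanh(z), the local gradient
    print(f"w={w:>5}: tanh(z)={np.tanh(z):+.4f}, tanh'(z)={grad:.4f}")
```

When I run this, the derivative is close to 1 for the small weights but nearly zero for w=5.0, where tanh is saturated. Is that saturation effect what the tutorial means, and if so, how does it square with the "small weights give small gradients" explanation I had in mind?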