Vanishing gradient problem

vanishing gradient, Vanishing gradients, on the site since June 16, 2023 22:17
This occurs when gradients become extremely small. During backpropagation, the chain rule multiplies each layer's local derivative into the gradient, so the gradient can shrink with every layer it passes through on the way back. Earlier layers therefore receive much smaller gradients and learn more slowly than later layers. In the extreme, the weight updates in the early layers are so close to zero that those layers effectively stop learning.
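The effect can be demonstrated with a small NumPy sketch (a hypothetical toy network, not any particular library's implementation): a stack of identical dense layers with sigmoid activations, whose derivative is at most 0.25, so the gradient norm tends to shrink at every layer during the backward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_layers = 10   # depth of the toy network (assumed for illustration)
dim = 32        # width of each layer (assumed for illustration)

# Random dense layers, scaled so activations stay in a reasonable range.
weights = [rng.normal(0.0, 1.0, (dim, dim)) / np.sqrt(dim) for _ in range(n_layers)]

# Forward pass, caching activations for backpropagation.
x = rng.normal(size=(dim,))
activations = [x]
for W in weights:
    x = sigmoid(W @ x)
    activations.append(x)

# Backward pass: start from an arbitrary upstream gradient of ones and
# record the gradient norm after passing through each layer.
grad = np.ones(dim)
grad_norms = []
for W, a in zip(reversed(weights), reversed(activations[1:])):
    grad = grad * a * (1.0 - a)   # sigmoid derivative: s'(z) = s(z)(1 - s(z)) <= 0.25
    grad = W.T @ grad             # back through the linear layer
    grad_norms.append(np.linalg.norm(grad))

# grad_norms[0] is the layer nearest the output; grad_norms[-1] the earliest layer.
print("near output:", grad_norms[0])
print("earliest layer:", grad_norms[-1])
```

Running this shows the gradient norm at the earliest layer is orders of magnitude smaller than near the output, which is exactly why the early layers' weights barely move.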