I am training a deep network for a regression task.
The loss drops initially but then plateaus, and it never improves even after many more epochs.
The model is clearly underperforming.
This happens when gradients vanish or the learning rate is too small to make progress.
Deep networks can get stuck in flat regions of the loss surface where weight updates become tiny. This is especially common with sigmoid or tanh activations in deep layers: both saturate, so their gradients shrink multiplicatively as they propagate backward through the stack.
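One quick way to confirm vanishing gradients is to print per-layer gradient norms after a backward pass: the norms decay sharply toward zero in the earliest layers. A minimal sketch, using a toy deep sigmoid network (the layer sizes and random data below are made up for illustration):

import torch
import torch.nn as nn

# Toy deep network with sigmoid activations to reproduce the symptom
model = nn.Sequential(*[m for _ in range(6) for m in (nn.Linear(32, 32), nn.Sigmoid())])
x, y = torch.randn(64, 32), torch.randn(64, 32)
loss = nn.MSELoss()(model(x), y)
loss.backward()

# With vanishing gradients, norms shrink from the last layer to the first
for name, p in model.named_parameters():
    print(f"{name}: grad norm = {p.grad.norm().item():.3e}")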
Switch to ReLU-based activations and use a modern optimizer like Adam:
import torch
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # adaptive per-parameter step sizes
Also verify that your inputs are normalized.
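For example, standardizing features with training-set statistics (a sketch; X_train and X_test are placeholder tensors):

# Standardize each feature to zero mean and unit variance
mean = X_train.mean(dim=0)
std = X_train.std(dim=0).clamp_min(1e-8)  # guard against constant features
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std  # always reuse the training statistics

For regression, it often helps to scale the targets the same way and undo the scaling at prediction time.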
Common mistakes (all addressed in the sketch below):
- Using sigmoid everywhere
- Learning rate too low
- Unscaled inputs
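Putting the fixes together, a minimal sketch of a ReLU regression setup trained with Adam (layer sizes, learning rate, and data are illustrative placeholders):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),  # linear output head for regression
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()

X = torch.randn(256, 16)  # stand-in for normalized inputs
y = torch.randn(256, 1)   # stand-in for regression targets

for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()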
The practical takeaway is that stagnation usually means gradients cannot move the weights anymore.