My RNN works fine on short sequences.
When I give it longer inputs, predictions become random.
Loss increases with sequence length.
It feels like the model forgets earlier information.
Why does my RNN produce very unstable predictions for longer sequences?
Anushrita Ghosh (Beginner)
This happens because standard RNNs suffer from vanishing and exploding gradients on long sequences.
As the sequence grows, the gradients that carry information from early time steps either shrink toward zero or grow without bound, which makes training unstable. That is why LSTM and GRU architectures were created.
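You can see the vanishing effect directly. The sketch below (illustrative, not your model: a small randomly initialized vanilla RNN on random inputs) measures how strongly the last output depends on the first input step as the sequence gets longer:

```python
import torch

torch.manual_seed(0)
rnn = torch.nn.RNN(input_size=8, hidden_size=8, batch_first=True)

grads = {}
for seq_len in (5, 50, 200):
    x = torch.randn(1, seq_len, 8, requires_grad=True)
    out, _ = rnn(x)
    out[:, -1].sum().backward()               # gradient of the LAST output...
    grads[seq_len] = x.grad[:, 0].norm().item()  # ...w.r.t. the FIRST time step
    print(seq_len, grads[seq_len])
```

With a typical initialization the printed gradient norm collapses as the sequence length grows, which is exactly why the early tokens stop influencing the prediction.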
Switch to LSTM or GRU layers and use gradient clipping:
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
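Putting both fixes together, a minimal training step looks like this. The model, shapes, and data below are made up for illustration; the important parts are the nn.LSTM layer and that clipping happens after loss.backward() and before optimizer.step():

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class SeqModel(nn.Module):
    def __init__(self, vocab=100, hidden=32, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)  # LSTM instead of vanilla RNN
        self.head = nn.Linear(hidden, classes)

    def forward(self, x):
        out, _ = self.lstm(self.emb(x))
        return self.head(out[:, -1])  # last time step -> class logits

model = SeqModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randint(0, 100, (4, 120))   # batch of 4 sequences, length 120
y = torch.randint(0, 2, (4,))

opt.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
# Clip AFTER backward(), BEFORE step() -- clipping earlier or later has no effect
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
opt.step()
```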
Common mistakes:
Using vanilla RNNs for long text
Not clipping gradients
Feeding very long sequences without truncation
The practical takeaway is that plain RNNs are not designed for long-term memory.
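On the truncation point: a common recipe is truncated backpropagation through time, where you split a long sequence into chunks and detach the hidden state at each chunk boundary so gradients never flow across more than one chunk. A hedged sketch, with illustrative names and sizes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=16, batch_first=True)
head = nn.Linear(16, 1)
opt = torch.optim.SGD(list(lstm.parameters()) + list(head.parameters()), lr=0.01)

long_x = torch.randn(1, 1000, 4)  # one very long sequence
long_y = torch.randn(1, 1000, 1)
chunk = 50
state = None

for start in range(0, long_x.size(1), chunk):
    xs = long_x[:, start:start + chunk]
    ys = long_y[:, start:start + chunk]
    out, state = lstm(xs, state)
    # detach() cuts the graph here, so backprop stops at the chunk boundary
    state = tuple(s.detach() for s in state)
    loss = nn.functional.mse_loss(head(out), ys)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The hidden state still carries information forward across chunks; only the gradient path is cut, which keeps memory and gradient magnitudes bounded.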