transformers
Decode Trail Latest Questions
Asked: December 22, 2025 | In: Deep Learning
I fine-tuned a Transformer model without any memory issues, but when I call model.generate(), CUDA runs out of memory. This happens even for short prompts. Training worked fine, so I find this confusing.
Asked: December 14, 2025 | In: Deep Learning
The training loss drops steadily during fine-tuning, but the translated sentences are grammatically wrong, and BLEU and other quality metrics do not improve. It feels like the model is optimizing the wrong thing.