I fine-tuned a Transformer model without any memory issues. But when I call model.generate(), CUDA runs out of memory. This happens even for short prompts. Training worked fine, so this feels confusing.
Decode Trail Latest Questions
My GAN generates images, but they look washed out. Many samples look almost identical. Training loss looks stable. But the visual quality never improves.
My model recognizes actions well in static camera videos. When the camera pans or shakes, predictions become unstable. The action is the same. Only the camera motion changes.
Asked: January 31, 2025 | In: Deep Learning
Short sequences work fine. Longer sequences cause GPU crashes. No code changes were made. Only input size increased.