I upgraded to a GPU with much more VRAM. I increased the batch size to use the available memory. Now training is noticeably slower per epoch. There are no errors, but performance feels worse than before.
Decode Trail Latest Questions
The base model worked well before. After fine-tuning on new data, accuracy drops everywhere. Even old categories are misclassified. The model seems to have forgotten what it knew.
I trained an object detection model on a mixed dataset containing people, vehicles, and small objects like phones and traffic signs. The model detects large objects such as cars and people very reliably. However, it almost completely ignores smaller objects, ...
My CNN reaches over 95% accuracy on the training set. But on the test set it drops below 40%. The data comes from the same source. I feel the model is memorizing instead of learning.
I added thousands of new user interactions to my training dataset. Instead of improving, the recommendation quality dropped. Users are now getting irrelevant suggestions. It feels like more data made the model less accurate.
I trained an LSTM for next-word prediction on text data. The training loss decreases normally. But when I generate text, it repeats the same token again and again. It feels like the model is ignoring the preceding context.
I fine-tuned a Transformer model without any memory issues. But when I call model.generate(), CUDA runs out of memory. This happens even for short prompts. Training worked fine, so this feels confusing.
The model produces grammatically correct text. But it keeps repeating the same phrases. The output never moves forward. It feels stuck in a loop.