I retrained my model with more recent data.
The assumption was that newer data would improve performance.
Instead, the new version performs worse in production.
This feels counterintuitive and frustrating.
Why does my retrained model perform worse than the previous version?
Sambhavesh PrajapatiBegginer
More recent data does not automatically mean better training data.
If the new dataset contains more noise, label errors, or short-term anomalies, the model may learn unstable patterns. Additionally, changes in class balance or feature availability can negatively affect performance.
Compare the old and new datasets directly. Look at label distributions, missing values, and feature coverage. Evaluate both models on the same fixed holdout dataset to isolate the effect of retraining.
If the model is sensitive to recent trends, consider weighting historical data rather than replacing it entirely. Some systems benefit from gradual updates instead of full retrains. The takeaway is that retraining should be treated as a controlled experiment, not an automatic improvement.