We collect logs, but during incidents they don’t answer key questions. Important details seem to be missing or hard to correlate. I’m trying to understand how to make logs more useful!
Training loss decreases smoothly. Validation loss fluctuates. Regularization is enabled. Still, generalization is poor.
I enabled autoscaling to handle traffic spikes. Instead of improving performance, latency increased. Cold starts seem frequent. This feels counterproductive.