My model recognizes actions well in static-camera videos.
When the camera pans or shakes, predictions become unstable.
The action is the same.
Only the camera motion changes.
This most likely happens because the model confuses camera motion with object motion. If it was trained only on static-camera footage, it has never seen the whole frame move at once, so it treats that global motion as part of the action.
Neural networks do not automatically separate camera movement from object movement. They must be shown examples where these effects differ.
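One way to show the model such examples is to simulate camera motion on the static clips you already have. The sketch below is only an illustration: it assumes each frame is a 2D list of pixel values, and the names `camera_shake_offsets`, `crop_with_offset`, and `augment_clip` are made up for this answer, not taken from any library. It moves a crop window across each frame with a steady pan plus random jitter, so the action stays the same while the apparent camera motion changes.

```python
import random

def camera_shake_offsets(num_frames, max_shift, pan_per_frame=0, seed=None):
    """Return one (dx, dy) crop offset per frame: a steady horizontal pan plus random jitter."""
    rng = random.Random(seed)
    offsets = []
    for t in range(num_frames):
        dx = pan_per_frame * t + rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        offsets.append((dx, dy))
    return offsets

def crop_with_offset(frame, offset, crop_h, crop_w):
    """Crop a window that starts centered, is shifted by offset, and is clamped to the frame."""
    h, w = len(frame), len(frame[0])
    dx, dy = offset
    top = min(max((h - crop_h) // 2 + dy, 0), h - crop_h)
    left = min(max((w - crop_w) // 2 + dx, 0), w - crop_w)
    return [row[left:left + crop_w] for row in frame[top:top + crop_h]]

def augment_clip(clip, crop_h, crop_w, max_shift=2, pan_per_frame=0, seed=None):
    """Apply simulated pan and shake to every frame of a clip (a list of frames)."""
    offsets = camera_shake_offsets(len(clip), max_shift, pan_per_frame, seed)
    return [crop_with_offset(f, o, crop_h, crop_w) for f, o in zip(clip, offsets)]
```

Applying this with a different seed each epoch gives the model many camera trajectories for the same labeled action. With real video tensors you would do the same thing with array slicing, and you may also want rotation or zoom, which this sketch leaves out.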
Three things improve robustness: stabilizing the video before inference, using optical flow to estimate and compensate for camera motion, and training on clips with diverse camera motions. The practical takeaway is that motion context matters as much as visual content.
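To make the optical flow idea concrete, here is a minimal sketch of camera-motion compensation. It assumes you already have a flow field as a list of `(u, v)` vectors from some flow estimator, and it assumes the camera motion is roughly a pure translation that covers most of the frame. Under those assumptions the median flow vector approximates the camera motion, and subtracting it leaves the object motion. The function name `remove_global_motion` is hypothetical, and a real pipeline would more likely fit an affine or homography model.

```python
from statistics import median

def remove_global_motion(flow):
    """Estimate camera translation as the median flow vector and subtract it.

    flow: non-empty list of (u, v) flow vectors, one per pixel or grid point.
    Returns ((cam_u, cam_v), residual), where residual has the same length as flow.
    """
    # Assumption: most vectors belong to the background, so the median tracks the camera.
    cam_u = median(u for u, _ in flow)
    cam_v = median(v for _, v in flow)
    residual = [(u - cam_u, v - cam_v) for u, v in flow]
    return (cam_u, cam_v), residual

# Toy example: seven background points pan right by 3 px, two object points move differently.
flow = [(3.0, 0.0)] * 7 + [(5.0, 1.0)] * 2
camera, residual = remove_global_motion(flow)
```

In the toy example the background residuals come out as zero and only the two object points keep non-zero motion, which is the signal the action model should be looking at. The median breaks down when the moving object fills more than half the frame, so treat this as a starting point rather than a full stabilizer.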