I upgraded to a GPU with much more VRAM.
I increased the batch size to use the available memory.
Now the training is noticeably slower per epoch.
There are no errors, but performance feels worse than before.
The slowdown comes from the larger batch size itself, not from the new GPU. Each step now processes more data, but the model takes fewer gradient updates per epoch, so an epoch can teach the model less even though raw throughput per example is higher.
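To make "fewer updates per epoch" concrete, here is a quick back-of-the-envelope calculation; the dataset size and batch sizes are made-up numbers, not anything from the question.

```python
# Hypothetical numbers purely for illustration.
dataset_size = 50_000

for batch_size in (64, 512):
    steps_per_epoch = dataset_size // batch_size
    print(f"batch_size={batch_size}: {steps_per_epoch} optimizer updates per epoch")

# batch_size=64: 781 optimizer updates per epoch
# batch_size=512: 97 optimizer updates per epoch
```

Roughly eight times fewer updates per pass over the data is the main reason an epoch appears to accomplish less.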
Large batches also average away more of the gradient noise, which weakens the implicit regularization that smaller batches naturally provide. With the learning rate left unchanged, the optimizer makes less progress per epoch and typically needs more epochs to reach the same loss. As a result, even though each batch runs faster, total training time to a given level of performance can go up.
To compensate, you usually need to scale the learning rate upward, commonly in proportion to the batch size and with a brief warmup. (Gradient accumulation is the mirror-image technique: it simulates a large batch when memory is limited, and it shifts the training dynamics in the same way.) Without these adjustments, more GPU memory simply changes the training dynamics instead of making the model better or faster.
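As a minimal sketch of the linear-scaling-plus-warmup heuristic, assuming a PyTorch SGD setup (the model, the base batch size of 64, and the base learning rate of 0.1 are placeholders, not values from the question):

```python
import torch
from torch import nn

# Placeholder model; in practice this is whatever network you were already training.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Assumed reference configuration (illustrative numbers only).
base_batch_size = 64
base_lr = 0.1

# After the upgrade the batch size was raised; scale the learning rate to match.
new_batch_size = 512
scaled_lr = base_lr * (new_batch_size / base_batch_size)  # linear scaling rule -> 0.8

optimizer = torch.optim.SGD(model.parameters(), lr=scaled_lr, momentum=0.9)

# A short warmup is usually paired with the scaled learning rate so the first
# updates are not too aggressive. Call scheduler.step() after each optimizer.step().
warmup_steps = 500
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps)
)
```

The linear-scaling rule is a heuristic, not a guarantee; you still need to watch the validation curve, and it can break down at very large batch sizes.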
Common mistakes:
Increasing batch size without adjusting learning rate
Assuming more VRAM always improves training
Ignoring convergence behavior when comparing runs (see the sketch after this list)
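For the last point, the sketch below is a self-contained toy (synthetic data, tiny placeholder model) that records loss per optimizer step, so runs with different batch sizes can be compared by number of updates rather than by epoch count:

```python
import torch
from torch import nn

# Self-contained toy: synthetic data and a tiny model stand in for the real training setup.
torch.manual_seed(0)
X = torch.randn(2048, 20)
y = (X.sum(dim=1, keepdim=True) > 0).float()

def run(batch_size, lr, epochs=5):
    model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    history = []  # (optimizer_step, loss): compare runs per update, not per epoch
    step = 0
    for _ in range(epochs):
        for i in range(0, len(X), batch_size):
            xb, yb = X[i:i + batch_size], y[i:i + batch_size]
            loss = loss_fn(model(xb), yb)
            opt.zero_grad()
            loss.backward()
            opt.step()
            step += 1
            history.append((step, loss.item()))
    return history

# Same epoch budget, very different numbers of optimizer updates:
small = run(batch_size=64, lr=0.1)
large = run(batch_size=512, lr=0.1)
print(len(small), "updates vs.", len(large), "updates over the same number of epochs")
```

Plotting the two histories against optimizer steps (or examples seen) makes the difference in convergence visible in a way that per-epoch numbers hide.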
The practical takeaway is that extra GPU memory lets you change the batch size, and batch size changes how learning happens, not just how much data you can fit on the device.