Sometimes I run into a problem:
OOM when allocating tensor with shape (1024, 100, 160)
Here, 1024 is my batch size, but I don't know what the remaining dimensions refer to. If I reduce the batch size or the number of neurons in the model, it runs fine.
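Even without knowing what the last two dimensions mean, I can at least estimate what that one tensor costs (assuming float32, which I believe is the TensorFlow default):

```python
import numpy as np

# Size of the tensor from the error message, assuming 4 bytes per float32 element.
shape = (1024, 100, 160)            # (batch_size, ?, ?)
n_bytes = np.prod(shape) * 4        # 65,536,000 bytes
print(n_bytes / 2**20, "MiB")       # 62.5 MiB -- and that's just this one tensor
```

So a single activation tensor of that shape is around 62 MiB, and presumably there are many such tensors alive at once during training.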
Is there a generic way to calculate the optimal batch size based on the model and the available GPU memory, so that the program doesn't crash?
Since my question might seem unclear, let me put it this way: I want the largest batch size possible for my model that will fit into my GPU memory and won't crash the program.
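I can do a back-of-the-envelope version of this myself, something like the sketch below, but I'd like to know whether a principled method exists. Note that the 4x multiplier for gradients/optimizer state and the 0.8 usable-memory fraction are my guesses, and I'm assuming tf.keras, where layers expose an output_shape property:

```python
import numpy as np

def estimate_max_batch_size(model, gpu_memory_bytes, dtype_bytes=4, usable_fraction=0.8):
    """Back-of-the-envelope guess, not an exact algorithm.

    Assumes float32 weights/activations, and that memory is dominated by
    the weights (plus gradients and optimizer state, guessed as 4x the
    weights) and the per-sample activations.
    """
    # Fixed cost: weights + gradients + optimizer slots (rough 4x multiplier).
    fixed_bytes = model.count_params() * dtype_bytes * 4

    # Per-sample cost: sum of every layer's output elements.
    elements_per_sample = 0
    for layer in model.layers:
        shapes = layer.output_shape
        if not isinstance(shapes, list):   # single-output layers
            shapes = [shapes]
        for shape in shapes:
            # Drop the batch dimension (None) and multiply the rest.
            elements_per_sample += np.prod([d for d in shape if d is not None])
    per_sample_bytes = elements_per_sample * dtype_bytes

    usable = gpu_memory_bytes * usable_fraction - fixed_bytes
    return max(1, int(usable // per_sample_bytes))
```

I would call this as, say, estimate_max_batch_size(model, 8 * 2**30) for an 8 GB card, but it ignores cuDNN workspaces, memory fragmentation, temporary buffers, and so on, which is exactly why I'm asking whether a proper algorithm exists.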
To whoever voted to close the question as too broad: how on earth is it too broad? There is some algorithm that selects a portion of the data to put into GPU memory. It is clearly imperfect, since the data sometimes exceeds the GPU memory. Asking how that algorithm works, in order to prevent random crashes, seems quite reasonable to me.