I am using the Object Detection API and followed the procedure exactly as it is described on this site: tensorflow-object-detection-api
I am using this model: SSD ResNet50 V1 FPN 640×640 (RetinaNet50) from the Model Zoo.
I am running my training on a GTX 1070 Ti with 8 GB of VRAM, of which about 6.5 GB are available. I get the following error when I use a batch size greater than 2:
2022-01-24 23:28:40.444781: E tensorflow/stream_executor/cuda/cuda_driver.cc:802] failed to alloc 4294967296 bytes on host: CUDA_ERROR_OUT_OF_MEMORY: out of memory
To me this looks like it is trying to allocate only 4294967296 bytes (4 GiB), while I have about 8589900000 bytes (roughly 8 GiB) in total, so the failed allocation is only about 50% of that. During training with a batch size of 1, nvidia-smi shows 7488MiB/8192MiB of VRAM in use, and the system is using 14.6 GB of its 16 GB of RAM.
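To double-check my arithmetic, here is a quick sketch of the conversion. The two byte counts are the ones quoted above; the variable names and the helper `to_gib` are just for illustration. I am assuming both numbers refer to the same memory pool, which may be wrong, since the log line says "on host".

```python
# Byte counts quoted in the question; names here are illustrative only.
requested_bytes = 4294967296   # from the "failed to alloc ... bytes" log line
available_bytes = 8589900000   # my rough figure for the total memory

GIB = 1024 ** 3  # bytes per GiB


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


requested_gib = to_gib(requested_bytes)
available_gib = to_gib(available_bytes)
fraction = requested_bytes / available_bytes

print(f"requested: {requested_gib:.2f} GiB")
print(f"available: {available_gib:.2f} GiB")
print(f"fraction:  {fraction:.1%}")
```

So the failed request is a single 4 GiB allocation, about half of the figure I assumed to be available.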
Obviously, training with a batch size of 1 is useless. A batch size of 8 seems to be just about doable, but I don't understand why the limit is that low. Most people say a batch size of 64 should be possible with my hardware. Please correct me if I am wrong.