I first trained the network on around 400 images for 50k steps. I then continued training on a new dataset with the same classes, with these changes: the number of steps increased to 110k, two more data augmentation options added, dropout enabled, and the batch size increased from 32 to 64. The continued run started with these loss values:

Loss/localization loss = 1.148414
Loss/regularization loss = 3695957000.0
Loss/classification loss = 508.7694
Loss/total loss = 3695957500.0
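As a quick sanity check on these numbers, here is a minimal sketch that sums the three components and shows how much each one contributes. It assumes the total loss is simply the sum of the localization, regularization, and classification losses; that is my assumption, since the log does not say how the total is computed. The dictionary keys are just labels for illustration.

```python
# Loss values copied from the first logged step of the continued run.
losses = {
    "localization": 1.148414,
    "regularization": 3695957000.0,
    "classification": 508.7694,
}
reported_total = 3695957500.0

# Assumption: total loss = plain sum of the three components.
computed_total = sum(losses.values())

# Fraction of the total contributed by each component.
shares = {name: value / computed_total for name, value in losses.items()}

for name, share in shares.items():
    print(f"{name:>15}: {share:.6%} of total")
print(f"computed total: {computed_total:.1f} (reported: {reported_total:.1f})")
```

Under that assumption, the regularization term accounts for practically the entire total, so the huge total loss comes from the regularization loss rather than from the localization or classification losses.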
Several hundred steps have passed and the losses seem to be decreasing.
Should I be worried that training started with such a high loss?