Hello, I am training a segmentation model following this link. My problem is binary segmentation: 0 for the background class and 1 for the object of interest. I noticed that the predictions are better when Sparse Categorical Cross-Entropy (CE) loss is used than when Binary Cross-Entropy (BCE) is used. Can anyone explain why this might be the case? I was under the impression that BCE is better suited to binary segmentation tasks. Can you also recommend a better evaluation metric than accuracy? I am currently using mean IoU.
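For reference, here is a minimal sketch of how I understand mean IoU for my two-class case, and why I don't trust plain accuracy. It is plain Python on flattened masks, purely for illustration; the function names `class_iou` and `mean_iou` and the toy 4x4 masks are my own assumptions, not something from the tutorial or any library.

```python
def class_iou(y_true, y_pred, cls):
    """IoU for one class over two flat sequences of integer labels."""
    pairs = list(zip(y_true, y_pred))
    inter = sum(1 for t, p in pairs if t == cls and p == cls)
    union = sum(1 for t, p in pairs if t == cls or p == cls)
    # A class absent from both masks has no defined IoU; signal that with None.
    return inter / union if union else None


def mean_iou(y_true, y_pred, classes=(0, 1)):
    """Average the per-class IoU over the classes that actually occur.

    Assumes at least one of the classes occurs in the masks.
    """
    scores = [class_iou(y_true, y_pred, c) for c in classes]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores)


# Toy 4x4 masks, flattened row by row: the object covers 2 pixels,
# and the prediction finds only 1 of them.
y_true = [0, 0, 0, 0,
          0, 1, 1, 0,
          0, 0, 0, 0,
          0, 0, 0, 0]
y_pred = [0, 0, 0, 0,
          0, 1, 0, 0,
          0, 0, 0, 0,
          0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
object_iou = class_iou(y_true, y_pred, 1)
miou = mean_iou(y_true, y_pred)

print(f"accuracy={accuracy:.4f}  object IoU={object_iou:.4f}  mean IoU={miou:.4f}")
```

In this toy case the accuracy looks high because most pixels are background, while the IoU of the object class shows that half of the object was missed. That is the kind of gap I would like the metric to expose.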