I want to add another step here: after reading the images, the dataset should be split into the image (X) and the bounding box (Y) information as TensorFlow objects for ResNet50. Should I index through ds_train, or is there a better way? Thank you.
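
For context, the alternative to indexing that I have in mind is mapping a function over the dataset. This is only a rough sketch under assumptions: the stand-in ds_train below is made up, and the "image" and "bbox" key names, the 224x224 input size, and the split_xy helper are all hypothetical rather than taken from my actual pipeline.

```python
import tensorflow as tf

# Hypothetical stand-in for ds_train: each element is a dict holding an
# image and one bounding box. The key names "image" and "bbox" are assumptions.
images = tf.zeros([4, 64, 64, 3], dtype=tf.uint8)
bboxes = tf.constant([[0.1, 0.2, 0.5, 0.6]] * 4, dtype=tf.float32)
ds_train = tf.data.Dataset.from_tensor_slices({"image": images, "bbox": bboxes})


def split_xy(example):
    """Turn one dataset element into an (X, Y) pair."""
    # Resize to 224x224 (assumed input size); tf.image.resize returns float32.
    image = tf.image.resize(example["image"], (224, 224))
    # Apply the ResNet50-specific preprocessing that ships with Keras.
    image = tf.keras.applications.resnet50.preprocess_input(image)
    return image, example["bbox"]


# map() applies split_xy to every element lazily, so no manual indexing is needed.
ds_xy = ds_train.map(split_xy, num_parallel_calls=tf.data.AUTOTUNE).batch(2)

for x_batch, y_batch in ds_xy.take(1):
    print(x_batch.shape, y_batch.shape)
```

If this works as I expect, ds_xy yields (image, bbox) tuples that could be passed directly to model.fit, but I am not sure whether this is the recommended approach.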

