When loading images with image_dataset_from_directory, the image files and labels don’t seem to match

I’m using the image_dataset_from_directory method to load images from files, and I’ve prepared a label.csv file that gives a label for each image filename.

However, after my first training run, the model predicts the same class for every input. I checked the directory folder and can’t see why.

I want to check the image and label pairs for bugs, but I couldn’t find a simple way to do this. So I adapted a custom image_dataset_from_directory function that also returns image_paths, and compared those paths against the order of the labels I passed in:

def image_dataset_from_directory(directory,
                                 labels='inferred',
                                 label_mode='int',
                                 class_names=None,
                                 color_mode='rgb',
                                 batch_size=32,
                                 image_size=(256, 256),
                                 shuffle=True,
                                 seed=None,
                                 validation_split=None,
                                 subset=None,
                                 interpolation='bilinear',
                                 follow_links=False):
    if labels != 'inferred':
        if not isinstance(labels, (list, tuple)):
            raise ValueError(
                '`labels` argument should be a list/tuple of integer labels, of '
                'the same size as the number of image files in the target '
                'directory. If you wish to infer the labels from the subdirectory '
                'names in the target directory, pass `labels="inferred"`. '
                'If you wish to get a dataset that only contains images '
                '(no labels), pass `label_mode=None`.')
        if class_names:
            raise ValueError('You can only pass `class_names` if the labels are '
                             'inferred from the subdirectory names in the target '
                             'directory (`labels="inferred"`).')
    if label_mode not in {'int', 'categorical', 'binary', None}:
        raise ValueError(
            '`label_mode` argument must be one of "int", "categorical", "binary", '
            'or None. Received: %s' % (label_mode,))
    if color_mode == 'rgb':
        num_channels = 3
    elif color_mode == 'rgba':
        num_channels = 4
    elif color_mode == 'grayscale':
        num_channels = 1
    else:
        raise ValueError(
            '`color_mode` must be one of {"rbg", "rgba", "grayscale"}. '
            'Received: %s' % (color_mode,))
    interpolation = image_preprocessing.get_interpolation(interpolation)
    dataset_utils.check_validation_split_arg(
        validation_split, subset, shuffle, seed)

    if seed is None:
        seed = np.random.randint(1e6)
    image_paths, labels, class_names = dataset_utils.index_directory(
        directory,
        labels,
        formats=WHITELIST_FORMATS,
        class_names=class_names,
        shuffle=shuffle,
        seed=seed,
        follow_links=follow_links)

    if label_mode == 'binary' and len(class_names) != 2:
        raise ValueError(
            'When passing `label_mode="binary", there must exactly 2 classes. '
            'Found the following classes: %s' % (class_names,))

    image_paths, labels = dataset_utils.get_training_or_validation_split(
        image_paths, labels, validation_split, subset)

    dataset = paths_and_labels_to_dataset(
        image_paths=image_paths,
        image_size=image_size,
        num_channels=num_channels,
        labels=labels,
        label_mode=label_mode,
        num_classes=len(class_names),
        interpolation=interpolation)
    if shuffle:
        # Shuffle locally at each iteration
        dataset = dataset.shuffle(buffer_size=batch_size * 8, seed=seed)
    dataset = dataset.batch(batch_size)
    # Users may need to reference `class_names`.
    dataset.class_names = class_names
    # Custom change: also return the image paths.
    return dataset, image_paths

And this is the data loader:

train_dataset, train_path = image_dataset_from_directory(
    img_path,
    labels=labels,
    label_mode='binary',
    validation_split=0.2,
    color_mode='rgb',
    subset='training',
    image_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
    batch_size=BATCH_SIZE,
    seed=123
)

When I print the image and label pairs, they don’t match:

for i in range(len(train_path)):
    filename = train_path[i].split('/')[-1]
    print('File: {} label: {}'.format(filename, labels[i]))

It returns:

File: class_1_img.png label: 1
File: class_1_img.png label: 2
...

The mismatched image-label pairs make the training meaningless.
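One thing worth ruling out is that the check itself is misleading. With shuffle=True, the paths returned by the custom function have been shuffled and split inside it, while the outer labels list is still in its original order, so comparing the two by index may report mismatches that do not exist in the dataset. A minimal sketch of a check that looks up the expected label by filename instead is shown here. It assumes the custom function were also changed to return its internal labels (a hypothetical change), and the names check_pairs, expected_by_name and actual_labels are illustrative only.

```python
import os


def check_pairs(paths, expected_by_name, actual_labels):
    """Return (filename, expected, actual) for every pair that disagrees.

    paths            -- file paths in the order the loader uses them
    expected_by_name -- dict mapping filename -> label (e.g. built from label.csv)
    actual_labels    -- labels in the same order as `paths`
    """
    mismatches = []
    for path, actual in zip(paths, actual_labels):
        name = os.path.basename(path)
        expected = expected_by_name[name]
        if expected != actual:
            mismatches.append((name, expected, actual))
    return mismatches


# Toy data: the loader shuffled the two files and their labels together.
expected_by_name = {"a.png": 0, "b.png": 1}
shuffled_paths = ["data/b.png", "data/a.png"]
shuffled_labels = [1, 0]   # labels as shuffled alongside the paths
original_labels = [0, 1]   # the list as it was passed in

# Looking up by filename finds no problem ...
ok = check_pairs(shuffled_paths, expected_by_name, shuffled_labels)
# ... while pairing shuffled paths with the unshuffled list flags every file.
false_alarm = check_pairs(shuffled_paths, expected_by_name, original_labels)
print(ok, false_alarm)
```

If a filename-based check like this still reports mismatches, the label list passed to the loader is probably in a different order from the one in which the files are indexed.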

How can I load images from a directory and pass a label list so that the labels match the order of the images?
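One possible approach, sketched under assumptions: the Keras documentation says an explicit labels list should follow the alphanumeric order of the image file paths, so the list can be built by sorting the paths and looking each filename up in label.csv. The column names filename and label, the helper names, and the flat directory layout are assumptions made for illustration, and it is worth verifying that sorted() gives the same order the library uses for the actual files.

```python
import csv
import io
import os


def load_label_map(csv_file):
    """Read rows with (assumed) columns `filename` and `label` into a dict."""
    reader = csv.DictReader(csv_file)
    return {row["filename"]: int(row["label"]) for row in reader}


def labels_in_sorted_order(paths, label_by_name):
    """Sort the paths and return them with labels in the matching order."""
    ordered_paths = sorted(paths)
    ordered_labels = [label_by_name[os.path.basename(p)] for p in ordered_paths]
    return ordered_paths, ordered_labels


# Toy stand-ins for label.csv and the directory listing.
csv_text = "filename,label\na.png,0\nb.png,1\nc.png,1\n"
label_by_name = load_label_map(io.StringIO(csv_text))

paths = ["imgs/c.png", "imgs/a.png", "imgs/b.png"]
ordered_paths, ordered_labels = labels_in_sorted_order(paths, label_by_name)
print(ordered_paths, ordered_labels)
```

The resulting ordered_labels list is what would be passed as the labels argument. A missing filename raises a KeyError here, which makes gaps in the CSV visible early instead of silently shifting the remaining labels.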

submitted by /u/Laurence-Lin