multiclass image classification keras

In the previous blog, we discussed the binary classification problem, where each image belongs to one of two classes. In this blog, we will extend this to the multi-class classification problem, where each image is classified into exactly one of three or more classes. So, let’s get started.

Here, we will use the CIFAR-10 dataset, developed by the Canadian Institute for Advanced Research (CIFAR). The CIFAR-10 dataset consists of 60000 (32×32) color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The classes are completely mutually exclusive. Below are the classes in the dataset, as well as 10 random images from each class.
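As a quick reference, the ten classes and the dataset sizes mentioned above can be written down in code. This is only a small sketch: the list follows the label order documented for CIFAR-10 (label 0 is airplane, label 9 is truck), and the names `CLASS_NAMES` and `IMAGES_PER_CLASS` are illustrative, not part of Keras.

```python
# CIFAR-10 class names, indexed by their integer label (0-9).
# CLASS_NAMES is an illustrative name, not a library constant.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

IMAGES_PER_CLASS = 6000
total_images = len(CLASS_NAMES) * IMAGES_PER_CLASS  # 10 classes * 6000 images
train_images, test_images = 50000, 10000

# The train/test split accounts for every image in the dataset.
assert total_images == train_images + test_images

# Print the label -> name mapping.
for label, name in enumerate(CLASS_NAMES):
    print(label, name)
```

A list like this is handy later for turning a predicted label back into a readable class name.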

1. Load the Data

The CIFAR-10 dataset can be obtained in either of two ways:

Using Keras built-in datasets
From the official website

Method-1

Loading the data through the Keras built-in datasets is straightforward. The images come back already shaped as (num_images, 32, 32, 3), which is what a CNN expects as input, so a single call is enough. Note that the pixel values are still integers in the range 0 to 255, so they are not yet normalized.

from keras.datasets import cifar10

(train_x, train_y), (X_test, y_test) = cifar10.load_data()

Method-2

The data can also be downloaded from the official website. However, it does not come in a format that can be fed directly to the model. Let’s see how the dataset is arranged.

The training data is broken into 5 batch files (plus one test batch), which helps prevent your machine from running out of memory. Each file contains a dictionary with the data and the corresponding labels. The data is a 10000×3072 array, where 10000 is the number of images and 3072 is the number of pixel values per image (32×32×3). Within each row, the first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue, with each channel stored in row-major order. Each row therefore needs to be converted into a 32×32 color image with 3 channels.

Steps:

First, unpickle all the train and test files
Then convert the image format to (height x width x num_channel)

import pickle

# Helper for unpickling a single CIFAR-10 batch file
def unpickle(file):
    with open(file, 'rb') as fo:
        # encoding='bytes' means the dictionary keys are bytes, e.g. b'data'
        batch = pickle.load(fo, encoding='bytes')
    return batch

# Unpickle all the train and test files
train1 = unpickle('D:/downloads/CV/cifar10/cifar-10-batches-py/data_batch_1')
train2 = unpickle('D:/downloads/CV/cifar10/cifar-10-batches-py/data_batch_2')
train3 = unpickle('D:/downloads/CV/cifar10/cifar-10-batches-py/data_batch_3')
train4 = unpickle('D:/downloads/CV/cifar10/cifar-10-batches-py/data_batch_4')
train5 = unpickle('D:/downloads/CV/cifar10/cifar-10-batches-py/data_batch_5')
test = unpickle('D:/downloads/CV/cifar10/cifar-10-batches-py/test_batch')

Then concatenate all the unpickled train files into one array.

import numpy as np

# Stack the five training batches into single arrays
train_x = np.concatenate((train1[b'data'], train2[b'data'], train3[b'data'], train4[b'data'], train5[b'data']), axis=0)
train_y = np.concatenate((train1[b'labels'], train2[b'labels'], train3[b'labels'], train4[b'labels'], train5[b'labels']), axis=0)

# Convert each flat row to an image of shape (height, width, num_channel)
b = np.reshape(train_x, (50000, 3, 32, 32))
train_x = np.transpose(b, (0, 2, 3, 1))

# Normalize the pixel values to the range [0, 1]
train_x = train_x.astype('float32') / 255

# Give the labels shape (50000, 1), the same shape cifar10.load_data() returns
train_y = np.expand_dims(train_y, axis=-1)
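To see why the reshape goes through (3, 32, 32) before the transpose, here is a small sanity check on one synthetic row. It is only a sketch: the row is made up (constant values 10, 20 and 30 stand in for the red, green and blue planes), and no real CIFAR-10 file is read.

```python
import numpy as np

# Build one fake CIFAR-style row: 1024 red, then 1024 green, then 1024 blue.
# The values 10, 20, 30 are arbitrary; they only make the channels easy to tell apart.
row = np.concatenate([
    np.full(1024, 10, dtype=np.uint8),  # red plane
    np.full(1024, 20, dtype=np.uint8),  # green plane
    np.full(1024, 30, dtype=np.uint8),  # blue plane
])
batch = row[np.newaxis, :]              # shape (1, 3072), like a 1-image batch

planes = batch.reshape(-1, 3, 32, 32)   # (N, channels, height, width)
images = planes.transpose(0, 2, 3, 1)   # (N, height, width, channels)

print(images.shape)      # (1, 32, 32, 3)
print(images[0, 0, 0])   # [10 20 30] -> the (R, G, B) values of the top-left pixel
```

Reshaping the row directly to (1, 32, 32, 3) would not work here: the top-left pixel would read three consecutive red values (10, 10, 10) instead of one value from each channel.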

Split the data into train and validation

Because the training data contains the images in random order, a simple split is sufficient. Another way is to take some percentage of the images from each of the 5 train files to form the validation set.

x_train = train_x[:45000]
y_train = train_y[:45000]
val_x = train_x[45000:]
val_y = train_y[45000:]
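The other option mentioned above, taking a share of each train file for validation, could look roughly like the sketch below. The helper `split_per_batch` and its `val_fraction` parameter are hypothetical names, and the five batches are replaced by small random arrays so the example runs without the dataset.

```python
import numpy as np

def split_per_batch(batches_x, batches_y, val_fraction=0.1):
    """Hold out the last `val_fraction` of every batch for validation."""
    tr_x, tr_y, va_x, va_y = [], [], [], []
    for x, y in zip(batches_x, batches_y):
        n_val = round(len(x) * val_fraction)  # images held out from this batch
        cut = len(x) - n_val
        tr_x.append(x[:cut])
        tr_y.append(y[:cut])
        va_x.append(x[cut:])
        va_y.append(y[cut:])
    return (np.concatenate(tr_x), np.concatenate(tr_y),
            np.concatenate(va_x), np.concatenate(va_y))

# Synthetic stand-in for the 5 unpickled batches (100 images each instead of 10000).
rng = np.random.default_rng(0)
fake_x = [rng.integers(0, 256, size=(100, 3072), dtype=np.uint8) for _ in range(5)]
fake_y = [rng.integers(0, 10, size=100) for _ in range(5)]

x_tr, y_tr, x_va, y_va = split_per_batch(fake_x, fake_y, val_fraction=0.1)
print(x_tr.shape, x_va.shape)  # (450, 3072) (50, 3072)
```

With the real data, the two lists would hold the `b'data'` and `b'labels'` entries of the five unpickled files, and a fraction of 0.1 would again give 45000 training and 5000 validation images.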

To check that this split gives a roughly uniform proportion of examples for each class, we can plot the count of each class in the validation set. Below is the bar plot. It looks like all the classes are about evenly represented in the validation set.
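The per-class counts behind such a plot can be computed with NumPy. The following is a minimal sketch, assuming the labels are integers from 0 to 9; `class_counts` is a hypothetical helper, and random labels stand in for `val_y` so the example runs on its own.

```python
import numpy as np

def class_counts(labels, num_classes=10):
    """Return how many examples each class has, indexed by class label."""
    labels = np.asarray(labels).ravel()  # accepts shape (N,) or (N, 1)
    return np.bincount(labels, minlength=num_classes)

# Synthetic labels shaped like val_y, i.e. (5000, 1); replace with the real val_y.
rng = np.random.default_rng(0)
fake_val_y = rng.integers(0, 10, size=(5000, 1))

counts = class_counts(fake_val_y)
print(counts)        # one count per class
print(counts.sum())  # 5000
```

If matplotlib is available, passing these counts to `plt.bar(range(10), counts)` would draw the bar plot.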

Model Architecture

Since the images contain diverse information, we will need a bigger network. The bigger the network, the greater the chance of overfitting, so we may need to apply some regularization techniques to prevent this. The model below uses batch normalization and dropout for this purpose.

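from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, MaxPool2D, Dropout, Flatten, Dense

model = Sequential()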
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model.add(BatchNormalization())
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D((2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D((2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPool2D((2, 2)))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
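To get a feel for how the feature maps shrink through the network above, the spatial sizes can be worked out by hand. The sketch below does that arithmetic in plain Python, without Keras. It assumes the Keras defaults for the layers as written: 'same' padding with stride 1 keeps the size, and each 2×2 max pooling halves it. The function name `max_pool` is illustrative.

```python
def max_pool(size, pool=2):
    """Output size of a max pooling layer with stride equal to the pool size."""
    return size // pool

size = 32               # input images are 32x32
# Conv layers use padding='same' with stride 1, so they leave `size` unchanged.
size = max_pool(size)   # after the first conv block  -> 16
size = max_pool(size)   # after the second conv block -> 8
size = max_pool(size)   # after the third conv block  -> 4

flat_features = size * size * 128                # Flatten() output: 4 * 4 * 128
dense_256_params = flat_features * 256 + 256     # weights + biases of Dense(256)
dense_10_params = 256 * 10 + 10                  # weights + biases of Dense(10)

print(size, flat_features)                # 4 2048
print(dense_256_params, dense_10_params)  # 524544 2570
```

Calling `model.summary()` on the compiled model is an easy way to compare these numbers against what Keras reports.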
