ImageDataGenerator | TheAILearner

ImageDataGenerator – flow method

In the previous blog, we discussed how to apply different transformations to augment data using the Keras ImageDataGenerator class. In this blog, we will learn how to generate batches of augmented data. This is done using the flow method, which creates an iterator that yields the batches of data. Let’s first discuss the API of the ImageDataGenerator flow method and then see how to use it.

Keras API

flow(x, y=None, batch_size=32, shuffle=True, sample_weight=None, seed=None, save_to_dir=None, save_prefix='', save_format='png', subset=None)

Here, x is a NumPy array of rank 4 with shape (samples, height, width, channels) in the default channels_last format, and y holds the corresponding labels. For greyscale images, channels must be equal to 1.

One can also save the augmented images to the disk by specifying the “save_to_dir” argument. You can also select which format to save the image files and what prefix to use, using the “save_format” and “save_prefix” arguments respectively.

For instance, the code below saves each augmented image to the downloads folder as a JPEG file with a name such as “aug_0_2345”.

data_generator = datagen.flow(img, save_to_dir='D:/downloads/', save_format='jpeg', save_prefix='aug')

Another useful option is the “sample_weight” argument, which assigns a weight to each sample. While calculating the loss, each sample's contribution (and hence its influence on the gradient) is scaled by its weight. The weights array must have the same length as the input array. If sample_weight is not None, the weights are returned unchanged alongside each batch.
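To make the effect of sample weights concrete, here is a minimal NumPy sketch of a weighted batch loss. It is only an illustration: the loss values and weights are made up, and the exact reduction Keras applies internally may differ.

```python
import numpy as np

# Hypothetical per-sample losses for a batch of four images
per_sample_loss = np.array([0.2, 1.5, 0.7, 0.1])
# One weight per sample, same length as the input array
sample_weight = np.array([1.0, 2.0, 1.0, 0.5])

# Each sample's contribution to the batch loss is scaled by its weight
weighted_loss = float(np.mean(per_sample_loss * sample_weight))
unweighted_loss = float(np.mean(per_sample_loss))
print(unweighted_loss, weighted_loss)
```

The second sample has the largest loss and the largest weight, so it dominates the weighted result.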

“subset” decides whether the generated data is for training or validation. This works as follows:

First, the split index is determined from the input length and the validation_split argument of the ImageDataGenerator, as shown

split_idx = int(len(x) * image_data_generator._validation_split)

Now, if subset is ‘validation’, then the data is split as

x = x[:split_idx]

The rest of the data, x[split_idx:], is reserved for training. As we can see, the split is a straight cut: the first n examples go to validation and the rest to training. So, if the data is not properly shuffled beforehand, the training and validation sets may end up containing different classes.
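The straight split described above can be reproduced with plain NumPy. The sketch below is an illustration, not the Keras code itself; split_like_flow is a hypothetical helper and the data is a dummy array with labels sorted by class.

```python
import numpy as np

def split_like_flow(x, y, validation_split):
    """Mimic the straight split described above (illustrative helper)."""
    split_idx = int(len(x) * validation_split)
    val = (x[:split_idx], y[:split_idx])      # first n samples
    train = (x[split_idx:], y[split_idx:])    # everything else
    return train, val

# Ten dummy samples whose labels are sorted by class: 0,0,0,0,0,1,1,1,1,1
x = np.arange(10).reshape(10, 1)
y = np.array([0] * 5 + [1] * 5)

(_, y_tr), (_, y_val) = split_like_flow(x, y, 0.2)
print(np.unique(y_val))  # only class 0 ends up in validation

# Shuffling x and y with the same permutation before the split mixes the classes
perm = np.random.default_rng(0).permutation(len(x))
(_, y_tr_s), (_, y_val_s) = split_like_flow(x[perm], y[perm], 0.2)
```

With sorted labels the validation set sees a single class, which is exactly the situation the shuffle is meant to prevent.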

Note: for the test set, set shuffle to False so that the predictions stay in the same order as the labels. Also choose the batch size carefully: it should divide the test set size exactly, so that no example is left out or predicted more than once.
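One way to pick such a batch size is to list the exact divisors of the test set size. The helper below is a hypothetical convenience function written for this post, not part of Keras.

```python
def exact_batch_sizes(n_samples, max_batch=256):
    """Batch sizes up to max_batch that divide n_samples exactly."""
    return [b for b in range(1, max_batch + 1) if n_samples % b == 0]

n_test = 10000                       # size of the MNIST test set
candidates = exact_batch_sizes(n_test)
batch_size = candidates[-1]          # largest exact divisor up to max_batch
steps = n_test // batch_size         # batches needed to see every sample once
print(batch_size, steps)
```

Any value from candidates works; larger ones simply mean fewer prediction steps.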

Now, you might have got some idea about the flow method arguments. Next, let’s see how this method works.

How does the flow method work?

Firstly, this generates random parameters for a transformation using the “get_random_transform” method.
Then these transformations are applied using the “apply_transform” method.
Finally, the image is standardized using the “standardize” method.
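The three steps above can be sketched in NumPy as follows. The functions are hypothetical stand-ins named after the Keras methods; they only handle a horizontal flip and a rescale, so treat this as an outline of the control flow rather than the real implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def get_random_transform_sketch(horizontal_flip=True):
    # Step 1: draw the random parameters (here only a flip decision)
    return {"flip_horizontal": bool(horizontal_flip and rng.random() < 0.5)}

def apply_transform_sketch(img, params):
    # Step 2: apply the transformation described by the parameters
    return img[:, ::-1, :] if params["flip_horizontal"] else img

def standardize_sketch(img, rescale=1 / 255.0):
    # Step 3: standardize the result (here only rescaling)
    return img * rescale

# Stand-in for one greyscale image with values in [0, 255]
img = rng.integers(0, 256, size=(28, 28, 1)).astype(float)
params = get_random_transform_sketch()
out = standardize_sketch(apply_transform_sketch(img, params))
print(params, out.shape)
```

Keeping the parameter draw separate from its application is what lets the same random transform be reused, for example on an image and its mask.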

How to use?

Let’s take the MNIST digit classification example. First, load the required libraries and the data.

1. Load Libraries and Data

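from keras.layers import Dense, Flatten, Conv2D, MaxPool2D
from keras.models import Sequential
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator  # needed for the augmentation step
import numpy as np
import matplotlib.pyplot as plt

# MNIST images are (28, 28); add a channel axis so they become (28, 28, 1)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)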

2. Build model

model = Sequential()
model.add(Conv2D(32,(3,3),activation='relu',input_shape=(28,28,1)))
model.add(MaxPool2D((2,2)))
model.add(Conv2D(64,(3,3),activation='relu'))
model.add(MaxPool2D((2,2)))
model.add(Flatten())
model.add(Dense(512,activation='relu'))
model.add(Dense(10,activation='softmax'))

model.compile(loss='sparse_categorical_crossentropy', optimizer='Adam', metrics=['accuracy'])

3. Data Augmentation

Create an ImageDataGenerator instance with the set of transformations you want to perform. If you augment with transformations such as rotation or cropping, it is better to create a separate generator for the validation set, because validation data should be kept fixed. In that case, don’t use the validation_split argument; instead, split the data yourself, for instance with train_test_split. Here we only rescale, so a single generator with validation_split is fine.

datagen = ImageDataGenerator(rescale=1/255.,validation_split=0.2)

4. flow method

Based on the validation split argument in the above code, we create a separate training and validation generator using the “subset” argument.

training_generator = datagen.flow(x_train, y_train, batch_size=64,subset='training',seed=7)
validation_generator = datagen.flow(x_train, y_train, batch_size=64,subset='validation',seed=7)

5. Visualize the training generator

Let’s plot the first image from each of 6 batches.

plt.figure(figsize=(10,5))
for i in range(6):
    plt.subplot(2,3,i+1)
    # Take one batch and show its first image; the generator has already rescaled it to [0, 1]
    x, y = next(training_generator)
    plt.imshow(x[0].reshape(28,28),cmap='gray')
    plt.title('y={}'.format(y[0]))
    plt.axis('off')
plt.tight_layout()
plt.show()

6. Train model

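# Step counts must be integers; newer Keras versions deprecate fit_generator and accept the generator in model.fit()
history = model.fit_generator(training_generator, steps_per_epoch=int(len(x_train)*0.8)//64, epochs=10, validation_data=validation_generator, validation_steps=int(len(x_train)*0.2)//64)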
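The step counts in the training call follow directly from the split sizes. Here is a small sketch of that arithmetic, assuming the 60,000 MNIST training images, validation_split=0.2, and a batch size of 64 as used in this post.

```python
n_samples, validation_split, batch_size = 60000, 0.2, 64

# Same split index as the one computed inside flow()
split_idx = int(n_samples * validation_split)
n_val, n_train = split_idx, n_samples - split_idx

# Integer division drops the last partial batch of each epoch
steps_per_epoch = n_train // batch_size
validation_steps = n_val // batch_size
print(n_train, n_val, steps_per_epoch, validation_steps)
```

Because 12,000 is not a multiple of 64, the validation pass with these settings skips the final partial batch; rounding up instead would include it.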

Similarly, you can create a test generator and evaluate the performance of the model on the test set. This is how you can use the flow method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Data Augmentation with Keras ImageDataGenerator

One of the methods to prevent overfitting is to have more data. This way, the model is exposed to more variations of the data and thus generalizes better. To get more data, you can either collect it manually or generate it from the existing data by applying some transformations. The latter method is known as Data Augmentation.

In this blog, we will learn how to perform data augmentation using the Keras ImageDataGenerator class. First, we will discuss the Keras image augmentation API and then we will learn how to use it.

Keras API

ImageDataGenerator(featurewise_center=False, samplewise_center=False, featurewise_std_normalization=False, samplewise_std_normalization=False, zca_whitening=False, zca_epsilon=1e-06, rotation_range=0, width_shift_range=0.0, height_shift_range=0.0, brightness_range=None, shear_range=0.0, zoom_range=0.0, channel_shift_range=0.0, fill_mode='nearest', cval=0.0, horizontal_flip=False, vertical_flip=False, rescale=None, preprocessing_function=None, data_format=None, validation_split=0.0, dtype=None)

Let’s understand each of its arguments in detail using the following image

featurewise_center: “Feature-wise” means computed over the entire dataset. Here, we first calculate the mean over the entire dataset and then subtract this mean from each image, which shifts the mean of the data close to zero. To calculate the mean, you need to fit the data generator to the training data as

datagen = ImageDataGenerator(featurewise_center=True)
datagen.fit(x_train)

For this, you have to load the entire training dataset, which may exhaust your memory if the dataset is large. To avoid this, you can calculate the mean from a smaller sample.

featurewise_std_normalization: Here, we divide each image by the standard deviation of the entire dataset. Together, featurewise_center and featurewise_std_normalization are known as standardization: they give the data a mean of 0 and a standard deviation of 1. Note that this does not by itself make the data Gaussian.
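Feature-wise standardization can be sketched with NumPy as below. This is a simplified illustration using random stand-in data and per-channel statistics, not the Keras implementation; the small constant in the denominator is only there to avoid division by zero.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for x_train: 100 RGB images of size 8x8 with values in [0, 255]
x_train = rng.uniform(0, 255, size=(100, 8, 8, 3))

# Statistics computed once over the whole dataset, one value per channel
mean = x_train.mean(axis=(0, 1, 2))
std = x_train.std(axis=(0, 1, 2))

# Every image is standardized with the same dataset-wide statistics
x_std = (x_train - mean) / (std + 1e-6)
print(x_std.mean(axis=(0, 1, 2)), x_std.std(axis=(0, 1, 2)))
```

The printed means are close to 0 and the standard deviations close to 1, while the shape of the distribution (uniform here) is unchanged.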

samplewise_center: “Sample-wise” means computed per image. Here, we set the mean pixel value of each image to zero. Since the image mean is a local statistic that can be calculated from the image itself, there is no need to call the fit method.

samplewise_std_normalization: Here, we divide each input image by its own standard deviation.
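The sample-wise variants need nothing but the image itself. A minimal NumPy sketch, using a random stand-in image and a small constant to avoid division by zero:

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.uniform(0, 255, size=(8, 8, 3))  # a single stand-in image

# Statistics come from the image itself, so no fit() step is needed
centered = img - img.mean()
normalized = centered / (img.std() + 1e-6)
print(normalized.mean(), normalized.std())
```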

zca_whitening: This is a preprocessing method that removes redundancy (correlation between pixels) from the data while, unlike PCA whitening, keeping the structure of the image intact. In short, this strengthens the high-frequency components in the image. For the maths behind this, refer to this StackOverflow question. You need to fit the generator on the training data to calculate the principal components. This should be used with featurewise_center=True; otherwise, Keras gives you a warning and automatically sets featurewise_center=True.
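For readers who want to see the computation, here is a compact NumPy sketch of ZCA whitening on flattened data. The data is a small random stand-in with correlated features, and the variable names are chosen for this example; it shows the general technique and is not a copy of the Keras code.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for flattened images: 500 samples with 4 correlated features
mixing = np.array([[2.0, 0.5, 0.0, 0.0],
                   [0.0, 1.0, 0.3, 0.0],
                   [0.0, 0.0, 1.5, 0.2],
                   [0.0, 0.0, 0.0, 1.0]])
flat_x = rng.normal(size=(500, 4)) @ mixing
flat_x -= flat_x.mean(axis=0)                       # featurewise centering first

sigma = flat_x.T @ flat_x / flat_x.shape[0]         # covariance matrix
u, s, _ = np.linalg.svd(sigma)                      # principal directions and variances
zca_epsilon = 1e-6
whitening = (u / np.sqrt(s + zca_epsilon)) @ u.T    # rotate, rescale, rotate back

white_x = flat_x @ whitening
cov_white = white_x.T @ white_x / white_x.shape[0]  # close to the identity matrix
print(np.round(cov_white, 3))
```

The final rotation back with u.T is what distinguishes ZCA from PCA whitening: the features are decorrelated but stay in the original pixel coordinates.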

Note: For featurewise_center, featurewise_std_normalization, zca_whitening, one must fit the data to calculate the mean, standard deviation, and principal components.

rotation_range: This rotates each image by a random angle of up to the specified number of degrees, in either direction. The figure below shows the rotations by 45 degrees

width_shift_range: This results in shifting the image in the horizontal direction.

If it is a float less than 1, the image is shifted by up to that fraction of its width. For instance, 0.2 means a horizontal shift of up to 20% of the image width.
If it is an integer >=1, the image is shifted horizontally by a whole number of pixels from the open interval (-num, num). For instance, 3 means the shift is selected from [-2,-1,0,1,2]. So, the image may be shifted by 2, 1, or 0 pixels in either direction.
If it is a 1D array, the shift is selected at random from the elements of the array.

height_shift_range: Similar to width_shift_range but in the vertical direction.
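The three ways of specifying a shift range can be summarized in a short sketch. sample_shift is a hypothetical helper written for this post; it follows the rules listed above but is not the Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_shift(shift_range, img_size):
    """Draw one shift in pixels (illustrative helper, not the Keras code)."""
    if isinstance(shift_range, float):
        shift = rng.uniform(-shift_range, shift_range)
        # Below 1 the value is a fraction of the image size, otherwise pixels
        return shift * img_size if shift_range < 1 else shift
    if isinstance(shift_range, int):
        # Integer n: whole pixels from the open interval (-n, n)
        return int(rng.integers(-shift_range + 1, shift_range))
    # 1D array-like: pick one of the listed values
    return float(rng.choice(np.asarray(shift_range)))

print(sample_shift(0.2, 100))          # between -20 and 20 pixels
print(sample_shift(3, 100))            # one of -2, -1, 0, 1, 2
print(sample_shift([-5, 0, 5], 100))   # one of -5, 0, 5
```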

brightness_range: This produces images that look as if they were taken under different lighting conditions. You pass the minimum and maximum of a range, and a brightness factor is picked from it for each image. Values <1 darken the image, values >1 brighten it, and 1 means no change. For example, the line below darkens the image as shown

datagen = ImageDataGenerator(brightness_range=[0.2,0.8])

rescale: This normalizes the pixel values to a specific range. For an 8-bit image, we generally rescale by 1/255 so that the pixel values lie between 0 and 1.

shear_range: This is the shear angle in the counter-clockwise direction in degrees.

zoom_range: This zooms the image. If passed as float then [lower, upper] = [1-zoom_range, 1+zoom_range]. For instance, 0.2 means zoom in the range [0.8, 1.2]. Can also be passed a list directly.
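The two accepted forms of zoom_range can be written out as a tiny helper. zoom_bounds is a hypothetical name used only for this illustration.

```python
def zoom_bounds(zoom_range):
    """Return the [lower, upper] zoom factors implied by zoom_range."""
    if isinstance(zoom_range, (int, float)):
        return [1 - zoom_range, 1 + zoom_range]
    lower, upper = zoom_range  # a two-element list is used as given
    return [lower, upper]

print(zoom_bounds(0.2))         # → [0.8, 1.2]
print(zoom_bounds([0.5, 1.0]))  # → [0.5, 1.0]
```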

channel_shift_range: This randomly shifts the values of the channels by an amount within the specified range. The code below sums up what this actually does.

[np.clip(x_channel + np.random.uniform(-value, value), min_img, max_img) for x_channel in img]

It adds a random value to each channel and then clips the result to the minimum and maximum of the image.
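A runnable version of the one-liner above might look like this. The image is a random stand-in in channels-last layout, and value plays the role of channel_shift_range; both are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(8, 8, 3))   # stand-in channels-last image
value = 30.0                                 # plays the role of channel_shift_range

min_img, max_img = img.min(), img.max()
# Move the channel axis to the front so each channel can be shifted in turn
channels = np.moveaxis(img, -1, 0)
shifted = np.stack(
    [np.clip(c + rng.uniform(-value, value), min_img, max_img) for c in channels],
    axis=-1,
)
print(shifted.shape)
```

Clipping to the original minimum and maximum keeps the shifted image within the value range of the input.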

horizontal_flip and vertical_flip: Randomly flip the input image in the horizontal and vertical directions respectively.

data_format: Either channels_first or channels_last (default).

preprocessing_function: This function is applied to each input after the augmentation step. It should take a single image (a rank 3 NumPy array) and return an array of the same shape. Below is an example of one such function, which blurs the images

import cv2  # OpenCV, required for cv2.blur

def blur(img):
    # Smooth the image with a 5x5 box filter
    return cv2.blur(img, (5,5))

datagen = ImageDataGenerator(preprocessing_function=blur)

How to use this?

Below is the code I used to generate the above images

import numpy as np
import matplotlib.pyplot as plt
import keras
from keras.preprocessing.image import load_img, ImageDataGenerator, img_to_array

# Load the image and change it into an array and expand the dimensions
img = load_img('D:/downloads/opencv_logo.PNG')
img = img_to_array(img)
img1 = np.expand_dims(img, axis=0)

# create an instance of the class with the desired operation
datagen = ImageDataGenerator(horizontal_flip=True)

# Depending on the augmentation method you may need to call
# fit method to calculate the global statistics
data_generator = datagen.flow(img1,batch_size=1)

# Display some augmented samples
plt.figure(figsize=(10,5))
for i in range(6):
    plt.subplot(2,3,i+1)
    # Each call yields one batch holding a single augmented image
    x = next(data_generator)
    plt.imshow(x[0]/255.)
    plt.xticks([])
    plt.yticks([])
plt.tight_layout()
plt.show()

This way you can create augmented examples. In the next blog, we will discuss how to generate batches of augmented data using the flow method.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.
