Keras ImageDataGenerator Normalization at validation and test time

Note: This post should not be confused with test-time augmentation (TTA).

In the previous blogs, we discussed the operations that the ImageDataGenerator class offers for image augmentation, such as rotation, translation, zoom, shearing, and normalization. Applying them during training exposes the model to more variations of the data, which helps it generalize better.

But what about validation and prediction time? Since both are used to evaluate the model, we want them to be deterministic, which is why we don’t apply any random transformation to the validation and test data. However, the test and dev sets should still come from the same distribution as the train set. In other words, they should be normalized using the statistics calculated on the train set.
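As a rough illustration of that last point, here is a minimal NumPy sketch (not the Keras implementation) of feature-wise normalization in which the per-channel mean and standard deviation come from the training images only. The array shapes, the random data, and the helper name normalize are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: channels-last images with pixel values in [0, 255]
x_train = rng.uniform(0, 255, size=(100, 8, 8, 3)).astype("float32")
x_val = rng.uniform(0, 255, size=(20, 8, 8, 3)).astype("float32")

# Per-channel statistics computed on the training set only
train_mean = x_train.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 3)
train_std = x_train.std(axis=(0, 1, 2), keepdims=True)


def normalize(x, mean, std, eps=1e-6):
    """Return a standardized copy of x using the supplied statistics."""
    return (x - mean) / (std + eps)


x_train_norm = normalize(x_train, train_mean, train_std)
# The validation data reuses the train statistics, not its own
x_val_norm = normalize(x_val, train_mean, train_std)

print(x_train_norm.mean(axis=(0, 1, 2)))  # close to 0 for every channel
print(x_val_norm.mean(axis=(0, 1, 2)))    # generally not exactly 0
```

The validation mean does not have to come out as exactly zero; what matters is that both sets pass through the same transformation.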

In Keras, this normalization is done by the ImageDataGenerator class. So, in this blog, we will discuss how to normalize the data at validation and prediction time using that class.

Method-1

We create a separate ImageDataGenerator instance and then fit it on the train data as shown below.

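# Assumes TensorFlow's bundled Keras; with standalone Keras, import from keras.preprocessing.image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(featurewise_center=True,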
                                  featurewise_std_normalization=True,
                                  rotation_range=40,
                                  width_shift_range=0.2,
                                  zoom_range=0.2,
                                  horizontal_flip=True)

# Fit the train_datagen to calculate the train data statistics
train_datagen.fit(x_train)

# Create a separate ImageDataGenerator instance
validation_datagen = ImageDataGenerator(featurewise_center=True,
                                        featurewise_std_normalization=True)

# Fit the validation_datagen on the train data 
validation_datagen.fit(x_train)

# Use any of the flow methods. Here I have used flow()
training_generator = train_datagen.flow(x_train, y_train, batch_size=64, seed=7)
validation_generator = validation_datagen.flow(x_val, y_val, batch_size=64, seed=7)

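# Note: in TensorFlow 2.1 and later, fit_generator is deprecated and model.fit accepts generators directly
history = model.fit_generator(training_generator, steps_per_epoch=64, epochs=10, validation_data=validation_generator, validation_steps=32)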

We can do the same for the test set. The drawback is that every additional generator has to be fitted on the train data again, which is time-consuming for large datasets.

Method-2

We use the “standardize” method provided by the ImageDataGenerator class. As already discussed, “standardize” normalizes a batch of inputs in place, which makes it well suited to this task. You can read more about normalization here.

train_datagen = ImageDataGenerator(featurewise_center=True,
                                  featurewise_std_normalization=True,
                                  rotation_range=40,
                                  width_shift_range=0.2,
                                  zoom_range=0.2,
                                  horizontal_flip=True)

# Fit the train_datagen to calculate the train data statistics
train_datagen.fit(x_train)

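# Apply the standardize method. It works in place, so x_val must be a float array
# (for integer pixels, use x_val = train_datagen.standardize(x_val.astype('float32')))
train_datagen.standardize(x_val)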

training_generator = train_datagen.flow(x_train, y_train, batch_size=64, seed=7)

history = model.fit_generator(training_generator,steps_per_epoch=64, epochs=10, validation_data=(x_val,y_val))
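Because the method works in place, two details are easy to miss: the array you pass in is itself overwritten, and an integer array cannot hold the result. The sketch below uses a small NumPy stand-in, a hypothetical function named standardize_inplace, rather than Keras itself, so it only mimics the in-place arithmetic described above. The mean and std values are made up.

```python
import numpy as np


def standardize_inplace(x, mean, std, eps=1e-6):
    """Hypothetical stand-in: subtract mean and divide by std in place."""
    x -= mean
    x /= (std + eps)
    return x


# Made-up statistics standing in for values learned from the train set
mean = np.array([100.0], dtype="float32")
std = np.array([50.0], dtype="float32")

# Float input: the caller's array is overwritten.
x_val = np.array([50.0, 100.0, 150.0], dtype="float32")
out = standardize_inplace(x_val, mean, std)
print(out is x_val)  # same object, no copy was made
print(x_val)         # roughly [-1, 0, 1]

# Integer input (e.g. raw uint8 pixels): NumPy rejects the in-place float arithmetic.
x_raw = np.array([50, 100, 150], dtype="uint8")
try:
    standardize_inplace(x_raw, mean, std)
    failed = False
except TypeError:
    failed = True
print(failed)

# Casting to float first avoids the error and leaves x_raw untouched.
x_ok = standardize_inplace(x_raw.astype("float32"), mean, std)
```

If the raw validation images are needed later, for plotting for instance, it may be safer to standardize a copy.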

Method-3

This is similar to the above method but more explicit: we read the mean and the standard deviation from the fitted generator and apply the normalization ourselves.

train_datagen = ImageDataGenerator(featurewise_center=True,
                                  featurewise_std_normalization=True,
                                  rotation_range=40,
                                  width_shift_range=0.2,
                                  zoom_range=0.2,
                                  horizontal_flip=True)

# Fit the train_datagen to calculate the train data statistics
train_datagen.fit(x_train)

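# Apply the desired normalization (x_val must be a float array for these in-place operations)
x_val -= train_datagen.mean
# The small epsilon guards against division by zero, as the standardize method does
x_val /= (train_datagen.std + 1e-6)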

training_generator = train_datagen.flow(x_train, y_train, batch_size=64, seed=7)

history = model.fit_generator(training_generator,steps_per_epoch=64, epochs=10, validation_data=(x_val,y_val))

I hope you now have some idea of how to apply normalization at prediction time. Hope you enjoyed reading.

If you have any doubts or suggestions, please feel free to ask and I will do my best to help or improve myself. Goodbye until next time.
