In this blog, we will discuss how to create custom callbacks in Keras. This is quite simple: you create a class that takes keras.callbacks.Callback as its base class. The set of methods that we can override is fixed; we only need to write the logic. Let’s understand this with the help of an example. Here, we will create a callback that stops the training once the accuracy passes a threshold and prints a message.
class myCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        if (logs or {}).get('acc', 0) > 0.99:
            self.model.stop_training = True
            print('Stopped training as accuracy above threshold')

callbacks1 = myCallback()
In “Line-1”, we create a class “myCallback” that takes keras.callbacks.Callback as its base class.
In “Line-2”, we define a method “on_epoch_end”. Note that the names of the methods we can use are predefined according to their functionality. For instance, if we define a method named “on_epoch_end”, Keras calls it at the end of every epoch. If you rename it to “on_epoch_end1”, Keras will never call it.
Below are some of the method names that we can use. These methods are aptly named according to their functionality, and the arguments they take are fixed.
The epoch and batch arguments refer to the current epoch and batch number, and “logs” is a dictionary that records training metrics such as “loss” and “acc”.
In “Line-3,4”, we define a stopping condition and, if it is met, stop training the model. Note that we can access the model being trained through self.model, an attribute provided by the base class. So we can also use other model methods and properties such as save_weights, save, trainable, etc.
Finally, we create an instance of this class and pass it in a list to the callbacks argument of the fit() method. Below is the output of applying the above callback.
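To make the naming rule concrete, here is a minimal sketch in plain Python. The `Callback` class and `toy_fit` loop below are stand-ins written for illustration, not the Keras implementation; only the six hook names and their arguments follow the Keras version this post targets.

```python
class Callback:
    """Stand-in base class: every hook is a no-op unless a subclass overrides it."""
    def on_train_begin(self, logs=None): pass
    def on_train_end(self, logs=None): pass
    def on_epoch_begin(self, epoch, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass
    def on_batch_begin(self, batch, logs=None): pass
    def on_batch_end(self, batch, logs=None): pass


class Recorder(Callback):
    def __init__(self):
        self.calls = []

    def on_epoch_end(self, epoch, logs=None):
        self.calls.append(("on_epoch_end", epoch))

    def on_epoch_end1(self, epoch, logs=None):
        # Not one of the predefined names, so the loop below never calls it.
        self.calls.append(("on_epoch_end1", epoch))


def toy_fit(callback, epochs=2, batches=3):
    """Toy loop (hypothetical) that fires the hooks in the order a training run would."""
    callback.on_train_begin({})
    for epoch in range(epochs):
        callback.on_epoch_begin(epoch, {})
        for batch in range(batches):
            callback.on_batch_begin(batch, {})
            callback.on_batch_end(batch, {"loss": 0.0})
        callback.on_epoch_end(epoch, {"loss": 0.0})
    callback.on_train_end({})


recorder = Recorder()
toy_fit(recorder)
print(recorder.calls)
```

Only the correctly named hook shows up in the recorded calls; the misspelled one is silently ignored, which is why a typo in a hook name is easy to miss.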
Below is another example that saves the weights at the end of each epoch.
As we already know, neural networks suffer greatly from the problem of plateaus, which substantially slows down the training process. Thus, selecting a good learning rate becomes a really challenging task. Many methods have been proposed to counter this problem, such as cyclic learning rates, where we vary the learning rate between reasonable boundary values, or Stochastic Gradient Descent with Warm Restarts.
Keras also provides the ReduceLROnPlateau callback, which reduces the learning rate by some factor whenever learning stagnates. The idea is that a model trapped in a plateau region sometimes benefits from a lower learning rate. So, let’s discuss its Keras API.
This callback “monitors” a quantity, and if no improvement is seen for a “patience” number of epochs, the learning rate is multiplied by the “factor” specified (new_lr = lr * factor). What counts as an improvement is set by the “min_delta” argument: a change in the monitored quantity smaller than min_delta is treated as no improvement. You can also choose whether to start evaluating the new LR immediately or to give the optimizer some epochs to settle at the new LR before monitoring resumes. This is done using the “cooldown” argument. Finally, you can set a lower bound on the LR using the “min_lr” argument. No matter how many epochs or what reduction factor you use, the LR will never drop below “min_lr”.
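The rule described above can be traced on a list of losses without training anything. The function below is a simplified sketch of that rule for a quantity that should decrease (such as a loss); the function name and the sample numbers are made up for illustration, and it is not the Keras source.

```python
def simulate_reduce_lr(losses, lr=0.1, factor=0.5, patience=2,
                       min_delta=1e-4, cooldown=0, min_lr=0.0):
    """Return the learning rate in effect after each epoch's loss is observed."""
    best = float("inf")
    wait = 0                # epochs without improvement
    cooldown_counter = 0    # epochs left before monitoring resumes
    history = []
    for loss in losses:
        if cooldown_counter > 0:
            cooldown_counter -= 1
            wait = 0
        if loss < best - min_delta:          # improved by more than min_delta
            best = loss
            wait = 0
        elif cooldown_counter == 0:
            wait += 1
            if wait >= patience and lr > min_lr:
                lr = max(lr * factor, min_lr)   # never go below min_lr
                cooldown_counter = cooldown
                wait = 0
        history.append(lr)
    return history


# Loss improves once, then stays flat for the rest of the run.
losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]
lr_history = simulate_reduce_lr(losses, lr=0.1, factor=0.5, patience=2, min_lr=0.02)
print(lr_history)
```

With these settings the rate is halved every two stagnant epochs until it hits the `min_lr` floor, after which it stays there.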
Now, let’s see how to use this.
# Load data, preprocessing and build model
...
# First, create an instance of this ReduceLROnPlateau class
Keras provides several builtin callbacks that serve our purpose in most cases. But let’s say we want to stop training when the accuracy has reached a benchmark, or save the model at each batch. These tasks cannot be achieved using the builtin callbacks. In that case, we need to create our own callback. In Keras, we can easily create custom callbacks using keras.callbacks.Callback as our base class. But for that, you need to create a class and write some amount of code.
As an alternative, Keras also gives us the option to create simple, custom callbacks on the fly. This can be done with ease using the LambdaCallback. So, in this blog, let’s discuss how to use this callback.
Note: From Python 3.8 onwards, lambda functions support assignment expressions (the := operator), which makes this callback even more powerful.
Here, all the arguments are aptly named according to what they do. For instance, in the “on_epoch_end” argument, we pass the function that will be called at the end of each epoch.
Now, all of these arguments expect functions with fixed positional arguments which are mentioned below.
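As a quick reference, the epoch hooks take (epoch, logs), the batch hooks take (batch, logs), and the train hooks take only (logs). The sketch below writes lambdas with those signatures and calls them by hand so that it runs without Keras; the `hooks` dictionary and the log values are made up for illustration.

```python
seen = []

hooks = {
    "on_train_begin": lambda logs: seen.append("train begin"),
    "on_epoch_begin": lambda epoch, logs: seen.append(f"epoch {epoch} begin"),
    "on_batch_end": lambda batch, logs: seen.append(f"batch {batch} loss={logs['loss']}"),
    "on_epoch_end": lambda epoch, logs: seen.append(f"epoch {epoch} acc={logs['acc']}"),
    "on_train_end": lambda logs: seen.append("train end"),
}
# With Keras installed, the same functions would be passed as keyword arguments:
# keras.callbacks.LambdaCallback(**hooks)

# Called by hand here, in the order a one-epoch, one-batch run would call them.
hooks["on_train_begin"]({})
hooks["on_epoch_begin"](0, {})
hooks["on_batch_end"](0, {"loss": 0.5})
hooks["on_epoch_end"](0, {"acc": 0.9})
hooks["on_train_end"]({})
print(seen)
```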
Now, let’s see how to use this callback. Below I created a simple callback that saves the model weights when the accuracy is beyond some limit.
In this blog, we will discuss the Keras CSVLogger callback. As is clear from the name, it streams training metrics such as ‘loss’ and ‘acc’ to a CSV file. Using this, you can export all the values that can be represented as a string. So, let’s discuss its Keras API.
Here, “filename” is the name of the CSV file where you want to keep the record. You can also choose how elements are separated in the file by passing a string in the “separator” argument.
You can also choose whether to append the training history to an existing file or overwrite it. If “append=False”, an existing file is overwritten. Otherwise, the new information is appended to the existing file without affecting what is already stored there.
If no existing file is present, a new file is created and the information is written to it. Now, let’s see how to use this class.
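To see what the separator and append options mean for the resulting file, here is a small stand-in built on Python's csv module. The `log_epochs` helper and the metric values are hypothetical; the sketch only imitates the file layout described above and does not use the callback itself.

```python
import csv
import os
import tempfile


def log_epochs(filename, rows, separator=",", append=False):
    """Write one row per epoch, either overwriting the file or appending to it."""
    write_header = not (append and os.path.exists(filename))
    mode = "a" if append else "w"
    with open(filename, mode, newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["epoch", "acc", "loss"],
                                delimiter=separator)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)


path = os.path.join(tempfile.mkdtemp(), "training.log")
first_run = [{"epoch": 0, "acc": 0.81, "loss": 0.52},
             {"epoch": 1, "acc": 0.88, "loss": 0.35}]
second_run = [{"epoch": 0, "acc": 0.90, "loss": 0.30}]

# First run creates the file; second run appends to it.
log_epochs(path, first_run, separator=";")
log_epochs(path, second_run, separator=";", append=True)
with open(path, newline="") as f:
    appended = list(csv.reader(f, delimiter=";"))

# Writing again with append=False discards the earlier history.
log_epochs(path, second_run, separator=";", append=False)
with open(path, newline="") as f:
    overwritten = list(csv.reader(f, delimiter=";"))

print(len(appended), len(overwritten))
```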
# Load data, preprocessing and build model
...
# First, create an instance of this CSVLogger class
In neural networks, setting a good learning rate is always a challenging task. If the learning rate is set too high, the loss can diverge, or the model can converge too quickly to a sub-optimal value. If it is set too low, the training process may take a long time. Thus, it often proves useful to decay the learning rate as the training progresses. This can be done using learning rate schedules or adaptive learning rate methods like Adagrad, RMSprop, Adam, etc. In this blog, we will only discuss learning rate schedules.
Learning rate schedules, as is clear from the name, adjust the learning rate based on some schedule, for instance time decay or exponential decay. To implement these decays, Keras provides a callback known as LearningRateScheduler that adjusts the learning rate based on the decay function provided. So, let’s discuss its Keras API.
Here, “schedule” is a decay function that takes the epoch number and the current learning rate as input and returns the new learning rate. The verbose argument controls whether a message is printed each time the learning rate is updated.
Note: This method overwrites the learning rate of the optimizer used.
Now, let’s discuss the schedule argument in more detail. Here, we will use the time decay function, which divides the learning rate by a term that grows with the epoch number.
Now, let’s see how to use this with the LearningRateScheduler callback. First, create a function that takes the epoch and the learning rate as arguments, as shown below.
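Written out, the time decay rule implemented by the time_decay function in this post is the following (the symbol names simply mirror the variable names used in that function):

```latex
\[
\text{new\_lrate} = \frac{\text{initial\_lrate}}{1 + \text{decay\_rate} \times \text{epoch}}
\]
```

For example, with a rate of 0.1 passed in, decay_rate = 0.01 and epoch = 10, the function returns 0.1 / 1.1 ≈ 0.0909.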
def time_decay(epoch, initial_lrate):
    decay_rate = 0.01
    new_lrate = initial_lrate / (1 + decay_rate * epoch)
    return new_lrate
Then pass this function in the LearningRateScheduler callback as shown below
from keras.callbacks import LearningRateScheduler
lrate = LearningRateScheduler(time_decay, verbose=1)
Now, simply pass this callback as a list in the fit() method as shown below.
record = model.fit(..., callbacks=[lrate], ...)
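One detail is worth checking before training. Since the schedule receives the current learning rate, the argument named initial_lrate is really the rate left over from the previous epoch, so the decay compounds from epoch to epoch. The sketch below mimics that feedback loop in plain Python; `run_scheduler` and `make_time_decay` are hypothetical helpers for illustration, not part of Keras.

```python
def time_decay(epoch, lrate):
    decay_rate = 0.01
    return lrate / (1 + decay_rate * epoch)


def run_scheduler(schedule, initial_lr, epochs):
    """Mimic the callback: feed the current rate in, keep the returned rate."""
    lr, history = initial_lr, []
    for epoch in range(epochs):
        lr = schedule(epoch, lr)
        history.append(lr)
    return history


def make_time_decay(initial_lr, decay_rate=0.01):
    """Variant that always decays from the starting rate and ignores the current one."""
    def schedule(epoch, current_lr):
        return initial_lr / (1 + decay_rate * epoch)
    return schedule


compounded = run_scheduler(time_decay, 0.1, 4)
from_initial = run_scheduler(make_time_decay(0.1), 0.1, 4)
print(compounded)
print(from_initial)
```

Both variants are legitimate schedules; the closure version is the one that matches the textbook time decay formula, while the compounded one decays faster.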
You can also check how the learning rate varies by the following command. This returns a list of learning rates over the epochs.
record.history['lr']
You can also plot the learning rate over epochs using any plotting library.
Similarly, you can use other decays like step decay, exponential decay or any custom decay. Just create a function and pass it to the LearningRateScheduler callback. Hope you enjoy reading.
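As a sketch of what such functions might look like, here are a step decay and an exponential decay written with the same (epoch, lr) signature. The parameter names and default values (initial_lr, drop, epochs_drop, k) are assumptions chosen for illustration; both functions compute the rate from a fixed starting value and ignore the current rate passed in.

```python
import math


def step_decay(epoch, lr, initial_lr=0.1, drop=0.5, epochs_drop=10):
    """Multiply the starting rate by `drop` once every `epochs_drop` epochs."""
    return initial_lr * drop ** math.floor(epoch / epochs_drop)


def exp_decay(epoch, lr, initial_lr=0.1, k=0.1):
    """Shrink the starting rate exponentially with the epoch number."""
    return initial_lr * math.exp(-k * epoch)


for epoch in (0, 10, 25):
    print(epoch, step_decay(epoch, None), exp_decay(epoch, None))
```

Because the extra parameters have defaults, either function can be passed directly as the schedule argument.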
If you have any doubts or suggestions, please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.
In this blog, we will discuss the Keras ProgbarLogger callback. As is clear from the name, it handles the progress bar that we usually see during the fit() method, depending upon the verbosity argument. So, let’s first discuss its API.
Here, the “count_mode” argument controls whether the progress bar displays the samples seen or the steps. It takes one of two values: ‘samples’ or ‘steps’. If set to ‘steps’, make sure that you provide the “steps_per_epoch” argument in the fit() method; otherwise, this will raise an error. The ‘steps’ mode is mainly used with generator-based training such as fit_generator.
Below is the figure where first I’ve trained on 5000 samples and the count_mode argument is set to “samples“. For the second one, I’ve used 12 steps in the fit_generator for 1 epoch. The count_mode argument is set to “steps“.
The second argument, “stateful_metrics”, controls whether to display the average value of the specified metric or its value at the last step of every epoch. It should be passed as an iterable such as a list. For more details, refer to Keras callbacks BaseLogger, where we have discussed this argument in detail.
This callback, in turn, calls the Keras Progbar class that controls how to display the progress bar like the width of the progress bar, its update interval, etc. Now, let’s see how to use this.
# Load data, preprocessing and build model
...
# First, create an instance of this ProgbarLogger class
In this blog, we will discuss the Keras TerminateOnNaN callback. As is clear from the name, it terminates the training when a NaN loss is encountered. Below is the Keras API for this callback.
keras.callbacks.TerminateOnNaN()
This checks the loss at the end of every batch, and if that loss is NaN or inf, the callback stops the training. It also prints the batch number at which training stopped. Something like this will be printed.
Below is the code, taken from Keras that shows how this works.
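The check itself is small. The function below is a plain-Python stand-in for the logic described above, not the Keras source: it walks over a list of per-batch losses and stops at the first one that is NaN or inf. The function name and the sample losses are made up, and the printed message is only modelled on the kind of line the callback reports.

```python
import math


def first_invalid_batch(batch_losses):
    """Return the index of the first batch whose loss is NaN or inf, else None."""
    for batch, loss in enumerate(batch_losses):
        if math.isnan(loss) or math.isinf(loss):
            print("Batch %d: Invalid loss, terminating training" % batch)
            return batch
    return None


stopped_at = first_invalid_batch([0.9, 0.7, float("nan"), 0.5])
never_stopped = first_invalid_batch([0.9, 0.7, 0.5])
```

In the real callback, the equivalent of returning early is setting stop_training to True on the model.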
As we already know, the values of metrics such as loss and acc that we get during training are averaged over the epoch. This averaging is automatically applied to every Keras model by the BaseLogger class in the Keras callbacks module. This class also gives us the flexibility of not averaging selected metrics over an epoch. So, in this blog, let’s discuss this BaseLogger class in more detail.
Keras API
keras.callbacks.BaseLogger(stateful_metrics=None)
Similar to the History callback, this callback is automatically applied to every Keras model with its default arguments. You only need to pass it explicitly in the fit() method, as we did with other callbacks, if you want to change the arguments.
Here, stateful_metrics are the names of the metrics that you don’t want to average over an epoch. The metric names should be passed as an iterable such as a list. For instance, stateful_metrics=['acc', 'loss'].
The value of each stateful metric will be saved as-is in on_epoch_end. In other words, its value in the last batch before the epoch ends will be saved as the final value. All the remaining metrics will be averaged over the epoch. Let’s take a look at the image below.
Here, I trained the model for 1 epoch on the MNIST data. The stateful_metrics used is ‘acc’. Clearly, the final logged accuracy (see record.history) corresponds to the last batch accuracy (0.9392) and not the average accuracy over the epoch (0.9405). Hope this makes everything clear. Let’s see how to apply the BaseLogger callback.
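The difference between the two behaviours is easy to see on a handful of numbers. The sketch below is a simplified stand-in, not the Keras source: ordinary metrics are averaged over the epoch weighted by batch size, while stateful ones keep the value from the last batch. The `epoch_summary` helper and the batch values are made up for illustration.

```python
def epoch_summary(batch_logs, stateful_metrics=()):
    """batch_logs: one dict per batch with a 'size' key plus metric values."""
    totals, seen = {}, 0
    for logs in batch_logs:
        size = logs["size"]
        seen += size
        for name, value in logs.items():
            if name == "size":
                continue
            if name in stateful_metrics:
                totals[name] = value                        # keep the latest value as-is
            else:
                totals[name] = totals.get(name, 0.0) + value * size
    return {name: (total if name in stateful_metrics else total / seen)
            for name, total in totals.items()}


batches = [
    {"size": 32, "loss": 0.8, "acc": 0.70},
    {"size": 32, "loss": 0.6, "acc": 0.80},
    {"size": 16, "loss": 0.4, "acc": 0.90},
]
averaged = epoch_summary(batches)
stateful = epoch_summary(batches, stateful_metrics=["acc"])
print(averaged)
print(stateful)
```

In the first call both metrics are averaged; in the second, the accuracy is simply the last batch's value while the loss is still averaged.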
# Load data, preprocessing and build model
...
# First, create an instance of this BaseLogger class
from keras.callbacks import BaseLogger
call = BaseLogger(stateful_metrics=['acc'])
# Then pass this as a list in the fit() method
record = model.fit(..., callbacks=[call], ...)
That’s all for this blog. Hope you enjoy reading.
In the previous blog, we discussed the binary classification problem, where each image belongs to one of two classes. In this blog, we will extend this to the multi-class classification problem, where we classify each image into one of three or more classes. So, let’s get started.
Here, we will use the CIFAR-10 dataset, developed by the Canadian Institute for Advanced Research (CIFAR). The CIFAR-10 dataset consists of 60000 (32×32) color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The classes are completely mutually exclusive. Below are the classes in the dataset, as well as 10 random images from each class.
1. Load the Data
CIFAR-10 dataset can be downloaded by using any of the two methods:
Using Keras builtin datasets
From the official website
Method-1
Downloading using the Keras builtin datasets is pretty straightforward and simple. It’s already transformed into the shape appropriate for the CNN input. No headache, just write one line of code and you are done.
Method-2
The data can also be downloaded from the official website. However, it is not in a format that can be fed directly to the model. Let’s see how the dataset is arranged.
The training data is broken into 5 files (with a separate test file) so as to prevent your machine from running out of memory. Each file contains a dictionary of data and the corresponding labels. Data is a 10000×3072 array, where 10000 is the number of images and 3072 is the number of pixel values per image, stored channel by channel in row-major order. So, the first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. You need to convert each row into a (32, 32, 3) color image.
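The layout is easier to follow on a tiny example. The sketch below uses only the standard library and a synthetic one-image batch instead of a real download; the file name `data_batch_fake` and the pixel values are made up, and `to_hwc` is a hypothetical helper that rearranges a single row by hand (the post itself does the same thing for the whole array with NumPy).

```python
import os
import pickle
import tempfile


def unpickle(path):
    """Load one batch file; with encoding='bytes' the keys come back as bytes."""
    with open(path, "rb") as f:
        return pickle.load(f, encoding="bytes")


def to_hwc(row, size=32):
    """Turn one flat row (red plane, then green, then blue) into rows x cols x 3."""
    plane = size * size
    red, green, blue = row[:plane], row[plane:2 * plane], row[2 * plane:]
    return [[[red[r * size + c], green[r * size + c], blue[r * size + c]]
             for c in range(size)] for r in range(size)]


# Synthetic stand-in for a real batch: one image with red=1, green=2, blue=3.
fake_batch = {b"data": [[1] * 1024 + [2] * 1024 + [3] * 1024], b"labels": [7]}
path = os.path.join(tempfile.mkdtemp(), "data_batch_fake")
with open(path, "wb") as f:
    pickle.dump(fake_batch, f)

batch = unpickle(path)
image = to_hwc(batch[b"data"][0])
print(len(image), len(image[0]), image[0][0])
```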
Steps:
First, unpickle all the train and test files
Then convert the image format to (height x width x num_channel)
import numpy as np
# convert the image format to (height x width x num_channel)
b = np.reshape(train_x, (50000, 3, 32, 32))
train_x = np.transpose(b, (0, 2, 3, 1))
# Normalize the data between 0 and 1
train_x = train_x.astype('float32') / 255
# Give the labels shape (num_images, 1)
train_y = np.expand_dims(train_y, axis=-1)
Split the data into train and validation
Because the training data contains images in random order, a simple split is sufficient. Another way is to take some percentage of images from each of the 5 train files to constitute a validation set.
x_train=train_x[:45000]
y_train=train_y[:45000]
val_x=train_x[45000:]
val_y=train_y[45000:]
To make sure that this split gives a roughly uniform proportion of examples for each class, we can plot the counts of each class in the validation dataset. Below is the bar plot. It looks like all the classes are uniformly distributed in the validation set.
Model Architecture
Since the images contain a diverse amount of information, we will need a bigger network. The bigger the network, the higher the chance of overfitting, so we may need to apply some regularization techniques to prevent this.
In the previous blogs, we discussed binary and multi-class classification problems. The two are quite similar: the basic assumption underlying both is that each image can contain only one class. For instance, for the dogs vs cats classification, it was assumed that the image can contain either a cat or a dog but not both. So, in this blog, we will discuss the case where more than one class can be present in a single image. This type of classification is known as multi-label classification. The picture below explains this concept beautifully.
Some of the most common techniques for solving multi-label classification problems are
Problem Transformation
Adapted Algorithm
Ensemble approaches
Here, we will only discuss Binary Relevance, a method that falls under the Problem Transformation category. If you are curious about other methods, you can read this amazing review paper.
In binary relevance, we break the problem into a number of binary classification problems: for each available class, we ask whether it is present in the image or not. As we already know, binary classification uses ‘sigmoid’ as the last layer activation function and ‘binary_crossentropy’ as the loss function, so we will use the same here. Everything else stays the same.
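A small numeric sketch may help here. Each label gets its own sigmoid output, so the probabilities do not have to sum to 1, and the loss is the mean of the per-label binary cross-entropy terms. The genre names, raw scores and targets below are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of the per-label binary cross-entropy terms."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)      # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


logits = [2.0, -1.0, 0.5]    # hypothetical raw scores for [Comedy, Horror, Romance]
y_true = [1, 0, 1]           # the poster is Comedy and Romance, not Horror
probs = [sigmoid(z) for z in logits]
loss = binary_crossentropy(y_true, probs)
print(probs, sum(probs), loss)
```

Here two labels can both receive a high probability at once, which a softmax output would not allow.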
Now, let’s take a dataset and see how to implement multi-label classification.
Problem Definition
Here, we will take a very common problem: movie genre classification based on poster images. A movie can belong to more than one genre (for instance, comedy and romance), so this is a multi-label classification problem.
Dataset
You can download the original dataset from here. This contains two files.
Movie_Poster_Dataset.zip – The poster images
Movie_Poster_Metadata.zip – Metadata of each poster image like ID, genres, box office, etc.
To prepare the dataset, we need images and corresponding genre information. For this, we need to extract the genre information from the Movie_Poster_Metadata.zip file corresponding to each poster image. Let’s see how to do this.
Note: This dataset contains some missing items. For instance, check the “1982” folder in Movie_Poster_Dataset.zip and Movie_Poster_Metadata.zip. The poster image or the corresponding genre information is missing for some movies. So, we need to perform EDA and remove these entries.
Steps to perform EDA:
First, we will extract the movie name and corresponding genre information from the Movie_Poster_Metadata.zip file and create a Pandas dataframe using these.
Then we will loop over the poster images in the Movie_Poster_Dataset.zip file and check if it is present in the dataframe created above. If the poster is not present, we will remove that movie from the dataframe.
These two steps will ensure that we are only left with movies that have poster images and genre information. Below is the code for this.
The metadata files do not all use the same encoding, which is why there are two for loops. Below are the steps performed in the code.
First, open the metadata file
Read line by line
Extract the information corresponding to the ‘Genre’ and ‘imdbID’
Append them into the list and create a dataframe
import pandas as pd

def read_metadata(year, encoding=None):
    """Return (poster filenames, genre lists) parsed from one year's metadata file."""
    names, genres = [], []
    path = 'D:/downloads/Movie_Poster_Dataset/groundtruth/{}.txt'.format(year)
    with open(path, mode="r", encoding=encoding) as f:
        for line in f:
            line = line.rstrip('\n')
            if 'imdbID' in line:
                # Keep the text after the first ':' and strip quotes and commas
                _, _, value = line.partition(':')
                names.append(value.lstrip(' "').rstrip('",\n') + '.jpg')
            if "Genre" in line:
                _, _, value = line.partition(':')
                value = value.lstrip(' "').rstrip('",\n')
                genres.append([g.strip() for g in value.split(',')])
    return names, genres

b = []   # list of genres for each movie
b2 = []  # poster filename for each movie
# The 1980 and 1981 files are read with the default encoding
for i in range(1980, 1982):
    names, genres = read_metadata(i)
    b2.extend(names)
    b.extend(genres)
# The 1982 to 2015 files are encoded as UTF-16-LE
for i in range(1982, 2016):
    names, genres = read_metadata(i, encoding='utf-16-le')
    b2.extend(names)
    b.extend(genres)
# 'name' holds the poster filename, 'filename' holds the list of genres
data = pd.DataFrame({'name': b2, 'filename': b})
Now for the second step, we first append all the poster images filenames in the list.
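A minimal sketch of this second step is given below. It assumes the posters sit in per-year sub-folders under one root directory; the folder layout, the IDs and the `list_posters` helper are made up for illustration, and plain dictionaries stand in for the dataframe rows so that the example runs without pandas.

```python
import os
import tempfile


def list_posters(root):
    """Collect the names of all .jpg files found under root (any sub-folder)."""
    found = set()
    for _, _, files in os.walk(root):
        found.update(name for name in files if name.lower().endswith(".jpg"))
    return found


# Tiny synthetic layout standing in for the extracted poster folders.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "1980"))
open(os.path.join(root, "1980", "tt0000001.jpg"), "w").close()

records = [
    {"name": "tt0000001.jpg", "filename": ["Comedy"]},
    {"name": "tt0000002.jpg", "filename": ["Drama"]},   # no poster on disk
]
posters = list_posters(root)
kept = [r for r in records if r["name"] in posters]
print(kept)

# With the pandas dataframe built earlier, the same filter would be:
# data = data[data['name'].isin(posters)]
```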
So, finally, we are ready with our cleaned dataset with 8052 images containing overall 25 classes. The dataframe is shown below.
One can also convert this dataframe into the common format as shown below
This can be done using the following code.
# The 'filename' column holds the list of genres for each movie
for idx, row in df.iterrows():
    for genre in row.filename:
        df.loc[idx, genre] = '1'
df.fillna('0', inplace=True)
In this post, we will be using Format 1. You can use any. Here, we will be using the Keras flow_from_dataframe method. For this, we need to place all the images under one directory. Currently, all the images are in separate folders such as 1980, 1981, etc. Below is the code that places all the poster images in a single folder ‘original_train‘.
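One way to do this with the standard library is sketched below. The `flatten_posters` helper, the temporary folders and the file names are hypothetical; only the target folder name ‘original_train’ comes from the text. The destination should live outside the source tree so that the walk does not revisit copied files.

```python
import os
import shutil
import tempfile


def flatten_posters(src_root, dst_dir):
    """Copy every .jpg under src_root into dst_dir and return how many were copied."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = 0
    for folder, _, files in os.walk(src_root):
        for name in files:
            if name.lower().endswith(".jpg"):
                shutil.copy(os.path.join(folder, name), os.path.join(dst_dir, name))
                copied += 1
    return copied


# Synthetic stand-in for the per-year folders (1980, 1981, ...).
src = tempfile.mkdtemp()
for year, name in [("1980", "tt0000001.jpg"), ("1981", "tt0000002.jpg")]:
    os.makedirs(os.path.join(src, year), exist_ok=True)
    open(os.path.join(src, year, name), "w").close()

dst = os.path.join(tempfile.mkdtemp(), "original_train")
count = flatten_posters(src, dst)
print(count, sorted(os.listdir(dst)))
```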
Here, I’ve used both to show how accuracy instantly reaches 90+ from the starting epoch and thus is not a correct metric.
flow_from_dataframe()
Here, I split the data into training and validation sets using the validation_split argument of ImageDataGenerator. You can read more about the ImageDataGenerator here.
See how accuracy reaches 90+ within a few epochs. As stated earlier, this is not a good evaluation metric for multi-label classification. On the other hand, top_k_categorical_accuracy shows us the true picture.
Clearly, we are doing a pretty decent job, considering that the training data is small and the problem is complex (25 classes). Moreover, some classes, such as comedy, dominate the training data. Play with the model architecture and other hyperparameters and check how the accuracy varies.
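A quick calculation shows why plain accuracy looks so good here. With 25 labels and only a few of them active per poster, a model that predicts no genre at all already agrees with the target on most labels. The label indices below are made up for illustration.

```python
num_classes = 25
true_labels = {3, 7, 12}                  # hypothetical poster with three genres
y_true = [1 if i in true_labels else 0 for i in range(num_classes)]
y_pred = [0] * num_classes                # a useless model that never predicts any genre

matches = sum(t == p for t, p in zip(y_true, y_pred))
label_wise_accuracy = matches / num_classes
print(label_wise_accuracy)
```

The score is high even though not a single genre was identified, so it says little about how well the model finds the genres that are actually present.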
Prediction time
For each image, let’s predict the top three predicted classes. Below is the code for this.
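Picking the top three classes amounts to sorting the predicted probabilities. The sketch below does this in plain Python; the shortened class list and the probability values are made up and only stand in for one row of model output.

```python
def top_k_labels(probabilities, class_names, k=3):
    """Return the names of the k classes with the highest predicted probability."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [class_names[i] for i in ranked[:k]]


class_names = ["Action", "Comedy", "Crime", "Drama", "Romance"]   # shortened, hypothetical
probabilities = [0.71, 0.64, 0.30, 0.12, 0.05]                    # made-up model outputs
top3 = top_k_labels(probabilities, class_names)
print(top3)
```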
You can see that our model is doing a decent job considering the complexity of the problem.
Let’s try another example “tt0465602.jpg“. For this the predicted labels are
By looking at the poster most of us will predict the labels as predicted by our algorithm. Actually, these are pretty close to the true labels that are [Action, Comedy, Crime].
That’s all for multi-label classification problem. Hope you enjoy reading.