Keras is a high-level API that can run on top of TensorFlow, CNTK, and Theano. Keras is preferable because it is easy and fast to learn. In this blog, we will learn about a set of functions called callbacks, which are used during training in Keras.
Callbacks provide some advantages over plain training in Keras. Here I will explain the important ones.
A callback can terminate training when a NaN loss occurs.
A callback can save the model after every epoch; you can also save only the best model.
Early stopping: a callback can stop training when accuracy stops improving.
Terminate the training when a NaN loss occurs
Let's see how to terminate training when a NaN loss occurs:
keras.callbacks.TerminateOnNaN()
Saving Model using Callbacks
To save the model after every epoch in Keras, we need to import ModelCheckpoint from keras.callbacks. Let's see the code below, which saves the model whenever the validation loss decreases.
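A minimal sketch of such a checkpoint setup (the filepath and the monitored metric below are illustrative):

```python
from keras.callbacks import ModelCheckpoint

# Save a checkpoint only when the validation loss decreases.
# "best_model.hdf5" is an illustrative filepath; a fixed name
# means each new best model overwrites the previous one.
checkpoint = ModelCheckpoint('best_model.hdf5',
                             monitor='val_loss',
                             verbose=1,
                             save_best_only=True,
                             mode='min')
callbacks_list = [checkpoint]

# Then pass the list into model.fit() via the callbacks parameter:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=10, callbacks=callbacks_list)
```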
In the above code, we first create a ModelCheckpoint object by passing its required parameters.
"filepath" defines the path where the checkpoints will be saved. If you want to keep only the best model, pass a fixed filepath such as "best_model.hdf5", which will overwrite the previously saved checkpoint.
"monitor" decides which quantity to monitor while training.
"save_best_only" saves a checkpoint only when the monitored quantity improves.
"mode" is one of {auto, min, max}: with max, the model is saved when the monitored quantity reaches a new maximum; with min, when it reaches a new minimum; with auto, the direction is inferred from the name of the monitored quantity.
Then finally, make a callbacks list and pass it into model.fit() via the callbacks parameter.
Early Stopping
Callbacks can stop training when a monitored quantity has stopped improving. Let's see how:
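A sketch of an EarlyStopping callback (the monitored metric and patience value below are illustrative):

```python
from keras.callbacks import EarlyStopping

# Stop training when the monitored quantity has not improved
# for `patience` consecutive epochs.
early_stop = EarlyStopping(monitor='val_acc',
                           min_delta=0,
                           patience=3,
                           mode='max')

# Pass it to training just like any other callback:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=50, callbacks=[early_stop])
```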
A screensaver is a computer program that fills the screen with anything you wish when the computer is left idle for some time. Most of you might have used a screensaver on your laptops, TV etc. In the good old days, they used to fascinate most of us. In this blog, we will be creating a bouncing ball screensaver using OpenCV-Python.
Task:
Create a window that we can write text on. If nothing is typed for 10 seconds, the screensaver will start.
For this we need to do two things:
First, we need to check whether a key is pressed within the specified time. Here, I have used 10 seconds.
Second, create a bouncing ball screensaver and display it only if no key is pressed in the specified time, otherwise, display the original screen.
The first part can be done using the OpenCV cv2.waitKey() function which waits for a specific time for a key press (See here for more details).
For the second part, we first need to create a bouncing ball screensaver. The main idea is to change the sign of increment (dx and dy in the code below) on collision with the boundaries. This can be done using the following code
import cv2
import numpy as np

def screensaver():
    img = np.zeros((480, 640, 3), dtype='uint8')
    dx, dy = 1, 1
    x, y = 100, 100
    while True:
        # Display the image
        cv2.imshow('a', img)
        k = cv2.waitKey(10)
        img = np.zeros((480, 640, 3), dtype='uint8')
        # Increment the position
        x = x + dx
        y = y + dy
        cv2.circle(img, (x, y), 20, (255, 0, 0), -1)
        if k != -1:
            break
        # Change the sign of increment on collision with the boundary
        if y >= 480 or y <= 0:
            dy *= -1
        if x >= 640 or x <= 0:
            dx *= -1
    cv2.destroyAllWindows()
The snapshot of the screensaver looks like this
Now, we need to integrate this screensaver function with the cv2.waitKey() function as shown in the code below
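A sketch of the integration, assuming the screensaver() function defined above; the window name, text position, and 10-second wait are illustrative:

```python
import cv2
import numpy as np

# screensaver() is the function defined earlier in this post

img = np.zeros((480, 640, 3), dtype='uint8')
text = ''
while True:
    cv2.imshow('a', img)
    # Wait up to 10 seconds (10000 ms) for a key press
    k = cv2.waitKey(10000)
    if k == -1:
        # No key pressed within 10 seconds: start the screensaver
        screensaver()
    elif k == 27:
        # Esc quits the program
        break
    else:
        # Otherwise draw the typed character on the window
        text += chr(k)
        cv2.putText(img, text, (50, 240), cv2.FONT_HERSHEY_SIMPLEX,
                    1, (255, 255, 255), 2)
cv2.destroyAllWindows()
```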
Screen time of an actor in a movie or an episode is very important. Many actors get paid according to their total screen time. Moreover, we also want to know how much time our favorite character acted on screen. So, have you ever wondered how you can calculate the total screen time of an actor? One plausible answer is with deep learning.
With the advancement of deep learning, it is now possible to solve various difficult problems. In this blog, we will learn how to use transfer learning and image classification concepts of deep learning to calculate the screen time of an actor.
To solve any problem with deep learning, the first requirement is the data. For this tutorial, we will use a video clip from the famous TV show “Friends”. We are going to calculate the screen time of my favorite character “Ross”.
Creating Dataset
First, we need to get a video. To do this, I have downloaded a video from YouTube using the pytube library. For more understanding of pytube, you can follow this blog or use the following code to get started.
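A minimal sketch using pytube (the video link below is a placeholder):

```python
from pytube import YouTube

# Placeholder link: substitute the URL of the clip you want
video_link = 'https://www.youtube.com/watch?v=...'

vid = YouTube(video_link)
stream = vid.streams.first()
# Saves the video into the current working directory
stream.download()
```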
Now we have our data in the form of a video, which is nothing but a group of frames (images). Since we are going to solve this problem using image classification, we need to extract the images from this video. For this task, I have used OpenCV as shown below.
import cv2

# Opens the video file
cap = cv2.VideoCapture('Friends - Unagi.mp4')
i = 0
image_folder = 'img'
while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imwrite(image_folder + '/' + str(i) + '.jpg', frame)
    i += 1
cap.release()
cv2.destroyAllWindows()
The video is now converted into individual frames. In this problem, there are two classes: "Ross" and "No Ross". To create a dataset, we need to separate the images into these two classes manually. For this, I have created a folder named "data" that has two sub-folders, "ross" and "no_ross", and then manually added images to them. After creating the dataset, we are ready to dive into the code and concepts.
Input Data and Preprocessing
We have our data in the form of images. To prepare this data as input to our neural network, we need to do some preprocessing with the following steps:
Read all images one by one using OpenCV
Resize each image to (224, 224, 3) for the input to the model
Divide the data by 255 to bring the input features of the neural network into the same range
Since we have only 6,814 images, it will be difficult to train a neural network on such a small dataset. Here comes the concept of transfer learning.
With the help of transfer learning, we can reuse the features generated by a model trained on a large dataset in our own model. Here we will use the VGG16 model trained on the "imagenet" dataset. For this, we are using Keras, TensorFlow's high-level API. With Keras, you can directly import the VGG16 model as shown in the code below.
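A sketch of loading VGG16 via Keras (the imagenet weights are downloaded on first use):

```python
from keras.applications.vgg16 import VGG16

# Load VGG16 with imagenet weights, without the fully connected top layers
vgg_model = VGG16(weights='imagenet', include_top=False,
                  input_shape=(224, 224, 3))
vgg_model.summary()
```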
The VGG16 model trained on the imagenet dataset predicts over a thousand classes, but in this problem we only need a single binary output, "Ross" or "No Ross". That's why we use include_top = False above, which means we do not include the fully connected layers from the VGG16 model. Now we will pass our input data to vgg_model and generate the features.
vgg_class1 = vgg_model.predict(class1_data)
vgg_class2 = vgg_model.predict(class2_data)
Network Architectures
Since we are not including the fully connected layers from VGG16, we need to create a model with some fully connected layers and an output layer with a single unit ("Ross" vs "No Ross"). The output features from the VGG16 model have shape 7*7*512, which, flattened, becomes the input shape for our model. Here I am also using dropout layers to make the model less prone to over-fitting. Let's see the code:
from keras.layers import Input, Dense, Dropout
from keras.models import Model

inputs = Input(shape=(7*7*512,))
dense1 = Dense(1024, activation='relu')(inputs)
drop1 = Dropout(0.5)(dense1)
dense2 = Dense(512, activation='relu')(drop1)
drop2 = Dropout(0.5)(dense2)
outputs = Dense(1, activation='sigmoid')(drop2)
model = Model(inputs, outputs)
model.summary()
Splitting Data into Train and Validation
Now we have input features from the VGG16 model and our own network architecture defined above. The next step is to train this neural network, but we still need validation data. We have 6,814 images, so we will split them into 5,000 training images and 1,814 validation images.
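One way to sketch the labeling, shuffling, and splitting of the VGG16 features (the helper name is ours; pass in the vgg_class1/vgg_class2 features computed above):

```python
import numpy as np

def make_train_valid(class1_feats, class2_feats, n_train):
    """Label, shuffle and split VGG16 features into train/validation sets."""
    n1, n2 = len(class1_feats), len(class2_feats)
    # Flatten each 7x7x512 feature map into one vector per image
    X = np.concatenate((class1_feats, class2_feats)).reshape(n1 + n2, -1)
    y = np.concatenate((np.ones(n1),      # "Ross"
                        np.zeros(n2)))    # "No Ross"
    # Shuffle, then split off the first n_train samples for training
    idx = np.random.permutation(len(X))
    X, y = X[idx], y[idx]
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

# X_train, y_train, X_valid, y_valid = make_train_valid(vgg_class1, vgg_class2, 5000)
```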
All set, we are ready to train our model. Here, we will use stochastic gradient descent as the optimizer and binary cross-entropy as our loss function. We are also going to save a checkpoint for the best model according to its validation accuracy.
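A sketch of the training call, assuming the `model` and the train/validation splits defined above (the checkpoint filename, epochs, and batch size are illustrative):

```python
from keras.optimizers import SGD
from keras.callbacks import ModelCheckpoint

# `model`, X_train, y_train, X_valid, y_valid come from the steps above
model.compile(optimizer=SGD(), loss='binary_crossentropy',
              metrics=['accuracy'])

# Keep only the model with the best validation accuracy
checkpoint = ModelCheckpoint('best_model.hdf5', monitor='val_acc',
                             save_best_only=True, mode='max')

model.fit(X_train, y_train, epochs=10, batch_size=64,
          validation_data=(X_valid, y_valid),
          callbacks=[checkpoint])
```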
Training and validation accuracy looks quite pleasing. Now let’s calculate screen time of “Ross”.
Calculating Screen Time
To test our trained model and calculate the screen time, I downloaded another "Friends" video clip from YouTube and extracted its images. To calculate the screen time, I first used the trained model to predict, for each image, which class it belongs to: "Ross" or "No Ross". Since the video runs at 24 frames per second, we count the number of frames predicted as containing "Ross" and divide by 24 to get the number of seconds "Ross" was on screen.
This test video clip runs at 24 frames per second, and the number of images predicted as containing "Ross" is 4715. So the screen time for Ross is 4715/24 ≈ 196 seconds.
Summary
We can see good accuracy on the train and validation datasets, but when I tested the model on the test dataset, the accuracy was about 65%. One reason I figured out is too little training data; if you can get more data, the accuracy can be higher. Another reason can be covariate shift, which means the test dataset is quite different from the training dataset, here due to different video quality.
This type of technique can be very helpful in calculating screen time of a particular character.
Hope you enjoy reading.
If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.
In this blog, we will learn how to add an image to a live camera feed using OpenCV-Python. This is also known as image blending: we take the weighted sum of two images, and these weights give a feeling of blending or transparency.
Images are added as per the equation below:
dst = α·img1 + β·img2 + γ   (for blending, β = 1 − α)
Since an image is a matrix, for the above equation to hold, both img1 and img2 must be of equal size.
OpenCV has a built-in function that does the exact same thing as shown below
The idea is that first, we will select which image we want to overlay (another image will serve as the background). Then we need to select the region in the background image where we want to put the overlay image. Add this selected region with the overlay image using the above equation. At last change the region in the background image with the result obtained in the previous line.
I hope you understand the idea. Now, let’s get started
Task:
Overlay a white square image on the live webcam feed with different weights. Instead of manually giving the weights, set two keys which, on pressing, increase or decrease the weights.
Steps:
Take an image which you want to overlay. Here, I have used a small white square created using numpy. You can use any.
Open the camera using cv2.VideoCapture()
Initialize the weights (alpha).
Until the camera is opened
Read the frame using cap.read()
Select the region in the frame where we want to add the image and add the images using cv2.addWeighted()
Change the region in the frame with the result obtained
Display the current value of weights using cv2.putText()
Display the image using cv2.imshow()
On pressing ‘a’ increase the value of alpha by 0.1 and decrease by the same amount on pressing ‘d’
Press ‘q’ to break
Code:
import cv2
import numpy as np

# create an overlay image. You can use any image
foreground = np.ones((100, 100, 3), dtype='uint8') * 255
# Open the camera
cap = cv2.VideoCapture(0)
# Set initial value of weights
alpha = 0.4
while True:
    # read the background
    ret, background = cap.read()
    background = cv2.flip(background, 1)
    # Select the region in the background where we want to add the image and add the images using cv2.addWeighted()
    added_image = cv2.addWeighted(background[150:250, 150:250, :], alpha, foreground[0:100, 0:100, :], 1 - alpha, 0)
    # Change the region in the background with the result obtained
    background[150:250, 150:250] = added_image
    # Display the current value of weights using cv2.putText()
    cv2.putText(background, 'alpha: {:.1f}'.format(alpha), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    cv2.imshow('a', background)
    k = cv2.waitKey(10)
    # 'a' increases alpha by 0.1, 'd' decreases it, 'q' breaks
    if k == ord('a'):
        alpha = min(alpha + 0.1, 1.0)
    elif k == ord('d'):
        alpha = max(alpha - 0.1, 0.0)
    elif k == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Most of you must have taken a photograph with a timer. This feature sets a countdown before the photograph is taken. In this tutorial, we will be doing the same, i.e. creating our own camera timer using OpenCV-Python. Sounds interesting, so let's get started.
The main idea is that whenever a particular key is pressed (here, I have used 'q'), the countdown will begin, and a photo will be taken and saved at the desired location. Otherwise, the video will continue streaming.
Here, we will be using cv2.putText() function for drawing the countdown on the video. This function has the following arguments
This function draws the text on the input image at the specified position. If the specified font is unable to render any character, it is replaced by a question mark.
Now let’s see how to do this
Steps:
Open the camera using cv2.VideoCapture()
Until the camera is open
Read the frame and display it using cv2.imshow()
Set the countdown. Here, I have taken it as 30, and I am displaying it only after every 10 frames so that it is easily visible. Otherwise, it will be too fast. You can set it to anything you wish.
Set a key for the countdown to begin
If the key is pressed, show the countdown on the video using cv2.putText(). As the countdown finishes, save the frame at the desired location.
Otherwise, the video will continue streaming
On pressing ‘Esc’ the video will stop streaming.
Code:
import cv2

# Open the camera
cap = cv2.VideoCapture(0)
while True:
    # Read and display each frame
    ret, img = cap.read()
    cv2.imshow('a', img)
    k = cv2.waitKey(125)
    # Specify the countdown
    j = 30
    # set the key for the countdown to begin
    if k == ord('q'):
        while j >= 10:
            ret, img = cap.read()
            # Display the countdown after 10 frames so that it is easily visible otherwise,
            # it will be fast. You can set it to anything or remove this condition and put
            # countdown on each frame
            if j % 10 == 0:
                # specify the font and draw the countdown using puttext
                font = cv2.FONT_HERSHEY_SIMPLEX
                cv2.putText(img, str(j // 10), (250, 250), font, 7, (255, 255, 255), 10, cv2.LINE_AA)
            cv2.imshow('a', img)
            cv2.waitKey(125)
            j = j - 1
        # When the countdown finishes, save the frame at the desired location
        ret, img = cap.read()
        cv2.imwrite('camera_timer.jpg', img)  # change the path/name as desired
        cv2.imshow('a', img)
        cv2.waitKey(2000)
    # Press Esc to stop streaming
    elif k == 27:
        break
cap.release()
cv2.destroyAllWindows()
YouTube is a rich source of videos, and downloading videos from YouTube is a little difficult. There are some extensions and downloaders available, but those are sometimes not recommended. In Python, however, it is quite easy to download a video from YouTube.
So, in this blog we will learn how to download videos from YouTube directly using the video link.
To download videos from YouTube, we will use a Python library named 'pytube'. First, you need to install 'pytube' using the following command.
pip install pytube
Now we are having all necessary libraries required for this task. Now import the required module from ‘pytube’ library.
from pytube import YouTube
The only thing that you will need is the video's link.
Then create an object of the imported 'YouTube' class using this video link.
vid = YouTube(video_link)
After getting the object, we access its streams and take the first result, which will be our required video.
stream = vid.streams.first()
Now we have our video stream, to download it just call the following function.
stream.download()
The above code will download the video into the current working directory, with the same name the video has on YouTube. To download it to a specific folder with a specific name, you need to pass two arguments as shown below.
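A sketch of the call with both arguments; the folder and filename below are placeholders:

```python
from pytube import YouTube

vid = YouTube(video_link)   # video_link as defined earlier
stream = vid.streams.first()

# output_path selects the destination folder,
# filename sets the saved file's name (both values are placeholders)
stream.download(output_path='videos', filename='my_video')
```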
Have you seen the security cameras output where DateTime continuously keeps updating? In this blog, we will be doing the same using OpenCV-Python i.e. we will put current DateTime on the live webcam feed. So, let’s get started.
For fetching current DateTime, we will be using Python’s DateTime module. The following code shows how to get the current DateTime
# get current DateTime
from datetime import datetime
print(datetime.now())
To put the DateTime on the live video, we will be using cv2.putText() on each frame as shown below
Web scraping is a way to extract information from the internet in an automated fashion. We all know that YouTube is a huge resource of data, having tons of videos with related information like views, comments, etc. In this blog, we will learn how to use web scraping in Python to extract video information from a YouTube search. From each search result, we will extract the number of views and the video heading.
To get started, we first need to install two important libraries: "requests" to get the response from a YouTube search result, and "Beautiful Soup" to parse this response into HTML content.
pip install requests
pip install -U bs4
Now that we have installed the required libraries, let's get started.
Import the libraries
from bs4 import BeautifulSoup as bs
import requests
Whenever you search on YouTube, it creates a base search URL and then appends your search query to complete it. Let's say we search "theailearner" on YouTube. The base search URL and query can be defined as follows.
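A sketch of building that URL (the query string is the example from the text):

```python
# YouTube builds search results from a base URL plus the query
base = 'https://www.youtube.com/results?search_query='
query = 'theailearner'
URL = base + query
```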
Now, we will scrape the data from this URL using ” requests ” library.
response = requests.get(URL)
page = response.text
Once we have scraped the data, we will parse it into HTML using Beautiful Soup and find all the video information in the search results. To extract a particular piece of information, we will use the corresponding class from the HTML data.
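A sketch of the parsing step; note that the "yt-uix-tile-link" class is what YouTube's older, server-rendered results pages used for video links, and it may have changed since:

```python
from bs4 import BeautifulSoup as bs

def find_videos(page):
    """Parse the response text and return the video link anchors."""
    soup = bs(page, 'html.parser')
    # class name used by the older, server-rendered results page
    return soup.findAll('a', attrs={'class': 'yt-uix-tile-link'})

# vids = find_videos(page)   # `page` from the request above
```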
The soup.findAll() function used above gives the required data, but to make it easily readable we need to run a simple Python script.
for v in vids:
    print(v['title'])
    v = str(v)
    views = ''
    try:
        # Walk backwards from the word "views" to collect the number
        indx = v.index('views')
        indx = indx - 2
        while v[indx] != ' ':
            views = views + v[indx]
            indx = indx - 1
        print(views[::-1])
    except (ValueError, IndexError):
        continue
Now you might have got some feeling for how to scrape data from YouTube. We can also scrape other data from YouTube, like video information from a channel, comments on a video, likes, dislikes, etc.
Hope you enjoy reading.
If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.
In this blog, we will learn how to use the OpenCV cv2.putText() function for writing text on images in real-time. Most of you might have used cv2.putText(); if not, this is how it looks:
The above function draws the text on the input image at the specified position. If the specified font is unable to render any character, it is replaced by a question mark.
Another important thing that we will be using is the OpenCV cv2.waitKey() function. This returns -1 when no key is pressed; otherwise it returns the ASCII value of the key pressed, or a 32-bit integer value depending upon the platform or keyboard modifier (Num Lock etc.). You can find this by printing the key as shown below.
import cv2
import numpy as np

img = np.zeros((500, 500, 3), dtype='uint8')  # Create a dummy image
while True:
    cv2.imshow('a', img)
    k = cv2.waitKey(0)
    print(k)
    if k == ord('q'):
        break
cv2.destroyAllWindows()
If it returns a 32-bit integer, then use cv2.waitKey() & 0xFF, which keeps only the last 8 bits of the original 32-bit value.
ord(‘q’) converts the character to an int while chr(113) does exactly the opposite as shown in the code below.
>>> ord('q')
113
>>> chr(113)
'q'
I hope you understand all this, now let’s get started
Steps:
Read the image and initialize the counter that will be used for changing the position of the text.
Inside an infinite while loop,
display the image and use cv2.waitKey() for a keypress.
Convert this key into a character using chr() and draw it on the image using cv2.putText().
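The steps above can be sketched as follows (the window name, text position, and counter spacing are illustrative):

```python
import cv2
import numpy as np

img = np.zeros((500, 500, 3), dtype='uint8')   # or cv2.imread(...)
font = cv2.FONT_HERSHEY_SIMPLEX
i = 0   # counter used to shift the position of each new character
while True:
    cv2.imshow('image', img)
    k = cv2.waitKey(0)
    if k == 27:    # Esc to exit
        break
    # Convert the key into a character and draw it on the image
    cv2.putText(img, chr(k), (20 + 20 * i, 250), font, 1, (255, 0, 0), 2)
    i += 1
cv2.destroyAllWindows()
```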
In the previous blog, we discussed how to write text on images in real-time. In that, we manually specified the position for text placement. This is quite tedious if we were to write text at multiple positions.
So, what if we automate this process? That is, we automatically get the coordinates of the image where we click and then put text at that position using the cv2.putText() function, as we did in the previous blog.
This is what we will do in this blog i.e. write text on images at mouse click position. To do this, we will create a mouse callback function and then bind this function to the image window.
Mouse callback function is executed whenever a mouse event takes place. Mouse event refers to anything we do with the mouse like double click, left click etc. All available events can be found using the following code
import cv2
events = [i for i in dir(cv2) if 'EVENT' in i]
print(events)
Below is an example of a simple mouse callback function that draws a circle where we double click.
# mouse callback function
def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDBLCLK:
        cv2.circle(img, (x, y), 100, (255, 0, 0), -1)
We then need to bind this callback function to the image window. This is done using cv2.setMouseCallback(window_name, mouse_callback_function) as shown below
cv2.setMouseCallback('img', draw_circle)
I hope you understood the mouse callback function; now let's get started.
Steps:
Create a mouse callback function where on every left double click position we put text on the image.
Create or read an image.
Create an image window using cv2.namedWindow()
Bind the mouse callback function to the image window using cv2.setMouseCallback()
Display the new image using an infinite while loop
Code:
import cv2
import numpy as np

font = cv2.FONT_HERSHEY_SIMPLEX

# mouse callback function: put text at the double-click position
def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDBLCLK:
        cv2.putText(img, 'text', (x, y), font, 1, (255, 0, 0), 2)

img = np.zeros((500, 500, 3), dtype='uint8')   # create or read an image
cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)
while True:
    cv2.imshow('image', img)    # to display the characters
    if cv2.waitKey(20) == 27:   # Esc to exit
        break
cv2.destroyAllWindows()