
Monitoring Training in Keras: Callbacks

Keras is a high-level API that can run on top of TensorFlow, CNTK, and Theano. It is preferable because it is easy and fast to learn. In this blog, we will learn about a set of functions called callbacks that are used during training in Keras.

Callbacks provide some advantages over plain training in Keras. Here I will explain the important ones.

  • Callbacks can terminate training when a NaN loss occurs.
  • Callbacks can save the model after every epoch; you can also save only the best model.
  • Early Stopping: callbacks can stop training when the monitored quantity (e.g. accuracy) stops improving.

Terminate the training when a NaN loss occurs

Let’s see how to terminate training when a NaN loss occurs:
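Below is a minimal sketch using Keras’s built-in TerminateOnNaN callback; the model and the training data (x_train, y_train) are placeholders assumed to be defined elsewhere.

    from keras.callbacks import TerminateOnNaN

    # Stops training as soon as a NaN loss is encountered
    nan_callback = TerminateOnNaN()

    # model, x_train and y_train are assumed to be defined already
    model.fit(x_train, y_train, epochs=10, callbacks=[nan_callback])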

Saving Model using Callbacks

To save the model after every epoch in Keras, we need to import ModelCheckpoint from keras.callbacks. Let’s see the code below, which saves the model whenever the validation loss decreases.
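A sketch of such a checkpoint; the file name pattern, the model, and the training/validation data are placeholders of my own choosing.

    from keras.callbacks import ModelCheckpoint

    # Save a checkpoint whenever the validation loss improves;
    # {epoch:02d} and {val_loss:.2f} are filled in by Keras
    checkpoint = ModelCheckpoint(filepath='model-{epoch:02d}-{val_loss:.2f}.hdf5',
                                 monitor='val_loss',
                                 save_best_only=True,
                                 mode='auto')
    callbacks_list = [checkpoint]

    # model and the data are assumed to be defined already
    model.fit(x_train, y_train,
              validation_data=(x_val, y_val),
              epochs=10,
              callbacks=callbacks_list)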

In the above code, we first create a ModelCheckpoint object by passing its required parameters.

  • “filepath” defines the path where all checkpoints will be saved. If you want to keep only the best model, pass a fixed name such as “best_model.hdf5”, which will overwrite the previously saved checkpoint.
  • “monitor” decides which quantity you want to monitor while training.
  • “save_best_only” saves the model only when the monitored quantity improves (here, when validation loss decreases).
  • “mode”, one of {auto, min, max}, decides whether an increase (max) or a decrease (min) of the monitored quantity counts as an improvement; in auto mode the direction is inferred from the name of the monitored quantity.

Finally, we make a callbacks list and pass it into model.fit() through the callbacks parameter, as in the sketch above.

Early Stopping

Callbacks can stop training when a monitored quantity has stopped improving. Let’s see how:
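A minimal sketch using Keras’s EarlyStopping callback; the monitored quantity and the parameter values below are my own example choices.

    from keras.callbacks import EarlyStopping

    # Stop training when validation loss has not improved
    # by at least 0.001 for 3 consecutive epochs
    early_stop = EarlyStopping(monitor='val_loss',
                               min_delta=0.001,
                               patience=3,
                               mode='auto',
                               baseline=None)

    model.fit(x_train, y_train,
              validation_data=(x_val, y_val),
              epochs=50,
              callbacks=[early_stop])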

  • min_delta: the minimum change in the monitored quantity that will count as an improvement.
  • patience: the number of epochs with no improvement after which training will be stopped.
  • mode: one of {auto, min, max}; in auto mode, the direction is inferred from the name of the monitored quantity.
  • baseline: the baseline value for the monitored quantity; training will stop if the model shows no improvement over it.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Creating a Bouncing Ball Screensaver using OpenCV-Python

A screensaver is a computer program that fills the screen with anything you wish when the computer is left idle for some time. Most of you might have used a screensaver on your laptop, TV, etc. In the good old days, they used to fascinate most of us. In this blog, we will be creating a bouncing ball screensaver using OpenCV-Python.

Task:

Create a window that we can write text on. If we don’t write anything for 10 seconds, the screensaver will start.

For this we need to do two things:

  • First, we need to check whether a key is pressed in the specified time. Here, I have used 10 sec.
  • Second, create a bouncing ball screensaver and display it only if no key is pressed in the specified time, otherwise, display the original screen.

The first part can be done using the OpenCV cv2.waitKey() function which waits for a specific time for a key press (See here for more details).

For the second part, we first need to create a bouncing ball screensaver. The main idea is to change the sign of the increments (dx and dy in the code below) on collision with the boundaries. This can be done using the following code:
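Here is a sketch of such a screensaver function; the window size, ball radius, colour, and speed are my own choices.

    import cv2
    import numpy as np

    def screensaver(height=480, width=640, radius=20):
        """Bounce a ball inside a black window until any key is pressed."""
        x, y = width // 2, height // 2      # initial ball position
        dx, dy = 5, 5                       # increments per frame
        while True:
            img = np.zeros((height, width, 3), dtype=np.uint8)
            cv2.circle(img, (x, y), radius, (0, 255, 255), -1)
            # Reverse the direction when the ball touches a boundary
            if x + radius >= width or x - radius <= 0:
                dx = -dx
            if y + radius >= height or y - radius <= 0:
                dy = -dy
            x, y = x + dx, y + dy
            cv2.imshow('Screensaver', img)
            if cv2.waitKey(20) != -1:       # any key press stops the screensaver
                break
        cv2.destroyWindow('Screensaver')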

The snapshot of the screensaver looks like this

Now, we need to integrate this screensaver function with the cv2.waitKey() function as shown in the code below
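A sketch of this integration; the canvas size, text position, and keys are assumptions of mine, and screensaver() refers to the function sketched above.

    import cv2
    import numpy as np

    # Background canvas we write text on
    background = np.zeros((480, 640, 3), dtype=np.uint8)
    x, y = 20, 240                               # where the next character goes

    cv2.imshow('Window', background)
    while True:
        # Wait up to 10 seconds (10000 ms) for a key press
        key = cv2.waitKey(10000)
        if key == -1:
            # No key pressed in 10 seconds: start the screensaver,
            # then show the original canvas again afterwards
            screensaver()
            cv2.imshow('Window', background)
        elif key == 27:                          # 'Esc' quits
            break
        else:
            # Draw the typed character and advance the position
            cv2.putText(background, chr(key & 0xFF), (x, y),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
            x += 20
            cv2.imshow('Window', background)
    cv2.destroyAllWindows()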

You need to set the size of the screensaver and background image to be the same. The output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Calculating Screen Time of an Actor using Deep Learning

Screen time of an actor in a movie or an episode is very important. Many actors get paid according to their total screen time. Moreover, we also want to know how much time our favorite character appeared on screen. So, have you ever wondered how you can calculate the total screen time of an actor? One plausible answer is deep learning.

With the advancement of deep learning, it is now possible to solve many difficult problems. In this blog, we will learn how to use transfer learning and image classification concepts of deep learning to calculate the screen time of an actor.

To solve any problem with deep learning, the first requirement is the data. For this tutorial, we will use a video clip from the famous TV show “Friends”. We are going to calculate the screen time of my favorite character “Ross”.

Creating Dataset

First, we need to get a video. To do this, I have downloaded a video from YouTube using the pytube library. For a better understanding of pytube, you can follow this blog or use the following code to get started.
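A minimal pytube sketch; the video URL is a placeholder.

    # pip install pytube
    from pytube import YouTube

    # Link to the video clip (placeholder URL)
    link = 'https://www.youtube.com/watch?v=xxxxxxxxxxx'
    yt = YouTube(link)

    # Download the first available stream to the current directory
    yt.streams.first().download()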

Now we have our data in the form of a video, which is nothing but a group of frames (images). Since we are going to solve this problem using image classification, we need to extract the images from this video. For this task, I have used OpenCV as shown below:
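A sketch of the frame extraction; the video file name and the output folder are assumptions.

    import cv2

    count = 0
    cap = cv2.VideoCapture('friends.mp4')      # the downloaded video (assumed name)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        # Save every frame as a jpg image (the 'frames' folder must exist)
        cv2.imwrite('frames/frame%d.jpg' % count, frame)
        count += 1
    cap.release()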

The video is now converted into individual frames. In this problem, there are only two classes: “Ross” and “No Ross”. To create a dataset, we need to separate the images into these two classes manually. For this, I have created a folder named “data” with two sub-folders, “ross” and “no_ross”, and then manually added images to these two sub-folders. After creating the dataset, we are ready to dive into the code and concepts.

Input Data and Preprocessing

We have the data in the form of images. To prepare this data as input to our neural network, we need to do some preprocessing with the following steps (a sketch follows the list):

  • Read all images one by one using OpenCV
  • Resize each image to (224, 224, 3), the input size for the model
  • Divide the pixel values by 255 so that the input features to the neural network lie in the same range (0 to 1)
  • Append each image to its corresponding class
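A sketch of these steps; the folder names follow the “data/ross” and “data/no_ross” structure described above.

    import os
    import cv2
    import numpy as np

    def load_images(folder):
        images = []
        for name in os.listdir(folder):
            img = cv2.imread(os.path.join(folder, name))     # read with OpenCV
            img = cv2.resize(img, (224, 224))                # resize to (224, 224, 3)
            images.append(img)
        return np.array(images, dtype='float32') / 255.0     # scale pixels to [0, 1]

    X_ross = load_images('data/ross')        # frames with Ross
    X_no_ross = load_images('data/no_ross')  # frames without Ross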

Transfer Learning

Since we have only 6814 images, it will be difficult to train a neural network from scratch on such a small dataset. Here comes the concept of transfer learning.

With the help of transfer learning, we can use the features generated by a model trained on a large dataset in our own model. Here we will use the VGG16 model trained on the “imagenet” dataset. For this, we are using Keras, TensorFlow’s high-level API. With Keras, you can directly import the VGG16 model as shown in the code below.
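A sketch of the import:

    from keras.applications.vgg16 import VGG16

    # Load VGG16 with imagenet weights, without the fully connected top layers
    vgg_model = VGG16(weights='imagenet', include_top=False,
                      input_shape=(224, 224, 3))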

The VGG16 model trained on the imagenet dataset predicts over a large number of classes, but in this problem we only have two classes, “Ross” and “No Ross”. That’s why we use include_top = False above, which signifies that we are not including the fully connected layers from the VGG16 model. Now we will pass our input data through vgg_model and generate the features.
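A sketch of the feature generation, assuming the X_ross and X_no_ross arrays from the preprocessing step above.

    import numpy as np

    # Preprocessed images of shape (num_images, 224, 224, 3)
    X = np.concatenate([X_ross, X_no_ross])

    # Generate the 7x7x512 feature maps for every image
    features = vgg_model.predict(X)
    print(features.shape)        # (num_images, 7, 7, 512)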

Network Architectures

Since we are not including the fully connected layers from the VGG16 model, we need to create a model with some fully connected layers and an output layer with a single unit for the binary decision, “Ross” or “No Ross”. The output features from the VGG16 model have shape 7*7*512, which will be the input shape for our model. Here I am also using dropout layers to make the model less prone to over-fitting. Let’s see the code:
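A sketch of such an architecture; the layer sizes and dropout rates are my own choices.

    from keras.models import Sequential
    from keras.layers import Flatten, Dense, Dropout

    model = Sequential()
    model.add(Flatten(input_shape=(7, 7, 512)))      # VGG16 feature maps as input
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))                          # dropout to reduce over-fitting
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))        # single output unit: Ross vs No Ross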

Splitting Data into Train and Validation

Now we have the input features from the VGG16 model and our own network architecture defined above. The next thing is to train this neural network. But we are lacking validation data. We have 6814 images, so we will split them into 5000 training images and 1814 validation images.

According to the two classes and the train/validation split, we will also create our output labels (y).
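A sketch of the split and the labels, assuming the features array and the X_ross / X_no_ross arrays from the earlier steps.

    import numpy as np

    # Labels: 1 for "Ross", 0 for "No Ross"
    y = np.concatenate([np.ones(len(X_ross)), np.zeros(len(X_no_ross))])

    # Shuffle features and labels together, then split 5000 / 1814
    indices = np.random.permutation(len(features))
    features, y = features[indices], y[indices]

    X_train, X_valid = features[:5000], features[5000:]
    y_train, y_valid = y[:5000], y[5000:]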

Training the Network

All set, we are ready to train our model. Here, we will use stochastic gradient descent as the optimizer and binary cross-entropy as the loss function. We are also going to save a checkpoint of the best model according to its validation accuracy.

I am using a batch size of 64 and 10 epochs for training.
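A sketch of the training step; the checkpoint file name is an assumption, and the monitored metric is named 'val_acc' in older Keras versions ('val_accuracy' in newer ones).

    from keras.callbacks import ModelCheckpoint

    model.compile(optimizer='sgd',                   # stochastic gradient descent
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

    # Save the model with the best validation accuracy
    checkpoint = ModelCheckpoint('best_model.hdf5',
                                 monitor='val_acc',
                                 save_best_only=True)

    model.fit(X_train, y_train,
              validation_data=(X_valid, y_valid),
              batch_size=64,
              epochs=10,
              callbacks=[checkpoint])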

The training and validation accuracy look quite pleasing. Now let’s calculate the screen time of “Ross”.

Calculating Screen Time

To test our trained model and calculate the screen time, I have downloaded another “Friends” video clip from YouTube and extracted its images. To calculate the screen time, I first used the trained model to predict, for each image, which class it belongs to, “Ross” or “No Ross”. Since the video is made up of 24 frames per second, we count the number of frames predicted as containing “Ross” and then divide it by 24 to get the number of seconds “Ross” was on screen.
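A sketch of this calculation, assuming test_features holds the VGG16 features of the frames extracted from the test clip.

    import numpy as np

    predictions = model.predict(test_features)

    # Count frames predicted as containing "Ross" (probability > 0.5)
    ross_frames = int(np.sum(predictions > 0.5))

    # The clip runs at 24 frames per second
    screen_time = ross_frames / 24
    print('Screen time of Ross: %.0f seconds' % screen_time)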

This test video clip is made up of 24 frames per second, and the number of frames predicted as containing “Ross” is 4715. So the screen time for Ross is 4715/24 ≈ 196 seconds.

Summary

We can see good accuracy on the train and validation datasets, but when I tested the model on the test dataset, the accuracy was about 65%. One reason I figured out is too little training data; if you can get more data, the accuracy can be higher. Another reason can be covariate shift, which means the test dataset is quite different from the training dataset, for example due to different video quality.

This type of technique can be very helpful in calculating screen time of a particular character.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Add image to a live camera feed using OpenCV-Python

In this blog, we will learn how to add an image to a live camera feed using OpenCV-Python. This is also known as image blending: we take a weighted sum of two images, and these weights give a feeling of blending or transparency.

Images are added as per the equation below:
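This is the weighted-sum equation used by OpenCV’s addWeighted:

    dst = alpha*img1 + beta*img2 + gamma

where alpha and beta are the weights of the two images (often beta = 1 - alpha) and gamma is a scalar added to each sum.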

Since an image is a matrix, for the above equation to hold, both img1 and img2 must be of equal size.

OpenCV has a built-in function that does the exact same thing as shown below
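A small example of the built-in function; the two images here are just a white and a black square.

    import cv2
    import numpy as np

    img1 = np.full((100, 100, 3), 255, dtype=np.uint8)   # white square
    img2 = np.zeros((100, 100, 3), dtype=np.uint8)       # black square

    # dst = 0.7*img1 + 0.3*img2 + 0
    dst = cv2.addWeighted(img1, 0.7, img2, 0.3, 0)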

The idea is that first we select which image we want to overlay (another image will serve as the background). Then we select the region in the background image where we want to put the overlay image, and blend this selected region with the overlay image using the above equation. At last, we replace the region in the background image with the result obtained in the previous step.

I hope you understand the idea. Now, let’s get started

Task:

Overlay a white square image on the live webcam feed with different weights. Instead of manually giving the weights, set two keys which increase or decrease the weights when pressed.

Steps:

  • Take an image which you want to overlay. Here, I have used a small white square created using numpy. You can use any image.
  • Open the camera using cv2.VideoCapture()
  • Initialize the weights (alpha).
  • While the camera is open
    • Read the frame using cap.read()
    • Select the region in the frame where we want to add the image and add the images using cv2.addWeighted()
    • Change the region in the frame with the result obtained
    • Display the current value of weights using cv2.putText()
    • Display the image using cv2.imshow()
    • On pressing ‘a’ increase the value of alpha by 0.1 and decrease by the same amount on pressing ‘d’
    • Press ‘q’ to break

Code:
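A sketch of the full task; the overlay size, its position in the frame, and the text position are my own choices.

    import cv2
    import numpy as np

    # Small white square to overlay (you can use any image)
    overlay = np.full((100, 100, 3), 255, dtype=np.uint8)

    alpha = 0.5                                   # initial weight of the overlay
    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # Select a 100x100 region of the frame and blend it with the overlay
        roi = frame[:100, :100]
        frame[:100, :100] = cv2.addWeighted(overlay, alpha, roi, 1 - alpha, 0)

        # Show the current value of the weight on the frame
        cv2.putText(frame, 'alpha: ' + str(alpha), (10, 150),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        cv2.imshow('Blending', frame)

        key = cv2.waitKey(1) & 0xFF
        if key == ord('a') and alpha < 1.0:       # increase the weight
            alpha += 0.1
        elif key == ord('d') and alpha > 0.0:     # decrease the weight
            alpha -= 0.1
        elif key == ord('q'):                     # quit
            break

    cap.release()
    cv2.destroyAllWindows()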

See the change in transparency by pressing keys ‘a’ and ‘d’. The output looks like this

You might encounter slightly wrong values of alpha being displayed (for example, 0.30000000000000004 instead of 0.3). This is because of Python’s floating-point limitations.

Hope you enjoy reading. In the next blog, we will learn how to do the same for the non-rectangular region of interest.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Set Camera Timer using OpenCV-Python

Most of you must have clicked a photograph with a timer. This feature sets a countdown before clicking the photograph. In this tutorial, we will be doing the same, i.e. creating our own camera timer using OpenCV-Python. Sounds interesting, so let’s get started.

The main idea is that whenever a particular key is pressed (Here, I have used ‘q’), the countdown will begin and a photo will be clicked and saved at the desired location. Otherwise the video will continue streaming.

Here, we will be using the cv2.putText() function for drawing the countdown on the video. This function has the following arguments:
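    cv2.putText(img, text, position, font, fontScale, color, thickness, lineType, bottomLeftOrigin)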

This function draws the text on the input image at the specified position. If the specified font is unable to render any character, it is replaced by a question mark.

Now let’s see how to do this

Steps:

  • Open the camera using cv2.VideoCapture()
  • While the camera is open
    • Read the frame and display it using cv2.imshow()
    • Set the countdown. Here, I have taken this as 30 and I am displaying it after 10 frames so that it is easily visible. Otherwise, it will be too fast. You can set it to anything as you wish
    • Set a key for the countdown to begin
    • If the key is pressed, show the countdown on the video using cv2.putText(). As the countdown finishes, save the frame at the desired location.
    • Otherwise, the video will continue streaming
  • On pressing ‘Esc’ the video will stop streaming.

Code:
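A sketch of the timer; the countdown values, text position, and output file name are my own choices.

    import cv2

    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        cv2.imshow('Camera', frame)
        key = cv2.waitKey(125) & 0xFF

        if key == ord('q'):                       # start the countdown
            # Show 3, 2, 1 on the live video, each for about 10 frames
            for number in (3, 2, 1):
                for _ in range(10):
                    ret, frame = cap.read()
                    cv2.putText(frame, str(number), (250, 250),
                                cv2.FONT_HERSHEY_SIMPLEX, 5, (0, 255, 255), 4)
                    cv2.imshow('Camera', frame)
                    cv2.waitKey(125)
            # Countdown finished: grab and save a frame
            ret, frame = cap.read()
            cv2.imwrite('my_photo.jpg', frame)    # save at the desired location
        elif key == 27:                           # 'Esc' stops the stream
            break

    cap.release()
    cv2.destroyAllWindows()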

The output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Downloading Video from YouTube using Python

YouTube is a rich source of videos, and downloading videos from YouTube is a little difficult. There are some extensions and downloaders available, but those are sometimes not recommended. In Python, however, it is quite easy to download a video from YouTube.

So, in this blog we will learn how to download videos from YouTube directly using the video link.

  • To download videos from YouTube, we will use a Python library named ‘pytube’. First you need to install ‘pytube’ using the command pip install pytube.
  • Now we have all the necessary libraries required for this task. Import the required module from the ‘pytube’ library.
  • The only thing that you will need from the video is its link.
  • Then create an object of the imported ‘YouTube’ class using this video link.
  • After getting the object, we need to get its streams and take the first result, which will be our required video.
  • Now we have our video stream; to download it, just call the download() function (see the sketch after this list).
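Here is a sketch putting these steps together; the video URL is a placeholder.

    # pip install pytube
    from pytube import YouTube

    # Link of the video to be downloaded (placeholder)
    link = 'https://www.youtube.com/watch?v=xxxxxxxxxxx'

    # Create a YouTube object from the link
    yt = YouTube(link)

    # Take the first result from the available streams
    stream = yt.streams.first()

    # Download the video to the current working directory
    stream.download()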

The above code will download the video to the current working directory, with the same name as the video has on YouTube. To download it to a specific folder with a specific name, you need to pass two arguments to the download call, as below.
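A sketch with both arguments; the folder and file name are placeholders, and the parameter names are those used by recent pytube versions.

    # Download to a specific folder with a specific file name
    stream.download(output_path='videos', filename='my_video')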

That was a simple code to download videos from YouTube. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Show current DateTime on live video using OpenCV-Python

Have you seen the security cameras output where DateTime continuously keeps updating? In this blog, we will be doing the same using OpenCV-Python i.e. we will put current DateTime on the live webcam feed. So, let’s get started.

For fetching current DateTime, we will be using Python’s DateTime module. The following code shows how to get the current DateTime
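A minimal sketch using Python’s datetime module:

    import datetime

    # Current date and time
    print(datetime.datetime.now())    # e.g. 2019-01-01 12:00:00.000000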

To put the DateTime on the live video, we will be using cv2.putText() on each frame as shown below
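A sketch of that call; the position, font, and colour are my own choices, and frame comes from the webcam loop shown later.

    cv2.putText(frame, str(datetime.datetime.now()), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)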

To know more about cv2.putText(), refer to this blog.

Above are the two things that we will be needing for this task. I hope you understand these. Now, let’s get started.

Steps:

  • Open the camera using cv2.VideoCapture()
  • While the camera is open
    • Grab each frame using cap.read()
    • Put the current DateTime on each frame using cv2.putText() as discussed above
    • Display each frame using cv2.imshow()
  • On termination, release the webcam and destroy all windows using cap.release() and cv2.destroyAllWindows() respectively.

Code:
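A sketch of the full task; the window name and exit key are my own choices.

    import cv2
    import datetime

    cap = cv2.VideoCapture(0)

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # Put the current DateTime on the frame
        cv2.putText(frame, str(datetime.datetime.now()), (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)
        cv2.imshow('Live feed', frame)
        if cv2.waitKey(1) & 0xFF == 27:      # press 'Esc' to stop
            break

    cap.release()
    cv2.destroyAllWindows()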

The snapshot of the output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Scraping Video Information from YouTube

Web scraping is a way to extract information from the internet in an automated fashion. We all know that YouTube is a huge resource of data, having tons of videos with related information like views, comments, etc. In this blog, we will learn how to use web scraping in Python to extract video information from a YouTube search. From this information we will extract the number of views and the video title shown in the search results.

To get started with this, we first need to install two important libraries. The first is “requests”, to get the response from a YouTube search result, and the other is “Beautiful Soup”, to parse this response as HTML content.

Now that we have installed the required libraries, let’s get started.

  • Import the libraries.
  • Whenever you search on YouTube, it creates a base search URL and then adds your search query to that URL to complete it. Let’s say we search for “theailearner” on YouTube. The base search URL and the query can be defined as follows.
  • Now, we will get the data from this URL using the “requests” library.
  • Once we have the data, we will parse it as HTML using Beautiful Soup and find all the video information in the search results. To extract a particular piece of information, we will use its particular class from the HTML data.
  • The soup.find_all() function used above will give the required data, but to make it easily readable we need to run a simple Python script (see the sketch after this list).
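A sketch of these steps; note that the HTML class names below come from YouTube’s old, static results page and are assumptions that may no longer match the current, JavaScript-rendered markup.

    import requests
    from bs4 import BeautifulSoup

    # Base search URL plus the search query
    base = 'https://www.youtube.com/results?search_query='
    query = 'theailearner'

    # Get the search results page
    response = requests.get(base + query)

    # Parse the response as HTML
    soup = BeautifulSoup(response.text, 'html.parser')

    # Each search result used to be a div of this class (assumed)
    videos = soup.find_all('div', attrs={'class': 'yt-lockup-content'})

    for video in videos:
        link = video.find('a')
        if link is None:
            continue
        title = link.get('title')
        # The view count appears in the metadata list below the title
        meta = video.find('ul', attrs={'class': 'yt-lockup-meta-info'})
        views = meta.find_all('li')[-1].text if meta else 'N/A'
        print(title, '-', views)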

 

Now you might have some feeling for how to scrape data from YouTube. We can also scrape other data from YouTube, like video information from a channel, comments on a video, likes and dislikes, etc.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Write Text on images in real-time using OpenCV-Python

In this blog, we will learn how to use the OpenCV cv2.putText() function for writing text on images in real time. Most of you might have used cv2.putText(); if not, this is how it looks:

cv2.putText(img, text, position, font, fontScale, color, thickness, lineType, bottomLeftOrigin)

The above function draws the text on the input image at the specified position. If the specified font is unable to render any character, it is replaced by a question mark.

Another important thing that we will be using is the OpenCV cv2.waitKey() function. This returns -1 when no key is pressed; otherwise it returns the ASCII value of the key pressed, or a 32-bit integer value depending upon the platform or keyboard modifier (Num Lock, etc.). You can find this out by printing the key as shown below.
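A minimal sketch for printing the key code; the blank image is just there to give waitKey() a window to listen on.

    import cv2
    import numpy as np

    img = np.zeros((300, 300, 3), dtype=np.uint8)
    cv2.imshow('image', img)

    key = cv2.waitKey(0)      # wait indefinitely for a key press
    print(key)                # e.g. 113 for 'q'
    cv2.destroyAllWindows()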

If it returns a 32-bit integer, then use cv2.waitKey() & 0xFF, which keeps only the last 8 bits of the original 32-bit value.

ord(‘q’) converts the character to an int while chr(113) does exactly the opposite as shown in the code below.
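For example:

    print(ord('q'))     # 113
    print(chr(113))     # 'q'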

I hope you understand all this, now let’s get started

Steps:

  • Read the image and initialize the counter that will be used for changing the position of the text.
  • Inside an infinite while loop,
    • display the image and use cv2.waitKey() for a keypress.
    • Convert this key into character using chr() and draw it on the image using cv2.putText().
    • Increase the counter.
    • Provide the termination condition
  • On exit, destroy all windows.

Below is the code for this
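A sketch of these steps; the image size, text position, spacing, and exit key are my own choices.

    import cv2
    import numpy as np

    # Blank image to write on (you can also read one with cv2.imread)
    img = np.zeros((400, 600, 3), dtype=np.uint8)
    count = 0                                    # counter for the text position

    while True:
        cv2.imshow('image', img)
        key = cv2.waitKey(0) & 0xFF              # wait for a key press

        if key == 27:                            # 'Esc' terminates
            break
        # Draw the pressed character and shift the position for the next one
        cv2.putText(img, chr(key), (20 + 20 * count, 200),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
        count += 1

    cv2.destroyAllWindows()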

The output looks like this

Hope you enjoy reading. In the next blog, we will learn how to write text on images at mouse click position.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Write Text on images at mouse click position using OpenCV-Python

In the previous blog, we discussed how to write text on images in real-time. In that, we manually specified the position for text placement. This is quite tedious if we were to write text at multiple positions.

So, what if we automate this process? That is, we automatically get the coordinates of the point where we click on the image and then put text at that position using the cv2.putText() function, as we did in the previous blog.

This is what we will do in this blog i.e. write text on images at mouse click position. To do this, we will create a mouse callback function and then bind this function to the image window.

Mouse callback function is executed whenever a mouse event takes place. Mouse event refers to anything we do with the mouse like double click, left click etc. All available events can be found using the following code
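For example:

    import cv2

    events = [i for i in dir(cv2) if 'EVENT' in i]
    print(events)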

Below is an example of a simple mouse callback function that draws a circle where we double click.
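A sketch of such a callback; the image, circle radius, and colour are my own choices.

    import cv2
    import numpy as np

    img = np.zeros((400, 400, 3), dtype=np.uint8)

    # Mouse callback: draw a circle at the position of a left double click
    def draw_circle(event, x, y, flags, param):
        if event == cv2.EVENT_LBUTTONDBLCLK:
            cv2.circle(img, (x, y), 30, (0, 255, 255), -1)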

We then need to bind this callback function to the image window. This is done using
cv2.setMouseCallback(window_name, mouse_callback_function) as shown below
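Continuing the example above (img and draw_circle as defined there):

    cv2.namedWindow('image')
    # Bind the callback to the image window
    cv2.setMouseCallback('image', draw_circle)

    while True:
        cv2.imshow('image', img)
        if cv2.waitKey(20) & 0xFF == 27:     # 'Esc' to exit
            break
    cv2.destroyAllWindows()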

I hope you understood mouse callback function, now let’s get started

Steps:

  • Create a mouse callback function where on every left double click position we put text on the image.
  • Create or read an image.
  • Create an image window using cv2.namedWindow()
  • Bind the mouse callback function to the image window using cv2.setMouseCallback()
  • Display the new image using an infinite while loop

Code:
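A sketch of the full task; the stop key and double-click behaviour follow the description above, while the image size and character spacing are my own choices.

    import cv2
    import numpy as np

    img = np.zeros((400, 600, 3), dtype=np.uint8)
    writing = False
    pos = (0, 0)
    count = 0

    # On a left double click, remember the position and start writing there
    def start_writing(event, x, y, flags, param):
        global writing, pos, count
        if event == cv2.EVENT_LBUTTONDBLCLK:
            writing, pos, count = True, (x, y), 0

    cv2.namedWindow('image')
    cv2.setMouseCallback('image', start_writing)

    while True:
        cv2.imshow('image', img)
        key = cv2.waitKey(20) & 0xFF
        if key == 27:                        # 'Esc' to exit
            break
        if writing and key != 255:           # 255 means no key was pressed
            if key == ord('q'):              # 'q' stops writing
                writing = False
            else:
                cv2.putText(img, chr(key), (pos[0] + 20 * count, pos[1]),
                            cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
                count += 1

    cv2.destroyAllWindows()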

In the above code, press ‘q’ to stop writing, and left double click anywhere to start writing again.

You can play with mouse callback function using other mouse events. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.