Category Archives: Image Processing

Geometric Transformation of images using OpenCV-Python

Before reading, please refer to this blog for better understanding.

In this blog, we will discuss how to perform a geometric transformation using OpenCV-Python. In a geometric transformation, we move the pixels of an image based on some mathematical formulae. This involves translation, rotation, scaling, and distortion (or undistortion!) of images. It is frequently used as a pre-processing step in applications where the input is distorted during capture, such as document scanning, matching temporal images in remote sensing, and many more.

There are two basic steps in geometric transformation

  • Spatial Transformation: Calculating the spatial position of pixels in the transformed image.
  • Intensity Interpolation: Finding the intensity values at the newly calculated positions.

OpenCV has built-in functions to apply the different geometric transformations to images like translation, rotation, affine transformation, etc. You can find all the functions here: Geometric Transformations of Images

In this blog, we will learn how to change the apparent perspective of an image. This makes the image look clearer and easier to read. The image below summarizes what we want to do. See how easily we can read the words in the corrected image.

For perspective transformation, we need 4 points on the input image and the corresponding points on the output image. The points should be selected counterclockwise. From these points, we will calculate the transformation matrix which, when applied to the input image, yields the corrected image. Let's see the steps using OpenCV-Python.

Steps:

  • Load the image
  • Convert the image to RGB so as to display via matplotlib
  • Select 4 points in the input image (counterclockwise, starting from the top left) by using matplotlib interactive window.
  • Specify the corresponding output coordinates.
  • Compute the perspective transform M using cv2.getPerspectiveTransform()
  • Apply the perspective transformation to the input image using cv2.warpPerspective() to obtain the corrected image.

Code:
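
Below is a minimal sketch of these steps. The file name, the output size of 600×800, and the use of matplotlib's interactive ginput window for point selection are assumptions; adapt them to your image.

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    # Load the image and convert BGR -> RGB for display with matplotlib
    img = cv2.imread('document.jpg')                 # hypothetical file name
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # Select 4 points counterclockwise, starting from the top left
    plt.imshow(img_rgb)
    pts = plt.ginput(4)
    plt.close()
    pts1 = np.float32(pts)

    # Corresponding output coordinates, also counterclockwise from top left
    w, h = 600, 800
    pts2 = np.float32([[0, 0], [0, h], [w, h], [w, 0]])

    # Compute the transformation matrix and apply it to the input image
    M = cv2.getPerspectiveTransform(pts1, pts2)
    corrected = cv2.warpPerspective(img_rgb, M, (w, h))

    plt.imshow(corrected)
    plt.show()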

By running the above code, you will get an interactive matplotlib window popup. Now select any four points (corner points are best) on the input image. Then specify the corresponding output points.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Spatial Filtering

In the previous blogs, we discussed Intensity Transformation, a point processing technique for image enhancement. In this blog, we will discuss another image enhancement method known as Spatial Filtering, that transforms the intensity of a pixel according to the intensities of the neighboring pixels.

First, let's discuss what a spatial filter is.

A spatial filter is a window with some width and height that is usually much smaller than the image. Mostly 3×3, 5×5 or 7×7 filters are used. The values in the filter are called coefficients or weights. Filters also go by other names such as mask, kernel, template, or window. A 3×3 spatial filter is shown below

Now, let’s see the mechanism of Spatial Filtering.

Spatial filtering can be characterized as a ‘shift-and-multiply’ operation. First, we place the filter over a portion of an image. Then we multiply the filter weights (or coefficients) with the corresponding image pixel values and sum them up. The center image pixel value is then replaced with the result obtained. Then we shift the filter to a new location and repeat the process.

For the image pixels near the borders, we pad the image with 0’s. The whole process is shown below, where a 3×3 filter is convolved with a 5×5 input image (blue color below) to produce a 7×7 output image.

This process is actually known as “correlation” but here, we refer to it as the “convolution” operation. This should not be confused with the mathematical convolution.

Note: The mathematical convolution is similar to correlation except that the mask is first flipped both horizontally and vertically.

Mathematically, the result of convolving a filter mask “w” of size m×n with an image “f” of size M×N is given by the expression

g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) \, f(x + s, y + t)

Here, we assume that filters are of odd size thus m=2a+1 and n=2b+1, where a and b are positive integers.

Let’s see how to do this using Python

Python Code
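
Below is a minimal NumPy sketch of this shift-and-multiply operation. It pads the image with zeros on all sides so that, as in the illustration above, a 5×5 input and a 3×3 filter produce a 7×7 output.

    import numpy as np

    def filter2d(image, kernel):
        # Pad with zeros so the kernel overlaps every pixel ("full" output):
        # an MxN image and an mxn kernel give an (M+m-1)x(N+n-1) result
        m, n = kernel.shape
        M, N = image.shape
        padded = np.pad(image, ((m - 1, m - 1), (n - 1, n - 1)), mode='constant')
        out = np.zeros((M + m - 1, N + n - 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                # Multiply the weights with the underlying pixels and sum up
                out[i, j] = np.sum(kernel * padded[i:i + m, j:j + n])
        return out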

Again, remember that this function actually computes the correlation, not the convolution. If you need a true convolution, flip the kernel both horizontally and vertically and then apply the above function.

If you want the output image to be of the same size as that of the input, then you must change the padding as shown below
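
For instance, in the function above, change the padding (and the output size) like this, assuming odd kernel dimensions:

    # Pad by half the kernel size on each side; the output stays MxN
    padded = np.pad(image, ((m // 2, m // 2), (n // 2, n // 2)), mode='constant')
    out = np.zeros((M, N))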

You can also do this using scipy or other libraries.

OpenCV

OpenCV has a built-in function, cv2.filter2D(), to convolve a kernel with an image. Its arguments are

  • src: input image
  • ddepth: desired depth of the output image. If it is negative, it will be the same as that of the input image.
  • kernel: the filter kernel, a single-channel floating-point matrix.
  • borderType: pixel extrapolation method.

This returns the output image of the same size and the same number of channels as the input image. Depending on the border type, you may get different outputs.
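
A typical call looks like this (a sketch; the file name and the 3×3 averaging kernel are just examples):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg')                # hypothetical input image
    kernel = np.ones((3, 3), np.float32) / 9     # 3x3 averaging filter
    dst = cv2.filter2D(img, -1, kernel)          # -1: same depth as the input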

Hope you enjoy reading. In the next blog, we will learn how to do image smoothing or blurring by just changing the filter weights.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Histogram Backprojection

In this blog, we will discuss Histogram Backprojection, a technique that is used for image segmentation or finding objects of interest in an image. It was proposed by Michael J. Swain and Dana H. Ballard in their paper “Indexing via color histograms,” Third International Conference on Computer Vision, 1990.

This was one of the first works to use color to address the classic problems of classification and localization in computer vision.

To understand this technique, knowledge of histograms (particularly 2-D histograms) is a must. If you haven’t encountered 2-D histograms yet, I suggest you read What is a 2-D histogram?

Now, let’s see what Histogram Backprojection is and how we do it.

According to the authors, Histogram Backprojection answers the question

“Where are the colors in the image that belong to the object being looked for (the target)?”

So, this addresses the localization problem, i.e. where the object is in an image. In this, we calculate the histogram model of a feature and then use it to find this feature in an image. To know why this method is named Histogram Backprojection, you need to know how it works. So, let’s see how to do this

Suppose we want to find the green color in the target image. Let ‘roi’ be the image of the object we need to find and ‘target’ be the image where we are going to search for that object.

Steps:

  • First, load the images, convert them into HSV and find their 2-D histograms as shown below. We will call the roi histogram ‘M’ (the model) and the target histogram ‘I’.
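
A sketch of this step (the file names are assumptions):

    import cv2
    import numpy as np

    roi = cv2.imread('roi.jpg')        # hypothetical: image of the object to find
    target = cv2.imread('target.jpg')  # hypothetical: image to search in

    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    hsv_target = cv2.cvtColor(target, cv2.COLOR_BGR2HSV)

    # 2-D histograms over the Hue and Saturation channels
    M = cv2.calcHist([hsv_roi], [0, 1], None, [180, 256], [0, 180, 0, 256])
    I = cv2.calcHist([hsv_target], [0, 1], None, [180, 256], [0, 180, 0, 256])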

The 2-D histograms look like this

This was expected: the roi image has shades of green, so its histogram is mostly concentrated around the H and S values that represent green. Similarly, we can argue for the target image.

Now, what we want is to make ‘I’ look as similar to ‘M‘ as possible. Only then we will be able to extract the green color from the target image. So, let’s see how to do this.

One plausible solution is to divide “M” by “I”. This way the output will have values greater than 0, where both “M” and “I” are greater than 0. For all other cases, it will be either ‘0’ or ‘Nan’.
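
In NumPy this is a single division:

    # R > 0 only where both M and I are > 0; 0/x gives 0 and 0/0 gives NaN
    # (bins where I is 0 may trigger warnings, but are never looked up later)
    R = M / I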

The output ‘R’ is shown below where cyan represents values greater than 0 (probably our roi), purple represents 0 and white represents Nan.

R, 2-D Histogram

Now the last thing to do is to find the pixels in the target image that correspond to the cyan region shown above. In other words, we back project the 2-D histogram.

Because we know the “H” and “S” values for the cyan region (See R image above), we can easily find out the pixels with similar “H” and “S” values in the target image. Let’s see how to do this.

First, we will extract “H” and “S” channels from the target image.

For each pixel in the target image, using the “H” and “S” value for that pixel we will find the corresponding value in the 2-D histogram and save that value in ‘B’.
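
A sketch of both steps:

    # Split the target into its H, S, V channels
    h, s, v = cv2.split(hsv_target)

    # For every target pixel, look up R at that pixel's (H, S) bin
    B = R[h.ravel(), s.ravel()]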

Note: “B” doesn’t contain “Nan” values. Remember, “Nan” occurs only when both “M” and “I” equal 0.

We keep the values between 0 and 1 so that each value can be treated as the probability of that pixel belonging to the target. After that, we reshape “B” back to the size of the target image.
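
    # Clip to [0, 1] so the values behave like probabilities, then
    # reshape B back to the spatial size of the target image
    B = np.minimum(B, 1)
    B = B.reshape(hsv_target.shape[:2])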

So, now we have created a new image “B” (size same as that of the target image) where every pixel value represents the corresponding probability of being the target. Brighter pixels are more probable of being the target. “B” is shown below

  • Now, the next step is just for fine-tuning this output. This varies from image to image.
  • Use thresholding to segment out the region and overlay the images using bitwise_and to produce the desired output, as shown in the code below.
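
One possible fine-tuning pipeline (the disc size and the threshold value of 50 are assumptions that will vary from image to image):

    # Smooth B with a disc-shaped kernel to consolidate blobs
    disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    B = cv2.filter2D(B.astype(np.float32), -1, disc)
    B = np.uint8(cv2.normalize(B, None, 0, 255, cv2.NORM_MINMAX))

    # Threshold the probability image and overlay it on the target
    ret, thresh = cv2.threshold(B, 50, 255, cv2.THRESH_BINARY)
    mask = cv2.merge((thresh, thresh, thresh))
    output = cv2.bitwise_and(target, mask)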

The output looks like this

See how from a 2-D histogram we are able to extract the roi from the target image.

Backprojection in OpenCV

OpenCV provides an inbuilt function

cv2.calcBackProject( [target_img], channels, roi_hist, ranges, scale )

  • target_img: image where you want to find the feature.
  • channels: The list of channels used to compute the back projection.
  • roi_hist: histogram of the feature you want to find.
  • ranges: histogram bin boundaries in each dimension.
  • scale: Optional scale factor for the output back projection.

This returns the probability image “B”.

So, we only need to calculate the roi histogram (M) and normalize it. There is no need to calculate “I” and “R”; this function directly outputs “B”.
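
A minimal sketch, reusing the histograms computed earlier:

    # Normalize the roi histogram, then backproject it onto the target
    cv2.normalize(M, M, 0, 255, cv2.NORM_MINMAX)
    B = cv2.calcBackProject([hsv_target], [0, 1], M, [0, 180, 0, 256], 1)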

After that, apply all the fine-tuning steps that we did earlier.

That’s all about Histogram Backprojection. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Adaptive Histogram Equalization (AHE)

In the previous blog, we discussed Histogram Equalization which considers the global contrast of an image. This means that the same transformation function is used to transform all the image pixels. This approach works well for most cases but when the image contains regions that are significantly lighter or darker than most of the image, the contrast in those regions will not be sufficiently enhanced. See the face of the statue in the image below

And sometimes we want to enhance details over small areas in an image rather than the whole image. This problem can be solved if we use a transformation function that is derived from the neighborhood of every pixel in the image. This is what Adaptive Histogram Equalization (AHE) does.

In Adaptive Histogram Equalization (AHE), the image is divided into small blocks called “tiles” (e.g. 64 tiles (8×8) is a common choice). Then each of these blocks is histogram equalized as we did earlier. Finally, we stitch these blocks together using bilinear interpolation.

But this method has a problem. If the pixel values are more or less constant in a block with some noise then the AHE tends to over-amplify the noise. To avoid this, contrast limiting is applied and the method is known as Contrast Limited Adaptive Histogram Equalization (CLAHE).

In CLAHE, we clip the histogram at a predefined value before computing the CDF; the clipped counts are redistributed uniformly among the other bins before applying histogram equalization, as shown in the figure below.

Source: Wikipedia

Since the transformation function used in Histogram Equalization is proportional to the CDF, clipping limits the slope of the CDF and therefore of the transformation function. This way it prevents the noise from being over-amplified.

Since for each pixel we are calculating the transformation function from its neighborhood, this is a computationally expensive process. To overcome this, we compute the transformation function only for each block’s center pixel, and all the remaining pixels are transformed with respect to these center pixels using interpolation (bilinear or linear depending on the pixel location).

Another good approach is using Sliding Window Adaptive Histogram Equalization (SWAHE) where we slide the window one pixel at a time and incrementally update the histogram for each pixel.

So, let’s summarise the algorithm for CLAHE

CLAHE Algorithm

  • Divide the image into blocks or tiles (8×8 is common)
  • Compute the histogram for each block and clip it if it exceeds the predefined limit.
  • The CDF and transformation function are then computed for each of the blocks. This transformation function is only appropriate for the block’s center pixel.
  • All the remaining pixels are transformed with respect to these center pixels using interpolation.

I hope you understood Adaptive Histogram Equalization and its variants. Now let’s see how to do this using OpenCV-Python

Code:
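
A minimal sketch (the file name and the clip limit of 2.0 are assumptions):

    import cv2

    img = cv2.imread('image.jpg', 0)   # hypothetical grayscale input

    # Create a CLAHE object with an 8x8 tile grid and apply it
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    result = clahe.apply(img)

    cv2.imwrite('clahe_result.jpg', result)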

The output looks like this

Compare the CLAHE output image with the Histogram Equalized image and see the difference.

Note: To apply CLAHE on color (RGB) images, first convert them into a colorspace that has separate color and greyscale components, like HSV or LAB, and then apply CLAHE on the greyscale component (V or L). After that, convert the result back to RGB, as shown below.
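
For example, using the LAB colorspace (a sketch, reusing the clahe object created above; the file name is an assumption):

    bgr = cv2.imread('color_image.jpg')            # hypothetical color input
    lab = cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    l = clahe.apply(l)                             # equalize only the L channel
    out = cv2.cvtColor(cv2.merge((l, a, b)), cv2.COLOR_LAB2BGR)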

I hope this information will help you. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Histogram Matching (Specification)

In the previous blog, we discussed Histogram Equalization, which tries to produce an output image with a uniform histogram. This approach is good, but for some cases it does not work well. One such case is when we have a skewed image histogram, i.e. a large concentration of pixels at either end of the greyscale.

One reasonable approach is to manually specify the transformation function that preserves the general shape of the original histogram but has a smoother transition of intensity levels in the skewed areas.

So, in this blog, we will learn how to transform an image so that its histogram matches a specified histogram. This is also known as histogram matching or histogram specification.

Histogram Equalization is a special case of histogram matching where the specified histogram is uniformly distributed.

First let’s understand the main idea behind histogram matching.

We will first equalize both the original and the specified histogram using the Histogram Equalization method. Since the transformation function is invertible, by inverting it we can get the mapping from the original to the specified histogram. The whole operation is shown in the image below

For example, suppose the pixel value 10 in the original image gets mapped to 20 in the equalized image. Then we see which value in the specified image gets mapped to 20 in its equalized image; let’s say that this value is 28. So, we can say that 10 in the original image gets mapped to 28 in the specified image.

Most of you might be wondering why both the original and the specified histograms converge to the same uniform histogram on equalization.

This is true only if we assume continuous intensity values. But in reality, the intensity values are discrete thus both original and specified histograms may not map to the same histogram on equalization. That’s why Histogram matching is not able to perfectly match the specified histogram.

Let’s take an example where we want to match the original image with the specified image, both histograms are shown below.

Here, I am taking the original image from the histogram equalization blog. All the steps of equalization are explained in this blog. Here, I will only show the final table

Original Image Histogram Equalization

Specified Image Histogram Equalization

After equalizing both the images, we need to perform a mapping from original to equalized to the specified image. For that, we need only the round columns of the original and specified image as shown below.

Pick the values one by one from the round column of the original image, find each in the round column of the specified image, and note down the index. For example, for 3 in the round original column, we have 3 in the round specified column (with index 1), so we map it to 1.

If the value doesn’t exist, then find the index of its nearest one. For example, for 0 in the round original column, 1 is the nearest in the round specified column (with index 0), so we map it to 0.

If multiple equally near values exist, pick the one that is greater than the value. For example, for 2 in the round original column, there are two closest values in the round specified column, i.e. 1 and 3, so we pick 3 (with index 1) and map 2 to 1.

After obtaining the Map column, replace the values in the original image with the map values. This is the final result.

The matched histogram (shown on the left) approximately matches the specified histogram (shown on the right), as shown below

Now, let’s see how to perform Histogram matching using OpenCV-Python

Code
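
Below is a minimal NumPy sketch of the procedure. It maps each grey level through the equalized (CDF) values; note that np.argmin resolves ties to the smaller index, a slight deviation from the tie-breaking rule described above. The file names are assumptions.

    import cv2
    import numpy as np

    def hist_match(original, specified):
        # Normalized CDFs of both images (the equalization step)
        orig_hist, _ = np.histogram(original.ravel(), 256, [0, 256])
        spec_hist, _ = np.histogram(specified.ravel(), 256, [0, 256])
        orig_cdf = np.cumsum(orig_hist) / original.size
        spec_cdf = np.cumsum(spec_hist) / specified.size

        # For each grey level, find the specified level with the closest CDF
        mapping = np.zeros(256, dtype=np.uint8)
        for r in range(256):
            mapping[r] = np.argmin(np.abs(spec_cdf - orig_cdf[r]))

        # Replace every pixel of the original image with its mapped value
        return mapping[original]

    original = cv2.imread('original.jpg', 0)    # hypothetical file names
    specified = cv2.imread('specified.jpg', 0)
    matched = hist_match(original, specified)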

Note: The specified image can have different dimensions than the original image.

The output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Histogram Equalization

In the previous blog, we discussed contrast stretching, a linear contrast enhancement method. In this blog, we will learn Histogram Equalization, which automatically increases the dynamic range based on the information available in the histogram of the input image.

Histogram Equalization, as the name suggests, stretches the histogram to fill the dynamic range and at the same time tries to keep the histogram uniform as shown below

Source: Wikipedia

By doing this, the resultant image will have an appearance of high contrast and exhibits a large variety of grey tones.

Mostly we will not be able to perfectly equalize the histogram. That would only be possible with continuous intensity values. But in reality, intensity values are discrete, and thus perfectly flat histograms are rare in practical applications of histogram equalization.

The transformation function used in this is

s_k = T(r_k) = (L - 1) \sum_{j=0}^{k} p_r(r_j), \quad k = 0, 1, \dots, L - 1

where ‘s’ and ‘r’ are the output and input pixel intensities respectively, and ‘L’ is the number of intensity levels (for an n-bit image, L = 2^n). The probability of occurrence of the intensity level r_j in the image is approximated by

p_r(r_j) = n_j / (MN)

Here, MN is the total number of pixels in the image and n_j is the number of pixels that have intensity r_j.

Now, let’s take an example to understand how to perform Histogram Equalisation using the above equations.

Suppose we have a 3-bit, 8×8 image whose pixel counts and corresponding histogram are shown below

Now, using the above transformation function we calculate the equalized intensity values. For instance

Doing this for all values we get

Because pixel values can only be integers, we round the last column (s_k) to the nearest integer, as shown below

So, the round column is the output pixel intensity. The last step is to replace the pixel values in the original image (r_k column) with the round column values. For example, replace 0 with 0, 1 with 1, 2 with 1, and so on. This results in the histogram equalized image.

To plot the histogram, count the total pixels belonging to each rounded intensity value (see the Round and n_k columns). For example, 2 pixels belong to 0, 8 pixels to 1, 6 pixels to 2, and so on.

The initial and equalized histogram is shown below

Sometimes rounding to the nearest integer yields a non-zero minimum value. If we want the output to range over, say, [0, 255] for 8-bit images, then we need to apply stretching (as we did in Min-Max stretching) after rounding.

Histogram Equalization often produces unrealistic effects in photographs and reduces color depth (the number of unique grey levels), as shown in the example above (see pixel value 5). It works best when applied to images with a much higher color depth.

Let’s see the OpenCV function for Histogram Equalization: cv2.equalizeHist().

Its input is a grayscale image and its output is our histogram equalized image.
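
For example (the file names are assumptions):

    import cv2

    img = cv2.imread('image.jpg', 0)   # hypothetical grayscale input
    equ = cv2.equalizeHist(img)
    cv2.imwrite('equalized.jpg', equ)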

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Overlays using Bitwise Operations OpenCV-Python

In the previous blog, we learned how to overlay an image to another image using OpenCV cv2.addWeighted() function. But this approach is limited to rectangular ROI. In this blog, we will learn how to overlay non-rectangular ROI to another image.

Task:

Put the TheAILearner text image(shown in the left) above an image (Right one).

Because the TheAILearner text is non-rectangular, we will be using OpenCV cv2.bitwise_and(img1, img2, mask) where the mask is an 8-bit single-channel array that specifies the elements of the output array to be changed.

For bitwise_and, you need to know the following two rules

  • Black + Any Color = Black
  • White + Any Color = That Color

Now, let’s see step by step how to do this

  • First load the two images
  • Select the region in the image where you want to put the logo. Here, I am putting this in the top left corner.
  • Now, we will create a mask. You can create a mask in a number of ways, but here we will use thresholding, as shown in the code below. We will also create an inverse mask. Depending on the image, you may need to change the thresholding function parameters.
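
A sketch of these steps (the file names and the threshold of 10 are assumptions; the threshold assumes a light logo on a dark background):

    import cv2

    # Load the two images (hypothetical file names)
    img1 = cv2.imread('background.jpg')   # background image
    img2 = cv2.imread('logo.jpg')         # TheAILearner text image

    # Region in the top left corner of the background, same size as the logo
    rows, cols, _ = img2.shape
    roi = img1[0:rows, 0:cols]

    # Mask of the logo and its inverse
    img2gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    ret, mask = cv2.threshold(img2gray, 10, 255, cv2.THRESH_BINARY)
    mask_inv = cv2.bitwise_not(mask)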

The mask and mask_inv look like this

  • Now, black out the logo area in the roi created above using bitwise_and, as shown in the code below
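
    # mask_inv is black where the logo is, so this blacks out the logo area
    img1_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)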

The result looks like this

  • Now, we will extract the logo region (with colors) from the logo image using the following code
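
    # mask is white only where the logo is, so this keeps the logo colors
    img2_fg = cv2.bitwise_and(img2, img2, mask=mask)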

The output looks like this

  • Now, we will simply add the above two images. Because black has intensity 0, adding it doesn’t change the other image and outputs the same color. This is done using the following code
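
    # Black (0) + color = color, so adding merges the two images cleanly
    dst = cv2.add(img1_bg, img2_fg)
    img1[0:rows, 0:cols] = dst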

The final output looks like this

So, using these simple bitwise operations, we can overlay one image on another. Be careful while creating the mask, as it entirely depends on the image; you may need to adjust the thresholding function parameters accordingly.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Creating a Bouncing Ball Screensaver using OpenCV-Python

A screensaver is a computer program that fills the screen with anything you wish when the computer is left idle for some time. Most of you might have used a screensaver on your laptops, TV etc. In the good old days, they used to fascinate most of us. In this blog, we will be creating a bouncing ball screensaver using OpenCV-Python.

Task:

Create a window that we can write text on. If we don’t write for 10 seconds, the screensaver will start.

For this we need to do two things:

  • First, we need to check whether a key is pressed in the specified time. Here, I have used 10 sec.
  • Second, create a bouncing ball screensaver and display it only if no key is pressed in the specified time, otherwise, display the original screen.

The first part can be done using the OpenCV cv2.waitKey() function which waits for a specific time for a key press (See here for more details).

For the second part, we first need to create a bouncing ball screensaver. The main idea is to change the sign of increment (dx and dy in the code below) on collision with the boundaries. This can be done using the following code
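
A minimal sketch (the window size, ball radius, color, and speed are all assumptions):

    import cv2
    import numpy as np

    def screensaver():
        width, height, radius = 640, 480, 20
        x, y = width // 2, height // 2
        dx, dy = 5, 5                       # per-frame increments
        while True:
            canvas = np.zeros((height, width, 3), np.uint8)
            cv2.circle(canvas, (x, y), radius, (0, 255, 0), -1)
            x, y = x + dx, y + dy
            # Flip the sign of the increment on collision with a boundary
            if x <= radius or x >= width - radius:
                dx = -dx
            if y <= radius or y >= height - radius:
                dy = -dy
            cv2.imshow('screensaver', canvas)
            if cv2.waitKey(30) != -1:       # any key press ends the screensaver
                break
        cv2.destroyWindow('screensaver')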

The snapshot of the screensaver looks like this

Now, we need to integrate this screensaver function with the cv2.waitKey() function as shown in the code below
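
Continuing with the screensaver() function above, here is a sketch of the integration. The 10-second timeout goes directly into cv2.waitKey(); the character-writing logic is a simplified assumption.

    import cv2
    import numpy as np

    img = np.zeros((480, 640, 3), np.uint8)     # screen we type on
    x_pos = 10
    while True:
        cv2.imshow('screen', img)
        key = cv2.waitKey(10000)                # wait up to 10 s for a key
        if key == 27:                           # Esc quits
            break
        elif key == -1:                         # timeout: start the screensaver
            screensaver()
        else:                                   # otherwise write the character
            cv2.putText(img, chr(key & 0xFF), (x_pos, 40),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
            x_pos += 20
    cv2.destroyAllWindows()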

You need to set the size of the screensaver and background image to be the same. The output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Add image to a live camera feed using OpenCV-Python

In this blog, we will learn how to add an image to a live camera feed, also known as Image Blending. In blending, we take a weighted sum of two images. These weights give a feeling of blending or transparency.

Images are added as per the equation below:

g(x) = (1 - \alpha) f_0(x) + \alpha f_1(x)

where \alpha \in [0, 1] controls the blend. Since an image is a matrix, for the above equation to hold, both img1 and img2 must be of equal size.

OpenCV has a built-in function, cv2.addWeighted(), that does the exact same thing, as shown below
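
    # beta is typically 1 - alpha; gamma is a scalar added to each sum
    dst = cv2.addWeighted(img1, alpha, img2, beta, gamma)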

The idea is that first, we will select which image we want to overlay (the other image will serve as the background). Then we select the region in the background image where we want to put the overlay image, and blend this selected region with the overlay image using the above equation. At last, we replace the region in the background image with the result obtained.

I hope you understand the idea. Now, let’s get started

Task:

Overlay a white square image on the live webcam feed according to different weights. Instead of manually giving weights, set two keys which on pressing increase or decrease the weights.

Steps:

  • Take an image which you want to overlay. Here, I have used a small white square created using numpy. You can use any image.
  • Open the camera using cv2.VideoCapture()
  • Initialize the weights (alpha).
  • While the camera is open
    • Read the frame using cap.read()
    • Select the region in the frame where we want to add the image and add the images using cv2.addWeighted()
    • Change the region in the frame with the result obtained
    • Display the current value of weights using cv2.putText()
    • Display the image using cv2.imshow()
    • On pressing ‘a’ increase the value of alpha by 0.1 and decrease by the same amount on pressing ‘d’
    • Press ‘q’ to break

Code:
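
A minimal sketch of these steps (the square size, its position, and the alpha step of 0.1 are assumptions):

    import cv2
    import numpy as np

    overlay = np.full((100, 100, 3), 255, np.uint8)   # small white square
    alpha = 0.5

    cap = cv2.VideoCapture(0)
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # Blend the overlay into the top left region of the frame
        roi = frame[0:100, 0:100]
        frame[0:100, 0:100] = cv2.addWeighted(roi, 1 - alpha, overlay, alpha, 0)
        cv2.putText(frame, 'alpha: {:.1f}'.format(alpha), (10, 130),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
        cv2.imshow('blend', frame)

        key = cv2.waitKey(1) & 0xFF
        if key == ord('a'):
            alpha = min(1.0, alpha + 0.1)
        elif key == ord('d'):
            alpha = max(0.0, alpha - 0.1)
        elif key == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()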

See the change in transparency by pressing keys ‘a’ and ‘d’. The output looks like this

You might encounter wrong values of alpha being displayed. This is because of Python’s floating point limitations.

Hope you enjoy reading. In the next blog, we will learn how to do the same for the non-rectangular region of interest.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Set Camera Timer using OpenCV-Python

Most of you must have clicked a photograph with a timer. This feature sets a countdown before clicking the photograph. In this tutorial, we will be doing the same, i.e. creating our own camera timer using OpenCV-Python. Sounds interesting, so let’s get started.

The main idea is that whenever a particular key is pressed (here, I have used ‘q’), the countdown will begin, and a photo will be clicked and saved at the desired location. Otherwise, the video will continue streaming.

Here, we will be using the cv2.putText() function for drawing the countdown on the video. Its arguments are the image, the text string, the bottom-left corner of the text, the font face, the font scale, the color, and optionally the thickness and the line type.

This function draws the text on the input image at the specified position. If the specified font is unable to render any character, it is replaced by a question mark.

Now let’s see how to do this

Steps:

  • Open the camera using cv2.VideoCapture()
  • While the camera is open
    • Read the frame and display it using cv2.imshow()
    • Set the countdown. Here, I have taken this as 30 and I am updating the displayed digit every 10 frames so that it is easily visible; otherwise, it would change too fast. You can set it as you wish
    • Set a key for the countdown to begin
    • If the key is pressed, show the countdown on the video using cv2.putText(). As the countdown finishes, save the frame at the desired location.
    • Otherwise, the video will continue streaming
  • On pressing ‘Esc’ the video will stop streaming.

Code:
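
A minimal sketch (the countdown of 30 frames, the font settings, and the save path are assumptions):

    import cv2

    cap = cv2.VideoCapture(0)
    font = cv2.FONT_HERSHEY_SIMPLEX

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        cv2.imshow('frame', frame)
        key = cv2.waitKey(125)
        if key == ord('q'):                  # start the countdown
            for count in range(30, 0, -1):
                ret, frame = cap.read()
                # Show one digit per 10 frames: 3 ... 2 ... 1
                cv2.putText(frame, str((count - 1) // 10 + 1), (250, 250),
                            font, 7, (0, 255, 255), 4)
                cv2.imshow('frame', frame)
                cv2.waitKey(125)
            ret, frame = cap.read()
            cv2.imwrite('photo.jpg', frame)  # hypothetical save location
        elif key == 27:                      # Esc stops the stream
            break

    cap.release()
    cv2.destroyAllWindows()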

The output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.