Tag Archives: image processing

Bilateral Filtering

Till now, we have discussed various smoothing filters like Averaging, Median, Gaussian, etc. All these filters are effective in removing different types of noise but at the same time produce an undesirable side effect of blurring the edges. So wouldn’t it be nice if we could somehow prevent averaging across edges while still smoothing the other regions? This is exactly what Bilateral filtering does.

Let’s first refresh some basic concepts which will be needed to understand Bilateral filtering.

I hope you are all familiar with the domain and range of a function. If not, let’s refresh these concepts. The domain and range are the sets of all possible values that the independent and dependent variables can take, respectively. We also know that an image is a function (a 2-D light intensity function F(x,y)). Thus, for an image, the domain is the set of all possible pixel locations and the range corresponds to all possible intensity values.

Now, let’s use these concepts to understand Bilateral filtering.

All the filters we have discussed till now, like Median, Gaussian, etc., are domain filters. This means that the filter weights are assigned based on spatial closeness (i.e. the domain). This has an issue: it blurs the edges as well. Let’s take an example to see how.

Below is a small 3×3 patch extracted from a larger image containing a diagonal edge. Because domain filters assign filter weights according to spatial closeness, nearer pixels get more weight than distant ones, regardless of which side of the edge they lie on. This leads to edge blurring. See how the central pixel value changed from 10 to 4.

Thus, domain filters don’t consider whether a pixel is an edge pixel or not. They just assign weights according to spatial closeness, which leads to edge blurring.

Now, let’s see what happens if we use range filters instead. In range filters, we assign weights according to the intensity difference. This ensures that only pixels with intensity similar to the central pixel are considered for blurring. But range filtering ignores the spatial relationship altogether. So, pixels of similar intensity that are far away from the central pixel affect its final value just as much as the nearby pixels of similar intensity. This makes no sense.

Thus, range filtering alone also doesn’t solve the problem of edge blurring.

Now, what if we combine both domain and range filtering? That solves our problem. First, the domain filter makes sure that only nearby pixels (say, a 3×3 window) are considered for blurring, and then the range filter makes sure that the weights within this window are assigned according to the intensity difference with respect to the center pixel. This way the edges are preserved. This is known as Bilateral filtering (bi for both domain and range filtering).

I hope you understood Bilateral filtering. Now, let’s see how to do this using OpenCV-Python

OpenCV-Python

OpenCV provides an inbuilt function for bilateral filtering, as shown below. You can read more about it in the OpenCV documentation, but a short description is given below.
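A sketch of the call (argument names follow the OpenCV documentation; src is the input image):

    dst = cv2.bilateralFilter(src, d, sigmaColor, sigmaSpace)

Here, d is the diameter of the pixel neighbourhood, sigmaColor is the filter sigma in the intensity (range) domain, and sigmaSpace is the filter sigma in the coordinate (spatial) domain.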

  • If the sigma values are small (< 10), the filter will not have much effect, whereas if they are large (> 150), they will have a very strong effect, making the image look “cartoonish”.
  • Large filters (d > 5) are very slow, so it is recommended to use d=5 for real-time applications, and perhaps d=9 for offline applications that need heavy noise filtering.

Let’s take an example to understand this
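Below is a minimal sketch, assuming an input file 'image.jpg' (the file name and the parameter values are illustrative):

    import cv2

    img = cv2.imread('image.jpg')

    # d=9, sigmaColor=sigmaSpace=75 gives moderate smoothing while keeping edges sharp
    blur = cv2.bilateralFilter(img, 9, 75, 75)

    cv2.imshow('Bilateral Filtering', blur)
    cv2.waitKey(0)
    cv2.destroyAllWindows()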

There exist several extensions to this filter, such as the guided filter, which deals with the artifacts generated by bilateral filtering. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Add different noise to an image

In this blog, we will discuss how we can add different types of noise in an image like Gaussian, salt-and-pepper, speckle, etc. By knowing this, you will be able to evaluate various image filtering, restoration, and many other techniques. So, let’s get started.

1. Using Scikit-image

In Scikit-image, there is a builtin function random_noise that adds random noise of various types to a floating-point image. Let’s first check the function arguments and then we will see how to implement it.

Basic syntax of the random_noise function is shown below. You can read more about the arguments in the scikit-image documentation.
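The call has the following form (mode-specific keywords such as var, amount, etc. are passed through **kwargs; in recent scikit-image releases the seed argument is named rng):

    from skimage.util import random_noise

    noisy = random_noise(image, mode='gaussian', seed=None, clip=True)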

This returns floating-point image data in the range [0, 1] or [-1, 1], depending on whether the input image was unsigned or signed, respectively.

Let’s take an example to understand how to use this function
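A minimal sketch, assuming an input file 'image.jpg' and salt-and-pepper noise (the mode and amount are illustrative):

    import cv2
    import numpy as np
    from skimage.util import random_noise

    img = cv2.imread('image.jpg')

    # add salt-and-pepper noise; 'amount' is the fraction of pixels to corrupt
    noise_img = random_noise(img, mode='s&p', amount=0.3)

    # random_noise returns a float image in [0, 1]; convert back to uint8 for display
    noise_img = np.array(255 * noise_img, dtype='uint8')

    cv2.imshow('Salt-and-pepper noise', noise_img)
    cv2.waitKey(0)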

The output image with salt-and-pepper noise looks like this

You can add several builtin noise patterns, such as Gaussian, salt and pepper, Poisson, speckle, etc. by changing the ‘mode’ argument.

2. Using Numpy

Image noise is a random variation in the intensity values. Thus, by randomly inserting some values into an image, we can reproduce common noise patterns. For randomly inserting values, the Numpy random module comes in handy. Let’s see how.

Gaussian Noise
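A minimal sketch, assuming a uint8 input 'image.jpg' (the mean 0 and standard deviation 20 are illustrative):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg')

    # zero-mean additive Gaussian noise with standard deviation 20
    gauss = np.random.normal(0, 20, img.shape)

    # add the noise and clip back to the valid uint8 range
    noisy = np.clip(img.astype('float64') + gauss, 0, 255).astype('uint8')

    cv2.imshow('Gaussian noise', noisy)
    cv2.waitKey(0)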

Speckle Noise
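Speckle is multiplicative noise, i.e. out = img + img * noise. A sketch under the same assumptions as above:

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg').astype('float64')

    # multiplicative (speckle) noise: each pixel is scaled by (1 + noise)
    speckle = np.random.normal(0, 0.1, img.shape)
    noisy = np.clip(img * (1 + speckle), 0, 255).astype('uint8')

    cv2.imshow('Speckle noise', noisy)
    cv2.waitKey(0)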

Similarly, you can add other noises as well. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Gaussian Blurring

In the previous blog, we discussed smoothing filters. In this article, we will discuss another smoothing technique known as Gaussian Blurring, that uses a low pass filter whose weights are derived from a Gaussian function. This is perhaps the most frequently used low pass filter in computer vision applications. We will also discuss various properties of the Gaussian filter that makes the algorithm more efficient. So, let’s get started with a basic background introduction.

We already know that a digital image is obtained by sampling and quantizing a continuous signal. Thus, if we were to interpolate a pixel value, it is more likely to resemble the values of its neighboring pixels than those of distant pixels. Similarly, while smoothing an image, it makes more sense to take a weighted average, with weights that decrease with distance, instead of just averaging the values under the mask equally (like we did in Averaging).

So, we should look for a distribution/function that assigns more weights to the nearest pixels as compared to the distant pixels. This is the motivation for using Gaussian distribution.

A 2-d Gaussian function is obtained by multiplying two 1-d Gaussian functions (one for each direction) as shown below
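    G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)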

2-d Gaussian function with mean=0 and std. deviation= σ

Now, just convolve the 2-d Gaussian function with the image to get the output. But for that, we need to produce a discrete approximation to the Gaussian function. Here comes the problem.

Because the Gaussian function has infinite support (meaning it is non-zero everywhere), the approximation would require an infinitely large convolution kernel. In other words, for each pixel calculation, we will need the entire image. So, we need to truncate or limit the kernel size.

For a Gaussian, we know that about 99.7% of the distribution falls within 3 standard deviations, after which the values are effectively close to zero. So, we limit the kernel size to contain only values within 3σ from the mean. This approximation generally yields a result sufficiently close to that obtained with the full Gaussian distribution.

Note: The approximated kernel weights would not sum exactly to 1, so normalize the weights by dividing by the overall kernel sum. Otherwise, this will cause darkening or brightening of the image.

A normalized 3×3 Gaussian filter is shown below (See the weight distribution)
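    \frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}

This is the common binomial approximation: the outer product of the 1-D kernel [1 2 1]/4 with itself.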

Later we will see how to obtain different Gaussian kernels. Now, let’s see some interesting properties of the Gaussian filter that makes it efficient.

Properties

  • First, the Gaussian kernel is linearly separable. This means we can break the 2-d Gaussian filter into two 1-d filters (one horizontal, one vertical). Because of this, the computational cost per pixel is reduced from O(n²) to O(n) for an n×n kernel. Let’s see an example (see the sketch after this list).
  • Applying multiple successive Gaussian kernels is equivalent to applying a single, larger Gaussian blur, whose radius is the square root of the sum of the squares of the individual kernels’ radii. Using this property, we can approximate a non-separable filter by a combination of multiple separable filters.
  • The Gaussian kernel weights (1-D) can be obtained quickly using Pascal’s Triangle. See how its third row, [1 2 1], corresponds to the 3×3 filter we used above.
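A small sketch of the separability and Pascal's Triangle properties, assuming a grayscale input 'image.jpg':

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0)

    # 1-D kernel from the third row of Pascal's Triangle, normalized
    k1d = np.array([1, 2, 1], dtype='float64') / 4

    # outer product of the 1-D kernel with itself gives the 3x3 kernel shown above
    k2d = np.outer(k1d, k1d)

    # one 2-D convolution vs. two 1-D convolutions (separable filtering)
    out_2d = cv2.filter2D(img, -1, k2d)
    out_sep = cv2.sepFilter2D(img, -1, k1d, k1d)

    # the two outputs agree (up to integer rounding)
    print(np.max(np.abs(out_2d.astype(int) - out_sep.astype(int))))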

Because of these properties, Gaussian Blurring is one of the most efficient and widely used smoothing algorithms. Now, let’s see some applications.

Applications

  • Computer Graphics
  • Before edge detection (Canny Edge Detector)
  • Before down-sampling an image, to reduce aliasing artifacts

Now let’s see how to do this using OpenCV-Python

OpenCV-Python

OpenCV provides an inbuilt function for both creating a Gaussian kernel and applying Gaussian blurring. Let’s see them one by one.

To create a Gaussian kernel of your choice, you can use
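The call has the following form (ksize must be odd; if sigma is non-positive, OpenCV computes it from ksize):

    kernel = cv2.getGaussianKernel(ksize, sigma)   # returns a ksize x 1 column of weights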

To apply Gaussian blurring, use
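The call has the following form (ksize is a (width, height) tuple with odd values; if sigmaY is omitted, it is taken equal to sigmaX):

    blur = cv2.GaussianBlur(src, ksize, sigmaX)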

This first creates a Gaussian kernel and then convolves it with the image.

Now, let’s take an example to implement these two functions. First, use cv2.getGaussianKernel() to create a 1-D kernel. Then use cv2.sepFilter2D() to apply this kernel to the input image along both directions.
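A minimal sketch, assuming a noisy input 'image.jpg' and a 5×5 kernel with sigma 1 (both illustrative):

    import cv2

    img = cv2.imread('image.jpg')

    # 1-D Gaussian kernel of size 5 with sigma = 1
    kernel = cv2.getGaussianKernel(5, 1)

    # apply the same 1-D kernel along the rows and the columns
    blur = cv2.sepFilter2D(img, -1, kernel, kernel)

    cv2.imshow('Gaussian Blurring', blur)
    cv2.waitKey(0)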

The second method is quite easy to use. Just one line as shown below
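Continuing with the same image, a sketch of the equivalent call (5×5 kernel, sigmaX = 1):

    blur = cv2.GaussianBlur(img, (5, 5), 1)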

Both these methods produce the same result, but the second one is easier to implement. Try using this on different types of noise and compare the results with other techniques.

That’s all about Gaussian blurring. Hope you enjoy reading. In the next blog, we will discuss Bilateral filtering, another smoothing technique that preserves edges also.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Smoothing Filters

In the previous blog, we briefly introduced Low Pass filters. In this blog, let’s discuss them in detail. Low Pass filters (also known as Smoothing or averaging filter) are mainly used for blurring and noise reduction. Both of these can serve as a useful pre-processing step in many applications.

In general, the Low Pass filters block high-frequency parts of an image. Because noise typically consists of sharp transitions in intensity values, this results in noise reduction. But one downside is that edges are also blurred (Later we will see the blurring techniques which don’t blur the edges).

Now, let’s discuss some of the most commonly used blurring techniques

1. Averaging

In this, each pixel value in an image is replaced by the weighted average of the neighborhood (defined by the filter mask) intensity values. The most commonly used filter is the Box filter which has equal weights. A 3×3 normalized box filter is shown below
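    \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}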

It’s a good practice to normalize the filter. This is to make sure that the image doesn’t get brighter or darker. You can also use an unnormalized box filter.

OpenCV provides two inbuilt functions for averaging namely:

  • cv2.blur() that blurs an image using only the normalized box filter and
  • cv2.boxFilter() which is more general, having the option of using either normalized or unnormalized box filter. Just pass an argument normalize=False to the function

The basic syntax of both functions is shown below.
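Both calls have the following form (ksize is a tuple such as (3, 3); ddepth = -1 keeps the input depth):

    blur = cv2.blur(src, ksize)                                # normalized box filter
    blur = cv2.boxFilter(src, ddepth, ksize, normalize=False)  # unnormalized box filter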

Let’s take an example
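A minimal sketch, assuming an input file 'image.jpg' and a 3×3 kernel:

    import cv2

    img = cv2.imread('image.jpg')

    # 3x3 normalized box filter
    blur = cv2.blur(img, (3, 3))

    cv2.imshow('Averaging', blur)
    cv2.waitKey(0)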

The output looks like this

2. Median Blurring

This is a non-linear filtering technique. As clear from the name, this takes a median of all the pixels under the kernel area and replaces the central element with this median value. This is quite effective in reducing a certain type of noise (like salt-and-pepper noise) with considerably less edge blurring as compared to other linear filters of the same size.

Because we are taking a median, the output image will have no new pixel values other than those in the input image.

Note: For an even number of entries there is more than one possible median, so for simplicity the kernel size must be odd and greater than 1.

OpenCV provides an inbuilt function for this: cv2.medianBlur(src, ksize), where ksize is the (odd) kernel size.

Let’s take an example
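A minimal sketch, assuming a salt-and-pepper-corrupted input 'noisy.jpg' and a kernel size of 5:

    import cv2

    img = cv2.imread('noisy.jpg')

    # kernel size must be odd and greater than 1
    median = cv2.medianBlur(img, 5)

    cv2.imshow('Median Blurring', median)
    cv2.waitKey(0)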

The output looks like this

See how effectively median blurring is able to remove salt and pepper noise and still able to preserve the edges.

In the next blog, we will discuss Gaussian Blurring, another blurring technique which is widely used in computer graphics and is computationally very efficient. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Frequency in Images

In the previous blog, we discussed filters and convolution operation. Before moving forward, let’s discuss an important concept “Frequency”, which is widely used in spatial filtering.

Frequency in images is the rate of change of intensity values. Thus, a high-frequency image is the one where the intensity values change quickly from one pixel to the next. On the other hand, a low-frequency image may be one that is relatively uniform in brightness or where intensity changes very slowly. Most images contain both high-frequency and low-frequency components. Let’s see by an example below

Clearly, in the above image, the zebra pattern has a high frequency, as the intensity changes very rapidly from white to black, while the intensity changes very gradually in the sky, so it has a low frequency.

It’s not hard to conclude that edges in an image represent high frequencies because the intensity changes drastically across an edge.

Based on the frequency, we can classify the filters as

  • Low Pass Filters
  • High Pass Filters

Low Pass filters block the high-frequency parts of an image and thus result in blurring or image smoothing. This is shown below.

On the other hand, a high pass filter enhances high-frequency parts of an image (i.e. edges) and thus results in image sharpening.

In the next blog, we will discuss in detail different low pass and high pass filters, how to construct them and enhance an image. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Geometric Transformation of images using OpenCV-Python

Before reading, please refer to this blog for better understanding.

In this blog, we will discuss how to perform a geometric transformation using OpenCV-Python. In a geometric transformation, we move the pixels of an image based on some mathematical formulae. This involves translation, rotation, scaling, and distortion (or undistortion!) of images. This is frequently used as a pre-processing step in applications where the input is distorted during capture, such as document scanning, matching temporal images in remote sensing, and many more.

There are two basic steps in geometric transformation

  • Spatial Transformation: Calculating the spatial position of pixels in the transformed image.
  • Intensity Interpolation: Finding the intensity values at the newly calculated positions.

OpenCV has built-in functions to apply the different geometric transformations to images like translation, rotation, affine transformation, etc. You can find all the functions here: Geometric Transformations of Images

In this blog, we will learn how to change the apparent perspective of an image. This will make the image look clearer and easier to read. The image below summarizes what we want to do. See how easily we can read the words in the corrected image.

For perspective transformation, we need 4 points on the input image and corresponding points on the output image. The points should be selected counterclockwise. From these points, we will calculate the transformation matrix which when applied to the input image yields the corrected image. Let’s see the steps using OpenCV-Python

Steps:

  • Load the image
  • Convert the image to RGB so as to display via matplotlib
  • Select 4 points in the input image (counterclockwise, starting from the top left) by using matplotlib interactive window.
  • Specify the corresponding output coordinates.
  • Compute the perspective transform M using cv2.getPerspectiveTransform()
  • Apply the perspective transformation to the input image using cv2.warpPerspective() to obtain the corrected image.

Code:
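A sketch of the steps above, assuming the input file 'document.jpg' and an output size of 420×600 pixels (both illustrative; list the output coordinates in the same order as the points you click):

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    # load the image and convert BGR -> RGB for matplotlib
    img = cv2.imread('document.jpg')
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # select 4 input points interactively (counterclockwise, starting from the top left)
    plt.imshow(img)
    input_pts = np.float32(plt.ginput(4))
    plt.close()

    # corresponding output coordinates (counterclockwise, starting from the top left)
    width, height = 420, 600
    output_pts = np.float32([[0, 0], [0, height], [width, height], [width, 0]])

    # compute the 3x3 perspective transform and warp the input image
    M = cv2.getPerspectiveTransform(input_pts, output_pts)
    corrected = cv2.warpPerspective(img, M, (width, height))

    plt.imshow(corrected)
    plt.show()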

By running the above code, you will get an interactive matplotlib window popup. Now select any four points (preferably the corner points) as the input. Then specify the corresponding output points.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Spatial Filtering

In the previous blogs, we discussed Intensity Transformation, a point processing technique for image enhancement. In this blog, we will discuss another image enhancement method known as Spatial Filtering, that transforms the intensity of a pixel according to the intensities of the neighboring pixels.

First let’s discuss what is a spatial filter?

The spatial filter is a window with some width and height that is usually much less than that of the image. Mostly 3×3, 5×5 or 7×7 size filters are used. The values in the filter are called coefficients or weights. There are other terms to call filters such as mask, kernel, template, or window. A 3×3 spatial filter is shown below

Now, let’s see the mechanism of Spatial Filtering.

Spatial filtering can be characterized as a ‘shift-and-multiply’ operation. First, we place the filter over a portion of the image. Then we multiply the filter weights (or coefficients) with the corresponding image pixel values and sum these up. The center image pixel value is then replaced with the result obtained. Finally, we shift the filter to a new location and repeat the process.

For the border image pixels, we pad the image with 0’s. The whole process is shown below, where a 3×3 filter is convolved with a 5×5 input image (blue color below) to produce a 7×7 output image.

This process is actually known as “correlation”, but here we refer to it as the “convolution” operation. This should not be confused with the mathematical convolution.

Note: The mathematical convolution is similar to correlation, except that the mask is first flipped both horizontally and vertically.

Mathematically, the result of convolving a filter mask “w” of size m×n with an image “f” of size M×N is given by the expression
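    g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x+s,\, y+t)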

Here, we assume that filters are of odd size thus m=2a+1 and n=2b+1, where a and b are positive integers.

Let’s see how to do this using Python

Python Code
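A minimal sketch of the shift-and-multiply operation using numpy, assuming a 2-D (grayscale) image. The image is zero-padded by kernel_size - 1 on every side, which gives the enlarged output described above (a 5×5 input with a 3×3 filter gives a 7×7 output):

    import numpy as np

    def correlate(image, kernel):
        m, n = kernel.shape
        # zero-pad so the kernel can slide past every border pixel
        padded = np.pad(image, ((m - 1, m - 1), (n - 1, n - 1)), mode='constant')
        out_h = image.shape[0] + m - 1
        out_w = image.shape[1] + n - 1
        output = np.zeros((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                # multiply the weights with the pixels under the filter and sum them up
                output[i, j] = np.sum(kernel * padded[i:i + m, j:j + n])
        return output

    img = np.arange(25, dtype='float64').reshape(5, 5)   # toy 5x5 image
    box = np.ones((3, 3)) / 9                            # 3x3 box filter
    print(correlate(img, box).shape)                     # (7, 7)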

Again remember that this function does actually compute the correlation, not the convolution. If you need a real convolution, flip the kernel both horizontally and vertically and then apply the above function.

If you want the output image to be of the same size as that of the input, then you must change the padding as shown below
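A sketch of the change inside the function above: pad by half the kernel size and loop over the input shape instead.

    padded = np.pad(image, ((m // 2, m // 2), (n // 2, n // 2)), mode='constant')
    output = np.zeros(image.shape)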

You can also do this using scipy or other libraries.

OpenCV

OpenCV has a built-in function, cv2.filter2D(), to convolve a kernel with an image. Its main arguments are

  • src: input image
  • ddepth: desired depth of the output image. If it is negative, it will be the same as that of the input image.
  • kernel: the convolution kernel, a single-channel floating-point matrix.
  • borderType: pixel extrapolation method.

This returns the output image of the same size and the same number of channels as the input image. Depending on the border type, you may get different outputs.
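A minimal sketch, assuming an input file 'image.jpg' and a 5×5 averaging kernel:

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg')

    # 5x5 averaging kernel
    kernel = np.ones((5, 5), np.float32) / 25

    # ddepth = -1 keeps the output depth equal to the input depth
    output = cv2.filter2D(img, -1, kernel)

    cv2.imshow('filter2D', output)
    cv2.waitKey(0)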

Hope you enjoy reading. In the next blog, we will learn how to do image smoothing or blurring by just changing the filter weights.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Histogram Backprojection

In this blog, we will discuss Histogram Backprojection, a technique that is used for image segmentation or finding objects of interest in an image. It was proposed by Michael J. Swain and Dana H. Ballard in their paper “Indexing via color histograms”, Third International Conference on Computer Vision, 1990.

This was one of the first works that use color to address the classic problem of Classification and Localisation in computer vision.

To understand this technique, knowledge of histograms (particularly 2-D histograms) is a must. If you haven’t encountered 2-D histograms yet, I suggest you read What is a 2-D histogram?

Now, let’s see what is Histogram backprojection and how do we do it?

According to the authors, Histogram Backprojection answers the question

“Where are the colors in the image that belong to the object being looked for (the target)?”

So, this addresses the localization problem, i.e. where the object is in an image. In this, we calculate the histogram model of a feature and then use it to find that feature in an image. To know why this is named Histogram Backprojection, you need to know how the method works. So, let’s see how to do this.

Suppose we want to find the green color in the target image. Let ‘roi’ be the image of the object we need to find and ‘target’ be the image where we are going to search that object.

Steps:

  • First, load the images, convert them into HSV and find the histograms as shown below
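A sketch of this step, assuming the files 'roi.jpg' and 'target.jpg' (any image pair works). Here 'M' is the roi histogram and 'I' the target histogram referred to below:

    import cv2
    import numpy as np

    # roi: the object we want to find;  target: the image we search in
    roi = cv2.imread('roi.jpg')
    target = cv2.imread('target.jpg')

    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    hsvt = cv2.cvtColor(target, cv2.COLOR_BGR2HSV)

    # 2-D Hue-Saturation histograms of the roi and the target
    M = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
    I = cv2.calcHist([hsvt], [0, 1], None, [180, 256], [0, 180, 0, 256])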

The 2-D histograms looks like this

This was expected: as the roi image has shades of green, its histogram is mostly concentrated around the H and S values representing green. Similarly, we can argue for the target image.

Now, what we want is to make ‘I’ (the target histogram) look as similar to ‘M’ (the roi histogram) as possible. Only then will we be able to extract the green color from the target image. So, let’s see how to do this.

One plausible solution is to divide “M” by “I”. This way, the output will have values greater than 0 wherever both “M” and “I” are greater than 0. For all other cases, it will be either ‘0’ or ‘Nan’.
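A one-line sketch of this ratio (the division produces Nan in the empty bins, hence the warning suppression):

    with np.errstate(divide='ignore', invalid='ignore'):
        R = M / I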

The output ‘R’ is shown below where cyan represents values greater than 0 (probably our roi), purple represents 0 and white represents Nan.

R, 2-D Histogram

Now, the last thing to do is to find the pixels in the target image that correspond to the cyan region shown above. In other words, we back project the 2-D histogram.

Because we know the “H” and “S” values for the cyan region (See R image above), we can easily find out the pixels with similar “H” and “S” values in the target image. Let’s see how to do this.

First, we will extract “H” and “S” channels from the target image.

For each pixel in the target image, we use its “H” and “S” values to look up the corresponding value in the 2-D histogram “R” and save that value in “B”.

Note: “B” doesn’t contain any “Nan” values. Remember, “Nan” occurs only where both “M” and “I” equal 0, i.e. for (H, S) pairs that no pixel in the target image actually has, so those bins are never looked up.

We clip the values to lie between 0 and 1 so that each value can be treated as the probability of that pixel belonging to the target. After that, we reshape “B” to the size of the target image.
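A sketch of this lookup, reusing R and the HSV target image hsvt from the snippets above:

    # H and S channels of the target image
    h, s, v = cv2.split(hsvt)

    # for every pixel, look up R at its (H, S) pair; fancy indexing does this in one shot
    B = R[h.ravel(), s.ravel()]

    # clip to [0, 1] so the values behave like probabilities, then reshape to image size
    B = np.minimum(B, 1)
    B = B.reshape(hsvt.shape[:2])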

So, now we have created a new image “B” (the same size as the target image) where every pixel value represents the corresponding probability of being the target. Brighter pixels are more likely to belong to the target. “B” is shown below

  • Now, the next step is just for fine-tuning this output. This varies from image to image.
  • Use thresholding to segment out the region and overlay the images using bitwise_and to produce the desired output, as sketched below.
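A sketch of this fine-tuning, continuing from the snippets above (the threshold value and the uint8 scaling are illustrative):

    # scale the probability image to 8 bits and threshold it
    B = np.uint8(B * 255)
    ret, thresh = cv2.threshold(B, 50, 255, cv2.THRESH_BINARY)

    # make a 3-channel mask so it can be ANDed with the colour target image
    mask = cv2.merge((thresh, thresh, thresh))
    result = cv2.bitwise_and(target, mask)

    cv2.imshow('Backprojection', result)
    cv2.waitKey(0)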

The output looks like this

See how from a 2-D histogram we are able to extract the roi from the target image.

Backprojection in OpenCV

OpenCV provides an inbuilt function

cv2.calcBackProject( target_img, channels, roi_hist, ranges, scale )

  • target_img: image where you want to find the feature.
  • channels: The list of channels used to compute the back projection.
  • roi_hist: histogram of the feature you want to find.
  • ranges: histogram bin boundaries in each dimension.
  • scale: Optional scale factor for the output back projection.

This returns the probability image “B”.

So, we only need to calculate the roi histogram (M) and normalize it. There is no need to calculate “I” and “R”; this function directly outputs “B”.
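A sketch of the usage, reusing the hsv (roi) and hsvt (target) images from above:

    # histogram of the roi only, normalized to [0, 255]
    M = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
    cv2.normalize(M, M, 0, 255, cv2.NORM_MINMAX)

    # back-project the roi histogram onto the target image
    B = cv2.calcBackProject([hsvt], [0, 1], M, [0, 180, 0, 256], 1)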

After that apply all the fine-tuning steps that we did earlier.

That’s all about Histogram Backprojection. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time

Histogram Equalization

In the previous blog, we discussed contrast stretching, a linear contrast enhancement method. In this blog, we will learn Histogram Equalization, which automatically increases the dynamic range based on the information available in the histogram of the input image.

Histogram Equalization, as the name suggests, stretches the histogram to fill the dynamic range and at the same time tries to keep the histogram uniform as shown below

Source: Wikipedia

By doing this, the resultant image will have an appearance of high contrast and exhibits a large variety of grey tones.

Mostly, we will not be able to perfectly equalize the histogram. This is only possible if we assume continuous intensity values. But in reality, intensity values are discrete, and thus perfectly flat histograms are rare in practical applications of histogram equalization.

The transformation function used in this is
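    s_k = T(r_k) = (L - 1) \sum_{j=0}^{k} p_r(r_j)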

where ‘s’ and ‘r’ are the output and input pixel intensities respectively. ‘L’ is the number of possible intensity levels (for an n-bit image, L = 2^n). The probability of occurrence of the intensity level r_j in the image is approximated by
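    p_r(r_j) = \frac{n_j}{MN}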

Here, MN is the total number of pixels in the image and n_j is the number of pixels that have intensity r_j.

Now, let’s take an example to understand how to perform Histogram Equalisation using the above equations.

Suppose we have a 3-bit, 8×8 image whose pixel count and corresponding histogram is shown below

Now, using the above transformation function we calculate the equalized intensity values. For instance

Doing this for all values we get

Because the pixel values can only be integers, we round the last column (s_k) to the nearest integer, as shown below.

So, the rounded column gives the output pixel intensity. The last step is to replace the pixel values in the original image (the r_k column) with the rounded values. For example, replace 0 with 0, 1 with 1, 2 with 1 and so on. This results in the histogram-equalized image.

To plot the histogram, count the total pixels belonging to each rounded intensity value (see the Round and n_k columns). For example, 2 pixels belong to 0, 8 pixels to 1, 6 pixels to 2, and so on.

The initial and equalized histogram is shown below

Sometimes rounding to the nearest integer yields a non-zero minimum value. If we want the output to span the full range, say [0, 255] for an 8-bit image, then we need to apply stretching (as we did in Min-Max stretching) after rounding.

Histogram Equalization often produces unrealistic effects in photographs and reduces the color depth (the number of unique grey levels), as shown in the example above (see pixel value 5). It works best when applied to images with a much higher color depth.

Let’s see OpenCV function for Histogram Equalization
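A minimal sketch, assuming a grayscale input file 'image.jpg':

    import cv2

    img = cv2.imread('image.jpg', 0)       # read as grayscale
    equ = cv2.equalizeHist(img)

    cv2.imshow('Histogram Equalization', equ)
    cv2.waitKey(0)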

Its input is a grayscale image and its output is the histogram-equalized image.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Add image to a live camera feed using OpenCV-Python

In this blog, we will learn how to add an image to a live camera feed using OpenCV-Python. This is also known as Image Blending. In this, we take a weighted sum of two images. These weights give a feeling of blending or transparency.

Images are added as per the equation below:
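    g(x) = (1 - \alpha)\, f_{0}(x) + \alpha\, f_{1}(x), \qquad 0 \le \alpha \le 1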

Since an image is a matrix, for the above equation to hold, both img1 and img2 must be of equal size.

OpenCV has a built-in function that does the exact same thing as shown below
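A sketch of the call (beta plays the role of 1 - alpha, and gamma is a scalar added to every pixel):

    dst = cv2.addWeighted(src1, alpha, src2, beta, gamma)   # dst = alpha*src1 + beta*src2 + gamma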

The idea is that, first, we select which image we want to overlay (the other image will serve as the background). Then we select the region in the background image where we want to put the overlay image. We blend this selected region with the overlay image using the above equation. At last, we replace the region in the background image with the result obtained in the previous step.

I hope you understand the idea. Now, let’s get started

Task:

Overlay a white square image on the live webcam feed according to different weights. Instead of manually giving weights, set two keys which on pressing increase or decrease the weights.

Steps:

  • Take an image which you want to overlay. Here, I have used a small white square created using numpy. You can use any.
  • Open the camera using cv2.VideoCapture()
  • Initialize the weights (alpha).
  • While the camera is open
    • Read the frame using cap.read()
    • Select the region in the frame where we want to add the image and add the images using cv2.addWeighted()
    • Change the region in the frame with the result obtained
    • Display the current value of weights using cv2.putText()
    • Display the image using cv2.imshow()
    • On pressing ‘a’ increase the value of alpha by 0.1 and decrease by the same amount on pressing ‘d’
    • Press ‘q’ to break

Code:
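A sketch of the full loop, assuming a 200×200 white square overlaid on the top-left corner of the frame (the square size, the region and the starting alpha are illustrative):

    import cv2
    import numpy as np

    # white square to overlay
    overlay = np.ones((200, 200, 3), dtype='uint8') * 255

    cap = cv2.VideoCapture(0)
    alpha = 0.5

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # blend the overlay with the selected region and write the result back
        roi = frame[:200, :200]
        frame[:200, :200] = cv2.addWeighted(roi, 1 - alpha, overlay, alpha, 0)

        # show the current value of alpha on the frame
        cv2.putText(frame, 'alpha: ' + str(alpha), (10, 230),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        cv2.imshow('Image Blending', frame)

        key = cv2.waitKey(1) & 0xFF
        if key == ord('a'):
            alpha += 0.1
        elif key == ord('d'):
            alpha -= 0.1
        elif key == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()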

See the change in transparency by pressing keys ‘a’ and ‘d’. The output looks like this

You might encounter wrong values of alpha being displayed. This is because of Python’s floating point limitations.

Hope you enjoy reading. In the next blog, we will learn how to do the same for the non-rectangular region of interest.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.