Optimum Global Thresholding using Otsu’s Method

In the previous blog, we discussed global thresholding and how to find the global threshold using the iterative approach. In this blog, we will discuss Otsu’s method, named after Nobuyuki Otsu, which automatically finds the global threshold. So, let’s discuss this method in detail.

Note: This method assumes that the image histogram is bimodal and a reasonable contrast ratio exists between the background and the region of interest.

In simple terms, Otsu’s method tries to find the threshold value that minimizes the weighted within-class variance. Since variance measures the spread of a distribution about its mean, minimizing the within-class variance tends to make the classes compact.

Let’s say we threshold a histogram at a value “t”. This produces two regions – left and right of “t” – whose variances are given by σ0²(t) and σ1²(t). Then the weighted within-class variance is given by

σw²(t) = w0(t)σ0²(t) + w1(t)σ1²(t)

where w0(t) and w1(t) are the weights given to each class. A class weight is the number of pixels in that thresholded region (left or right) divided by the total number of image pixels. Let’s take a simple example to understand how to calculate these.

Suppose we have the following histogram and we want to find the weighted within-class variance corresponding to threshold value 1.

Below are the weights and the variances calculated for left and the right regions obtained after thresholding at value 1.

Similarly, we iterate over all possible threshold values and calculate the weighted within-class variance for each threshold. The optimum threshold is the one with the minimum within-class variance.

Now, let’s see how to do this using Python.
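
A minimal sketch of the setup (the filename and the use of matplotlib are assumptions):

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    img = cv2.imread('image.jpg', 0)    # hypothetical filename, read as grayscale

    # Normalized histogram: hist[i] = fraction of pixels with intensity i
    hist = np.bincount(img.ravel(), minlength=256) / img.size

    plt.bar(np.arange(256), hist)
    plt.show()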

The image histogram is shown below

Now, let’s calculate the within-class variance using the steps which we discussed earlier.
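
A brute-force sketch, continuing from the histogram computed above (variable names are my own):

    bins = np.arange(256)
    within_var = np.full(256, np.inf)

    for t in range(1, 256):
        # Class weights: fraction of pixels on each side of the threshold
        w0, w1 = hist[:t].sum(), hist[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        # Class means and variances
        mu0 = (hist[:t] * bins[:t]).sum() / w0
        mu1 = (hist[t:] * bins[t:]).sum() / w1
        var0 = (hist[:t] * (bins[:t] - mu0) ** 2).sum() / w0
        var1 = (hist[t:] * (bins[t:] - mu1) ** 2).sum() / w1
        # Weighted within-class variance for this threshold
        within_var[t] = w0 * var0 + w1 * var1

    optimal_t = np.argmin(within_var)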

The gif below shows how the within-class variance (blue dots) varies with the threshold value for the above histogram. The optimum threshold value is the one where the within-class variance is minimum.

OpenCV also provides a builtin function to calculate the threshold using this method.

OpenCV

You just need to pass an extra flag, cv2.THRESH_OTSU, in the cv2.threshold() function which we discussed in the previous blog. This returns the optimum threshold value along with the thresholded image. Let’s see how to use this.
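
A minimal usage sketch (the filename is an assumption):

    import cv2

    img = cv2.imread('image.jpg', 0)   # hypothetical filename
    # Pass 0 as the threshold; with THRESH_OTSU the optimum value is computed for us
    ret, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    print(ret)                         # the threshold value Otsu's method found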

A Faster Approach

We all know that minimizing the within-class variance is equivalent to maximizing the between-class variance. This maximization can be implemented recursively, by updating the class weights and means at each threshold, and is faster than the earlier method. The expression for the between-class variance is given by

σb²(t) = w0(t)w1(t)[μ0(t) − μ1(t)]²

where μ0(t) and μ1(t) are the class means.

Below are the steps to recursively calculate the between-class variance.

  1. Calculate the histogram of the image.
  2. Set up weights and means corresponding to the “0” threshold value.
  3. Loop through all the threshold values
    1. Update the weights and the mean
    2. Calculate the between-class variance
  4. The optimum threshold will be the one with the max variance.

Below is the code in Python that implements the above steps.
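
A sketch of these steps (the filename is an assumption; the loop updates the class weight and mean incrementally instead of recomputing them at every threshold):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0)   # hypothetical filename
    hist = np.bincount(img.ravel(), minlength=256) / img.size
    bins = np.arange(256)

    total_mean = (hist * bins).sum()
    w0, sum0 = 0.0, 0.0                # weight and intensity sum of class 0
    best_t, best_var = 0, 0.0

    for t in range(256):
        # Update the weights and the means
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (total_mean - sum0) / w1
        # Between-class variance for this threshold
        var_b = w0 * w1 * (mu0 - mu1) ** 2
        if var_b > best_var:
            best_var, best_t = var_b, t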

This is how you can implement Otsu’s method recursively by maximizing the between-class variance. Now, let’s discuss the limitations of this method.

Limitations

Otsu’s method is only guaranteed to work when

  • the histogram is bimodal,
  • a reasonable contrast ratio exists between the background and the ROI,
  • the lighting conditions are uniform,
  • the image is not affected by noise, and
  • the sizes of the background and the ROI are comparable.

Many modifications have been made to the original Otsu’s algorithm to address these limitations, such as the two-dimensional Otsu’s method. We will discuss some of these modifications in the following blogs.

In the following blogs, we will also discuss how to counter these limitations so as to get satisfactory results with Otsu’s method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Improving Global Thresholding

In the previous blog, we discussed Otsu’s method for automatic image thresholding and its limitations. In this blog, we will discuss how to handle these limitations so as to produce satisfactory thresholding results. So, let’s get started.

Case-1: When the noise is present in the image

If noise is present in the image, it tends to change the modality of the histogram: the sharp valleys between the peaks of the bimodal histogram start degrading. In that case, Otsu’s method or any other global thresholding method will fail. So, in order to find the global threshold, one should first remove the noise using a smoothing filter, such as a Gaussian, and then apply an automatic thresholding method like Otsu’s.

Case-2: When the object area is small compared to the background area

In this case, the image histogram is dominated by the large background area. This increases the probability of any pixel belonging to the background, so the histogram no longer exhibits bimodality and Otsu’s method results in a segmentation error. To prevent this, one should only consider pixels that lie on or near the edges between the objects and the background. Doing so results in an image histogram with peaks of approximately the same size. Then we can apply any automatic thresholding method like Otsu’s. Below are the steps to implement this procedure, followed by a code sketch.

  • Calculate the edge image using any high pass filter like Sobel, Laplacian, etc.
  • Select any threshold value (T).
  • Threshold the above edge image to produce a binary mask.
  • Apply the mask image on the input image using any bitwise operations or any other method.
  • This keeps only those pixels where the mask image is white.
  • Compute the histogram of only those pixels.
  • Finally, apply any automatic global thresholding method like Otsu’s.
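
A sketch of these steps (the filename and the percentile used for the edge threshold are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0)                 # hypothetical filename

    # Edge image using a high-pass filter (Laplacian here)
    edges = np.abs(cv2.Laplacian(img, cv2.CV_64F))

    # Threshold the edge image to produce a binary mask
    T = np.percentile(edges, 90)                     # assumed: keep the strongest 10% of responses
    mask = (edges >= T).astype(np.uint8) * 255

    # Keep only the input pixels under the white mask
    masked_pixels = img[mask == 255]

    # Otsu on the histogram of those pixels only (reshaped to 2-D for cv2.threshold)
    ret, _ = cv2.threshold(masked_pixels.reshape(-1, 1), 0, 255,
                           cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Apply the recovered threshold to the whole image
    _, result = cv2.threshold(img, ret, 255, cv2.THRESH_BINARY)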

Case-3: When the image is taken under non-uniform illumination conditions

In this case, the histogram no longer remains bimodal and thus we will not be able to segment the image satisfactorily. One of the simplest approaches is to subdivide the image into non-overlapping rectangles, with the rectangle size chosen so that the illumination is nearly constant within each one. Then we apply a global thresholding technique like Otsu’s to each rectangle, as sketched below.
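
A sketch of this tiling approach (the filename and the grid size are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0)   # hypothetical filename
    rows, cols = 4, 4                  # assumed grid; pick so illumination is ~constant per tile
    h, w = img.shape
    out = np.zeros_like(img)

    ys = np.linspace(0, h, rows + 1, dtype=int)
    xs = np.linspace(0, w, cols + 1, dtype=int)
    for i in range(rows):
        for j in range(cols):
            tile = img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            # Otsu inside each rectangle separately
            _, out[ys[i]:ys[i + 1], xs[j]:xs[j + 1]] = cv2.threshold(
                tile, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)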

The above procedure only works when the sizes of the object and the background are comparable within each rectangle. This is quite intuitive, as only then will we have a bimodal histogram. Taking care of the background and object sizes in each rectangle is a tedious task.

So, in the next blog, we will discuss adaptive thresholding that works pretty well for the above conditions. That’s all for this blog. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Global Thresholding

In the previous blog, we discussed image thresholding and when to use this for image segmentation. We also learned that thresholding can be global or adaptive depending upon how the threshold value is selected.

In this blog, we will discuss

  • global thresholding
  • OpenCV function for global thresholding
  • How to choose threshold value using the iterative algorithm

In global thresholding, each pixel value in the image is compared with a single (global) threshold value. Below is the code for this.
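
A minimal sketch (the filename, threshold, and output values are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0)       # hypothetical filename
    T, val_high, val_low = 127, 255, 0     # assumed values

    out = np.where(img > T, val_high, val_low).astype(np.uint8)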

Here, we assign a value of “val_high” to all the pixels greater than the threshold otherwise “val_low“. OpenCV also provides a builtin function for thresholding the image. So, let’s take a look at that function.

OpenCV
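
The function is cv2.threshold(), and its basic call looks like this:

    retval, dst = cv2.threshold(src, thresh, maxval, type)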

This function returns the thresholded image (dst) and the threshold value (retval). Its arguments are

  • src: input greyscale image (8-bit or 32-bit floating point)
  • thresh: global threshold value
  • type: Different types that decide “val_high” and “val_low“. In other words, these types decide what value to assign for pixels greater than and less than the threshold. Below figure shows different thresholding types available.
  • maxval: maximum value to be used with THRESH_BINARY and THRESH_BINARY_INV. Check the below image.

To specify the thresholding type, write “cv2.” as the prefix. For instance, write cv2.THRESH_BINARY if you want to use this type. Let’s take an example.
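
A minimal sketch with an assumed threshold of 127 (the filename is also an assumption):

    import cv2

    img = cv2.imread('image.jpg', 0)   # hypothetical filename
    ret, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)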

Similarly, you can apply other thresholding types to check how they work. Till now we discussed how to threshold an image using a global threshold value. But we didn’t discuss how to get this threshold value. So, in the next section, let’s discuss this.

How to choose the threshold value?

As already discussed, global thresholding is a suitable approach only when the intensity distributions of the background and the ROI are sufficiently distinct, i.e. there is a clear valley between the peaks of the histogram. In that situation, we can easily select the threshold value. But what if we have a large number of images? In that case, we don’t want to manually check each image histogram and then decide the threshold value. We want something that can automatically estimate the threshold value for each image. Below is an algorithm that can be used for this purpose.

  1. Select an initial estimate of the threshold T (the mean image intensity is a good choice).
  2. Partition the image into two groups: G1, the pixels with intensity greater than T, and G2, the rest.
  3. Compute the mean intensities m1 and m2 of G1 and G2.
  4. Update the threshold: T = (m1 + m2)/2.
  5. Repeat steps 2–4 until the change in T between iterations is smaller than a chosen tolerance.

Source: Wikipedia

Below is the code for the above algorithm.
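
A sketch of the algorithm (the filename and the convergence tolerance are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0).astype(np.float64)  # hypothetical filename

    T = img.mean()                       # initial estimate
    delta = np.inf
    while delta > 0.5:                   # assumed tolerance
        m1 = img[img > T].mean()         # mean of group G1
        m2 = img[img <= T].mean()        # mean of group G2
        T_new = (m1 + m2) / 2
        delta = abs(T_new - T)
        T = T_new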

Now, let’s take an example to check how this works.

That’s all for this blog. In the next blog, we will discuss how to perform optimum global thresholding using Otsu’s method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Thresholding

Image Segmentation is the process of subdividing an image into its constituent regions or objects. In many computer vision applications, image segmentation is very useful for detecting the region of interest, for instance, in medical imaging where we have to locate tumors, in object detection where self-driving cars have to detect pedestrians and traffic signals, or in video surveillance. There are a number of methods available to perform image segmentation: thresholding, clustering methods, graph partitioning methods, and convolutional methods, to mention a few.

In this blog, we will discuss Image Thresholding which is one of the simplest methods for image segmentation. In this, we partition the images directly into regions based on the intensity values. So, let’s discuss image thresholding in greater detail.

Concept

If the pixel value is greater than a threshold value, it is assigned one value (maybe white), else it is assigned another value (maybe black).

In other words, if f(x,y) is the input image then the segmented image g(x,y) is given by

g(x,y) = 1 if f(x,y) > T, and g(x,y) = 0 otherwise

If the threshold value T remains constant over the entire image, then this is known as global thresholding. When the value of T changes over the entire image or depends upon the pixel neighborhood, then this is known as adaptive thresholding. We will cover both these types in greater detail in the following blogs.

Applicability Condition

Thresholding is only guaranteed to work when a good contrast ratio between the region of interest and the background exists. Otherwise, the thresholding will not be able to fully detect the region of interest. Let’s understand this by an example.

Suppose we have two images from which we want to segment the square region (our region of interest) from the background.

Let’s plot the histogram of these two images.

Clearly as expected for “A“, the histogram is showing two peaks corresponding to the square and the background. The separation between the peaks shows that the background and ROI have a good contrast ratio. By choosing a threshold value between the peaks, we will be able to segment out the ROI. While for “B”, the intensity distribution of the ROI and the background is not that distinct. Thus we may not be able to fully segment the ROI.

Thresholded images are shown below (How to choose a threshold value will be discussed in the next blog).

So, always plot the image histogram to check the contrast ratio between the background and the ROI. Only if the contrast ratio is good, choose the thresholding method for image segmentation. Otherwise, look for other methods.

In the next blog, we will discuss global thresholding and how to choose the threshold value using the iterative method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Laplacian of Gaussian (LoG)

In the previous blog, we discussed various first-order derivative filters. In this blog, we will discuss the Laplacian of Gaussian (LoG), a second-order derivative filter. So, let’s get started.

Mathematically, the Laplacian is defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

Unlike first-order filters that detect the edges based on local maxima or minima, Laplacian detects the edges at zero crossings i.e. where the value changes from negative to positive and vice-versa.

Let’s obtain kernels for the Laplacian similar to how we obtained kernels using finite difference approximations for the first-order derivative. The second derivatives in x and y are approximated as

∂²f/∂x² ≈ f(x+1, y) − 2f(x, y) + f(x−1, y)
∂²f/∂y² ≈ f(x, y+1) − 2f(x, y) + f(x, y−1)

Adding these two kernels together, we obtain the Laplacian kernel shown below

 0  1  0
 1 −4  1
 0  1  0

This is called a negative Laplacian because the central peak is negative. Other variants of Laplacian can be obtained by weighing the pixels in the diagonal directions also. Make sure that the sum of all kernel elements is zero so that the filter gives zero response in the homogeneous regions.

Let’s now discuss some properties of the Laplacian

  • Unlike first-order filters, which require two masks, the Laplacian uses a single mask, but the edge orientation information is lost.
  • The Laplacian gives better edge localization than first-order filters.
  • Unlike first-order filters, the Laplacian is an isotropic filter, i.e. it produces a uniform edge magnitude in all directions.
  • Similar to first-order filters, the Laplacian is also very sensitive to noise.

To reduce the noise effect, image is first smoothed with a Gaussian filter and then we find the zero crossings using Laplacian. This two-step process is called the Laplacian of Gaussian (LoG) operation.

But this can also be performed in one step. Instead of first smoothing an image with a Gaussian kernel and then taking its Laplacian, we can obtain the Laplacian of the Gaussian kernel and then convolve it with the image. This is shown below, where f is the image and g is the Gaussian kernel

∇²(f ∗ g) = f ∗ ∇²g

Now, let’s see how to obtain the LoG kernel. Mathematically, LoG can be written as

LoG(x, y) = −(1/(πσ⁴)) [1 − (x² + y²)/(2σ²)] e^(−(x² + y²)/(2σ²))

The LoG kernel weights can be sampled from the above equation for a given standard deviation, just as we did in Gaussian Blurring. Just convolve the kernel with the image to obtain the desired result, as easy as that.
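
A sketch of sampling the kernel (the function name, grid size, and sigma are my own choices):

    import cv2
    import numpy as np

    def log_kernel(size, sigma):
        # Sample the LoG equation on a size x size grid (size should be odd)
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        r2 = xx ** 2 + yy ** 2
        k = (-1 / (np.pi * sigma ** 4)) * (1 - r2 / (2 * sigma ** 2)) \
            * np.exp(-r2 / (2 * sigma ** 2))
        return k - k.mean()              # force zero sum: zero response in flat regions

    img = cv2.imread('image.jpg', 0)     # hypothetical filename
    log_img = cv2.filter2D(img, cv2.CV_64F, log_kernel(9, 1.4))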

Select the size of the Gaussian kernel carefully. If LoG is used with a small Gaussian kernel, the result can be noisy; with a large Gaussian kernel, you may get poor edge localization.

Now, let’s see how to do this using OpenCV-Python

OpenCV-Python

OpenCV provides a builtin function that calculates the Laplacian of an image. You can find it here. Its basic syntax is shown below.
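
Only the commonly used arguments are shown here:

    dst = cv2.Laplacian(src, ddepth, ksize=1)   # ddepth: output depth, ksize: aperture size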

Steps for LoG:

  • Apply LoG on the image. This can be done in two ways:
    • First, apply Gaussian and then Laplacian or
    • Convolve the image with LoG kernel directly
  • Find the zero crossings in the image
  • Threshold the zero crossings to extract only the strong edges.

Let’s understand each step through code

Since a zero crossing is a change from negative to positive (or vice versa), an approximate way to find the zero crossings is to clip the negative values, as sketched below.
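
A sketch of this approximation (the filename and the final threshold are assumptions):

    import cv2
    import numpy as np

    img = cv2.imread('image.jpg', 0)                 # hypothetical filename
    blur = cv2.GaussianBlur(img, (5, 5), 0)          # step 1: Gaussian, then Laplacian
    log_img = cv2.Laplacian(blur, cv2.CV_64F)

    # Clip negatives, then keep only the strong responses
    clipped = np.clip(log_img, 0, 255).astype(np.uint8)
    _, edges = cv2.threshold(clipped, 10, 255, cv2.THRESH_BINARY)  # assumed threshold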

Another way is to check each pixel for zero crossing as shown below
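
A per-pixel sketch (the function and its threshold parameter are my own; it assumes log_img from the previous snippet):

    def zero_crossings(log_img, min_range=0.0):
        # Mark pixels whose 3x3 neighborhood contains both signs of the LoG response
        h, w = log_img.shape
        out = np.zeros((h, w), np.uint8)
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                patch = log_img[y - 1:y + 2, x - 1:x + 2]
                if patch.min() < 0 < patch.max() and patch.max() - patch.min() > min_range:
                    out[y, x] = 255
        return out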

Depending upon the image you may need to apply thresholding and median blurring to suppress the noise.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

First-order Derivative kernels for Edge Detection

In the previous blog, we briefly discussed that an edge can be detected by

  • First derivative (local maximum or minimum)
  • Second derivative (zero crossings)

In this blog, let’s discuss in detail how we can detect edges using the first order derivative.

Remember that derivatives only exist for continuous functions, but an image is a discrete 2D light intensity function. Thus, in the last blog, we approximated the image gradients using finite differences as

∂f/∂x ≈ [f(x+1, y) − f(x−1, y)] / 2   (central difference)

For edge detection, we prefer the central difference shown above. Using this central difference (and dropping the constant factor 1/2, as is conventional), we obtain the derivative filters in the x and y directions shown below

Dx = [−1 0 1],   Dy = [−1 0 1]ᵀ

Here, we have assumed that the x-coordinate is increasing in the “right”-direction, and y-coordinate in the “down”-direction. By weighting these x and y derivatives, we can obtain different edge detection filters. Let’s see how

1. Sobel Operator

This is obtained by multiplying the x and y derivative filters obtained above with a 1-D smoothing filter in the other direction. For example, the 3×3 Sobel-x and Sobel-y filters are obtained as

Sobel-x:
−1  0  1
−2  0  2
−1  0  1

Sobel-y:
−1 −2 −1
 0  0  0
 1  2  1

The smoothing filter [1, 2, 1] approximates a Gaussian, so the Sobel operator computes the gradient with smoothing and is therefore less sensitive to noise. Because of the separability property of the kernel, the Sobel operator is also computationally efficient.

Note: Using larger Sobel kernels leads to more edge blurring, thus some form of edge thinning must be applied to counter this.

When we convolve these Sobel operators with the image, they estimate the gradients in the x and y directions (say Gx and Gy). For each point, we can calculate the gradient magnitude and direction as

G = √(Gx² + Gy²),   θ = atan2(Gy, Gx)

We can easily infer that the edge direction or the angle will be positive for the transition from dark to white and negative otherwise. Now, let’s see how to do this using OpenCV-Python

OpenCV has a builtin function that calculates the image derivatives using the Sobel operator. Its basic syntax is shown below. You can read more about it here.
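
Only the main arguments are shown (dx and dy are the derivative orders in each direction):

    dst = cv2.Sobel(src, ddepth, dx, dy, ksize=3)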

Note: Earlier we assumed that a white-to-black transition yields negative values. Thus, if our output datatype is cv2.CV_8U or np.uint8, all negative values become 0. To prevent this, we specify a higher output datatype such as cv2.CV_16S or cv2.CV_64F, take its absolute value, and then convert back to cv2.CV_8U, as sketched below.
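
A sketch of this (the filename is an assumption):

    import cv2

    img = cv2.imread('image.jpg', 0)                     # hypothetical filename

    sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)   # higher depth keeps negatives
    sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

    # Absolute value, then back to 8-bit for display
    sobelx = cv2.convertScaleAbs(sobelx)
    sobely = cv2.convertScaleAbs(sobely)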

The output is shown below

2. Scharr Operator

This operator tries to achieve perfect rotational symmetry. The 3×3 Scharr-x filter is shown below

 −3  0   3
−10  0  10
 −3  0   3

OpenCV provides a builtin function for this
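
For example (reusing img from the Sobel snippet above):

    scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)
    scharry = cv2.Scharr(img, cv2.CV_64F, 0, 1)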

3. Prewitt Operator

In this, the x and y derivative filters are weighted with the standard averaging filter [1, 1, 1], as shown below

Prewitt-x:
−1  0  1
−1  0  1
−1  0  1

Prewitt-y:
−1 −1 −1
 0  0  0
 1  1  1
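
OpenCV has no builtin Prewitt function, so one way is to convolve with the kernel directly (a sketch, reusing img from above):

    import numpy as np

    prewitt_x = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=np.float64)
    gx = cv2.filter2D(img, cv2.CV_64F, prewitt_x)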

Here, we discussed only the most common filters. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Canny Edge Detector

In this blog, we will discuss one of the most popular algorithms for edge detection known as Canny Edge detection. It was developed by John F. Canny in 1986. It is a multi-stage algorithm that provides good and reliable detection. So, let’s discuss the main steps used in the Canny Edge detection algorithm using OpenCV-Python.

1. Noise Reduction

An edge detector is a high pass filter that enhances the high-frequency component and suppresses the low ones. Since both edges and noise are high-frequency components, the edge detectors tend to amplify the noise. To prevent this, we smooth the image with a low-pass filter. Canny uses a Gaussian filter for this.

Below is the code for this using OpenCV-Python
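
A minimal sketch (the filename is an assumption):

    import cv2

    img = cv2.imread('image.jpg', 0)        # hypothetical filename
    blur = cv2.GaussianBlur(img, (5, 5), 0) # 5x5 kernel; sigma derived from the kernel size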

A larger filter reduces noise but worsens edge localization and vice-versa. Generally, 5×5 is a good choice but this may vary from image to image.

2. Finding Intensity Gradient of the Image

The next step is to find the edges using a Sobel operator. Sobel finds the gradients in both the horizontal (Gx) and vertical (Gy) directions. Since edges are perpendicular to the gradient direction, we can use these gradients to find the edge gradient and direction for each pixel:

Edge_Gradient(G) = √(Gx² + Gy²),   Angle(θ) = atan2(Gy, Gx)

Below is the code for this using OpenCV-Python (here, I’ve converted everything to 8-bit; it’s optional, you can use any output datatype).
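
A sketch continuing from the blurred image of step 1:

    import numpy as np

    gx = cv2.Sobel(blur, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(blur, cv2.CV_64F, 0, 1, ksize=3)

    # Gradient magnitude and direction (in degrees) per pixel
    mag, ang = cv2.cartToPolar(gx, gy, angleInDegrees=True)
    mag8 = np.uint8(np.clip(mag, 0, 255))   # optional 8-bit conversion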

Clearly, we can see that the edges are still quite blurred or thick. Remember that an edge detector should output only one accurate response corresponding to the edge. Thus we need to thin the edges or in other words find the largest edge. This is done using Non-max Suppression.

3. Non-Max Suppression

This is an edge thinning technique. In this, for each pixel, we check if it is a local maximum in its neighborhood in the direction of gradient or not. If it is a local maximum it is retained as an edge pixel, otherwise suppressed.

For each pixel, the neighboring pixels are located in horizontal, vertical, and diagonal directions (0°, 45°, 90°, and 135°). Thus we need to round off the gradient direction at every pixel to one of these directions as shown below.

After rounding, we will compare every pixel value against the two neighboring pixels in the gradient direction. If that pixel is a local maximum, it is retained as an edge pixel otherwise suppressed. This way only the largest responses will be left.

Let’s see an example

Suppose for a pixel ‘A’, the gradient direction comes out to be 17 degrees. Since 17 is nearer to 0, we will round it to 0 degrees. Then we select neighboring pixels in the rounded gradient direction (See B and C in below figure). If the intensity value of A is greater than that of B and C, it is retained as an edge pixel otherwise suppressed.

Let’s see how to do this using OpenCV-Python
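
A sketch (the function is my own; it assumes mag and ang from the previous step):

    def non_max_suppression(mag, ang):
        # Keep a pixel only if it is the local maximum along its rounded gradient direction
        h, w = mag.shape
        out = np.zeros((h, w), mag.dtype)
        ang = ang % 180                           # direction is modulo 180 degrees
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                a = ang[y, x]
                if a < 22.5 or a >= 157.5:        # ~0 deg: left/right neighbors
                    n1, n2 = mag[y, x - 1], mag[y, x + 1]
                elif a < 67.5:                    # ~45 deg
                    n1, n2 = mag[y - 1, x + 1], mag[y + 1, x - 1]
                elif a < 112.5:                   # ~90 deg: up/down neighbors
                    n1, n2 = mag[y - 1, x], mag[y + 1, x]
                else:                             # ~135 deg
                    n1, n2 = mag[y - 1, x - 1], mag[y + 1, x + 1]
                if mag[y, x] >= n1 and mag[y, x] >= n2:
                    out[y, x] = mag[y, x]
        return out

    nms = non_max_suppression(mag, ang)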

Clearly, we can see that the edges are thinned but some edges are more bright than others. The brighter ones can be considered as strong edges but the lighter ones can actually be edges or they can be because of noise.

4. Hysteresis Thresholding

Non-max suppression outputs a more accurate representation of the real edges in an image, but as noted above, some edges are brighter than others, and the lighter ones can be real edges or just noise. To solve the problem of “which edges are really edges and which are not”, Canny uses hysteresis thresholding. In this, we set two thresholds, ‘High’ and ‘Low’.

  • Any edges with intensity greater than ‘High’ are the sure edges.
  • Any edges with intensity less than ‘Low’ are sure to be non-edges.
  • The edges between ‘High’ and ‘Low’ thresholds are classified as edges only if they are connected to a sure edge otherwise discarded.

Let’s take an example to understand

Here, A and B are sure-edges as they are above ‘High’ threshold. Similarly, D is a sure non-edge. Both ‘E’ and ‘C’ are weak edges but since ‘C’ is connected to ‘B’ which is a sure edge, ‘C’ is also considered as a strong edge. Using the same logic ‘E’ is discarded. This way we will get only the strong edges in the image.

This is based on the assumption that the edges are long lines.

Below is the code using OpenCV-Python.

First set the thresholds and classify edges into strong, weak or non-edges.

For weak edges: a weak edge is kept only if it is connected to a sure edge; otherwise it is suppressed. A sketch of both steps is shown below.
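
The thresholds are assumptions, and this single-pass sketch only links weak pixels directly adjacent to a strong one (full hysteresis would propagate along chains of weak pixels):

    high, low = 100, 50                   # assumed thresholds
    strong = nms >= high
    weak = (nms >= low) & (nms < high)

    out = np.zeros(nms.shape, np.uint8)
    out[strong] = 255

    h, w = nms.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # A weak pixel survives only if one of its 8 neighbors is strong
            if weak[y, x] and strong[y - 1:y + 2, x - 1:x + 2].any():
                out[y, x] = 255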

OpenCV-Python

OpenCV provides a builtin function for performing Canny Edge detection

Let’s take an example
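
A minimal sketch (the filename and thresholds are assumptions):

    import cv2

    img = cv2.imread('image.jpg', 0)   # hypothetical filename
    edges = cv2.Canny(img, 50, 150)    # low and high hysteresis thresholds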

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Unsharp Masking and Highboost filtering

In this blog, we will learn how we can sharpen an image or perform edge enhancement using a smoothing filter. Let’s see how this is done

  • First, we blur the image. We know by smoothing an image we suppress most of the high-frequency components.
  • Then, we subtract this smoothed image from the original image (the resulting difference is known as a mask). Thus, the mask will contain most of the high-frequency components that are blocked by the smoothing filter.
  • Adding this mask back to the original will enhance the high-frequency components.

Because we are using a blurred or unsharp image to create a mask this technique is known as Unsharp Masking.

Thus, unsharp masking first produces a mask m(x,y) as

m(x,y) = f(x,y) − fb(x,y)

where, f(x,y) is the original image and fb(x,y) is the blurred version of the original image.

Then this mask is added back to the original image, which results in enhancing the high-frequency components

g(x,y) = f(x,y) + k·m(x,y)

where k specifies what portion of the mask is added. When k = 1, this is known as unsharp masking. For k > 1, we call this high-boost filtering because we are boosting the high-frequency components by giving more weight to the mask (edge) image.

We can also combine the above two equations into one, as a weighted average of the original and the blurred image

g(x,y) = (1 + k)·f(x,y) − k·fb(x,y)

Note: Instead of subtracting the blurred image from the original, we can directly use a negative Laplacian filter to obtain the mask.

If the image contains noise, this method will not produce satisfactory results, like most of the other sharpening filters.

Let’s see how to do this using OpenCV-Python

OpenCV-Python

Since the last equation describes unsharp masking as a weighted average of the original and the blurred image, we can simply use the OpenCV cv2.addWeighted() function.
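
A sketch (the filename and the value of k are assumptions):

    import cv2

    img = cv2.imread('image.jpg')                    # hypothetical filename
    blur = cv2.GaussianBlur(img, (5, 5), 0)
    k = 1.0                                          # k = 1: unsharp masking; k > 1: high-boost
    sharp = cv2.addWeighted(img, 1 + k, blur, -k, 0) # g = (1 + k)f - k*fb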

The output is shown below

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Difference of Gaussians (DoG)

In the previous blog, we discussed Gaussian Blurring that uses Gaussian kernels for image smoothing. This is a low pass filtering technique that blocks high frequencies (like edges, noise, etc.). In this blog, we will see how we can use this Gaussian Blurring to highlight certain high-frequency parts in an image. Isn’t that interesting? So, let’s get started.

In Gaussian Blurring, we discussed how the standard deviation of the Gaussian affects the degree of smoothing: roughly speaking, the larger the standard deviation, the more the blurring, or in other words, the more high-frequency components get suppressed.

Thus, if we take two Gaussian kernels with different standard deviations, apply them separately to the same image, and subtract their corresponding responses, we will get an output that highlights certain high-frequency components, depending on the standard deviations used.

The logic is by blurring we remove some high-frequency components that represent noise, and by subtracting we remove some low-frequency components that correspond to the homogeneous areas in the image. All the remaining frequency components are assumed to be associated with the edges in the images. Thus, the Difference of Gaussian acts like a bandpass filter. Let’s take an example to understand this.

Suppose we have an image as shown below

Suppose we have two Gaussian kernels with standard deviations σ1 > σ2. The kernel with σ1, when convolved with an image, will blur the high-frequency components more than the other kernel does. Subtracting the two responses, we recover the information lying in the band of frequencies suppressed by one kernel but not the other.

Now, that we saw how this works, let’s also discuss where this is useful(its pros and cons) as compared to other edge detection methods we have discussed.

All the edge detection kernels which we discussed till now are quite good in edge detection but one downside is that they are highly susceptible to noise. Thus if the image contains a high degree of noise, Difference of Gaussian is the way to go. This is because we are actually doing blurring which reduces the effect of noise to a great extent.

One downside of this method is that the edges are not enhanced much as compared to other methods. The output image formed has lower contrast.

OpenCV-Python

Let’s see how to do this using OpenCV-Python
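
A minimal sketch (the filename and the two sigmas are assumptions):

    import cv2

    img = cv2.imread('image.jpg', 0)                  # hypothetical filename
    blur1 = cv2.GaussianBlur(img, (0, 0), sigmaX=1)   # smaller sigma
    blur2 = cv2.GaussianBlur(img, (0, 0), sigmaX=2)   # larger sigma
    dog = cv2.subtract(blur1, blur2)                  # Difference of Gaussians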

Note: Using the linear separable property of the Gaussian kernel we can speed up the entire algorithm.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Image Gradients

In the previous blogs, we discussed different smoothing filters. Before moving forward, let’s first discuss Image Gradients which will be useful in edge detection, robust feature and texture matching. So, let’s first recall what a gradient is.

In mathematics, the gradient of a function tells how the function is changing with respect to its arguments or independent variables. The term gradient is more frequently used for multi-variable functions; for a single-variable function, we refer to this as the slope.

The gradient of an N-variable function at each point is an N-D vector whose components are the derivatives in the N directions. For example, for a 3-variable function f(x,y,z), the gradient, if it exists, is given by

∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z)

Thus, the gradient provides two pieces of information – magnitude and direction. The direction of the gradient tells us the direction of greatest increase while the magnitude represents the rate of increase in that direction.

Gradients are defined only for continuous functions, but an image is a 2-D discrete function F(x,y). Thus, we need to approximate the gradients, and we do this using finite differences: instead of h approaching 0, we take h to be a fixed (non-zero) value.

Three forms of finite differences are commonly used: forward, backward, and central

Forward:  [f(x+h) − f(x)] / h
Backward: [f(x) − f(x−h)] / h
Central:  [f(x+h) − f(x−h)] / 2h

For calculating image gradients, we use the central difference to approximate the gradients in the x and y directions. The example below shows how to calculate the central difference in the x-direction for the pixel with value 200.

In the next blog, we will discuss how to derive different kernels such as Sobel, Prewitt, etc from this central difference formulae and then using convolution to approximate the image gradients.

Thus, at each image point, the gradient vector points in the direction of the largest possible intensity increase, and the magnitude corresponds to the rate of change in that direction. For an image f(x,y) with derivatives fx and fy in the x and y directions, the gradient magnitude and direction are given by

magnitude = √(fx² + fy²),   direction θ = atan2(fy, fx)

Thus in simple words, image gradient in x-direction measures the horizontal change in intensity while the gradient in y measures the vertical change in intensity.

Since edges are an abrupt change in intensity values, the largest gradient values will occur across an edge (neglecting noise) in an image. Thus, the x-gradient will find the vertical edges while the y-gradient will highlight the horizontal edges, as shown below.

Thus, we may conclude that edges are perpendicular to the direction of the largest gradient. That’s why gradients are used in edge detection.

In the next blog, we will discuss different edge detection filters. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.