Tag Archives: image processing

Optimum Global Thresholding using Otsu’s Method

In the previous blog, we discussed global thresholding and how to find the global threshold using the iterative approach. In this blog, we will discuss Otsu’s method, named after Nobuyuki Otsu, that automatically finds the global threshold. So, let’s discuss this method in detail.

Note: This method assumes that the image histogram is bimodal and a reasonable contrast ratio exists between the background and the region of interest.

In simple terms, Otsu’s method tries to find a threshold value which minimizes the weighted within-class variance. Since Variance is the spread of the distribution about the mean. Thus, minimizing the within-class variance will tend to make the classes compact.

Let’s say we threshold a histogram at a value “t”. This produces two regions – left and right of “t” whose variance is given by σ²₀ and σ²₁. Then the weighted within-class variance is given by

where w₀(t) and w₁(t) are the weights given to each class. Weights are total pixels in a thresholded region (left or right) divided by the total image pixels. Let’s take a simple example to understand how to calculate these.

Suppose we have the following histogram and we want to find the weighted within-class variance corresponding to threshold value 1.

Below are the weights and the variances calculated for left and the right regions obtained after thresholding at value 1.

Similarly, we will iterate over all the possible threshold values, calculate the weighted within-class variance for each of the thresholds. The optimum threshold will be the one with the minimum within-class variance.

Now, let’s see how to do this using python.

import cv2
import matplotlib.pyplot as plt
import numpy as np

# Create a sample image
np.random.seed(7)
img = np.random.normal(40,10,size=(500,500)).astype('uint8')
img[img>100]=40
img[100:400,100:400] = np.random.normal(150,20,size=(300,300)).astype('uint8')

# plot the histogram
hist = plt.hist(img.ravel(),256,[0,256])
plt.show()

import cv2

import matplotlib.pyplot as plt

import numpy as np

# Create a sample image

np.random.seed(7)

img = np.random.normal(40,10,size=(500,500)).astype('uint8')

img[img>100]=40

img[100:400,100:400] = np.random.normal(150,20,size=(300,300)).astype('uint8')

# plot the histogram

hist = plt.hist(img.ravel(),256,[0,256])

plt.show()

The image histogram is shown below

Now, let’s calculate the within-class variance using the steps which we discussed earlier.

# Set minimum value to infinity
final_min = np.inf
# total pixels in an image
total = np.sum(hist[0])
for i in range(256):
    # Split regions based on threshold
    left, right = np.hsplit(hist[0],[i])
    # Splt intensity values based on threshold
    left_bins, right_bins = np.hsplit(hist[1],[i])
    # Only perform thresholding if neither side empty
    if np.sum(left) !=0 and np.sum(right) !=0:
        # Calculate weights on left and right sides
        w_0 = np.sum(left)/total
        w_1 = np.sum(right)/total
        # Calculate the mean for both sides
        mean_0 = np.dot(left,left_bins)/np.sum(left)
        mean_1 = np.dot(right,right_bins[:-1])/np.sum(right)  # right_bins[:-1] because matplotlib has uses 1 bin extra
        # Calculate variance of both sides
        var_0 = np.dot(((left_bins-mean_0)**2),left)/np.sum(left)
        var_1 = np.dot(((right_bins[:-1]-mean_1)**2),right)/np.sum(right)
        # Calculate final within class variance
        final = w_0*var_0 + w_1*var_1
        # if variance minimum, update it
        if final<final_min:
            final_min = final
            thresh = i
        
print(thresh) # 95

# Set minimum value to infinity

final_min = np.inf

# total pixels in an image

total = np.sum(hist[0])

for i in range(256):

# Split regions based on threshold

left, right = np.hsplit(hist[0],[i])

# Splt intensity values based on threshold

left_bins, right_bins = np.hsplit(hist[1],[i])

# Only perform thresholding if neither side empty

if np.sum(left) !=0 and np.sum(right) !=0:

# Calculate weights on left and right sides

w_0 = np.sum(left)/total

w_1 = np.sum(right)/total

# Calculate the mean for both sides

mean_0 = np.dot(left,left_bins)/np.sum(left)

mean_1 = np.dot(right,right_bins[:-1])/np.sum(right) # right_bins[:-1] because matplotlib has uses 1 bin extra

# Calculate variance of both sides

var_0 = np.dot(((left_bins-mean_0)**2),left)/np.sum(left)

var_1 = np.dot(((right_bins[:-1]-mean_1)**2),right)/np.sum(right)

# Calculate final within class variance

final = w_0*var_0 + w_1*var_1

# if variance minimum, update it

if final<final_min:

final_min = final

thresh = i

print(thresh) # 95

The gif below shows how the within-class variance (blue dots) varies with the threshold value for the above histogram. The optimum threshold value is the one where the within-class variance is minimum.

OpenCV also provides a builtin function to calculate the threshold using this method.

OpenCV

You just need to pass an extra flag, cv2.THRESH_OTSU in the cv2.threshold() function which we discussed in the previous blog. The optimum threshold value will be returned by this along with the thresholded image. Let’s see how to use this.

gray = cv2.imread('kang.jpg',0)
retval, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

1 2	gray = cv2.imread('kang.jpg',0) retval, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

A Faster Approach

We all know that minimizing within-class variance is equivalent to maximizing between-class variance. This maximization operation can be implemented recursively and is faster than the earlier method. The expression for between-class variance is given by

Below are the steps to calculate recursively between-class variance.

Calculate the histogram of the image.
Set up weights and means corresponding to the “0” threshold value.
Loop through all the threshold values
1. Update the weights and the mean
2. Calculate the between-class variance
The optimum threshold will be the one with the max variance.

Below is the code in Python that implements the above steps.

# Calculate the histogram
hist = plt.hist(img1.ravel(),256,[0,256])
# Total pixels in the image
total = np.sum(hist[0])
# calculate the initial weights and the means
left, right = np.hsplit(hist[0],[0])
left_bins, right_bins = np.hsplit(hist[1],[0])
# left weights
w_0 = 0.0
# Right weights
w_1 = np.sum(right)/total
# Left mean
mean_0 = 0.0
weighted_sum_0 = 0.0
# Right mean
weighted_sum_1 = np.dot(right,right_bins[:-1])
mean_1 = weighted_sum_1/np.sum(right)
def recursive_otsu1(hist, w_0=w_0, w_1=w_1, weighted_sum_0=weighted_sum_0, weighted_sum_1=weighted_sum_1, thres=1, fn_max=-np.inf, thresh=0, total=total):
    if thres<=255:
        # To pass the division by zero warning
        if np.sum(hist[0][:thres+1]) !=0 and np.sum(hist[0][thres+1:]) !=0:
            # Update the weights
            w_0 += hist[0][thres]/total
            w_1 -= hist[0][thres]/total
            # Update the mean
            weighted_sum_0 += (hist[0][thres]*hist[1][thres])
            mean_0 = weighted_sum_0/np.sum(hist[0][:thres+1])
            weighted_sum_1 -= (hist[0][thres]*hist[1][thres])
            if thres == 255:
                mean_1 = 0.0
            else:
                mean_1 = weighted_sum_1/np.sum(hist[0][thres+1:])
            # Calculate the between-class variance
            out = w_0*w_1*((mean_0-mean_1)**2)
            # # if variance maximum, update it
            if out>fn_max:
                fn_max = out
                thresh = thres
        return recursive_otsu1(hist, w_0=w_0, w_1=w_1, weighted_sum_0=weighted_sum_0, weighted_sum_1=weighted_sum_1, thres=thres+1, fn_max=fn_max, thresh=thresh, total=total)
    # Stopping condition
    else:
        return fn_max,thresh
    
    
# Check the results
var_value, thresh_value = recursive_otsu1(hist, w_0=w_0, w_1=w_1, weighted_sum_0=weighted_sum_0, weighted_sum_1=weighted_sum_1, thres=1, fn_max=-np.inf, thresh=0, total=total)
print(var_value, thresh_value)

# Calculate the histogram

hist = plt.hist(img1.ravel(),256,[0,256])

# Total pixels in the image

total = np.sum(hist[0])

# calculate the initial weights and the means

left, right = np.hsplit(hist[0],[0])

left_bins, right_bins = np.hsplit(hist[1],[0])

# left weights

w_0 = 0.0

# Right weights

w_1 = np.sum(right)/total

# Left mean

mean_0 = 0.0

weighted_sum_0 = 0.0

# Right mean

weighted_sum_1 = np.dot(right,right_bins[:-1])

mean_1 = weighted_sum_1/np.sum(right)

def recursive_otsu1(hist, w_0=w_0, w_1=w_1, weighted_sum_0=weighted_sum_0, weighted_sum_1=weighted_sum_1, thres=1, fn_max=-np.inf, thresh=0, total=total):

if thres<=255:

# To pass the division by zero warning

if np.sum(hist[0][:thres+1]) !=0 and np.sum(hist[0][thres+1:]) !=0:

# Update the weights

w_0 += hist[0][thres]/total

w_1 -= hist[0][thres]/total

# Update the mean

weighted_sum_0 += (hist[0][thres]*hist[1][thres])

mean_0 = weighted_sum_0/np.sum(hist[0][:thres+1])

weighted_sum_1 -= (hist[0][thres]*hist[1][thres])

if thres == 255:

mean_1 = 0.0

else:

mean_1 = weighted_sum_1/np.sum(hist[0][thres+1:])

# Calculate the between-class variance

out = w_0*w_1*((mean_0-mean_1)**2)

# # if variance maximum, update it

if out>fn_max:

fn_max = out

thresh = thres

return recursive_otsu1(hist, w_0=w_0, w_1=w_1, weighted_sum_0=weighted_sum_0, weighted_sum_1=weighted_sum_1, thres=thres+1, fn_max=fn_max, thresh=thresh, total=total)

# Stopping condition

else:

return fn_max,thresh

# Check the results

var_value, thresh_value = recursive_otsu1(hist, w_0=w_0, w_1=w_1, weighted_sum_0=weighted_sum_0, weighted_sum_1=weighted_sum_1, thres=1, fn_max=-np.inf, thresh=0, total=total)

print(var_value, thresh_value)

This is how you can implement otsu’s method recursively if you consider maximizing between-class variance. Now, let’s discuss what are the limitations of this method.

Limitations

Otsu’s method is only guaranteed to work when

The histogram should be bimodal.
Reasonable contrast ratio exists between the background and the roi.
Uniform lighting conditions are there.
Image is not affected by noise.
Size of the background and the roi should be comparable.

There are many modifications done to the original Otsu’s algorithm to address these limitations such as two-dimensional Otsu’s method etc. We will discuss some of these modifications in the following blogs.

In the following blogs, we will also discuss how to counter these limitations so as to get satisfactory results with otsu’s method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Improving Global Thresholding

Leave a reply

In the previous blog, we discussed otsu’s method for automatic image thresholding. Then we also discussed the limitations of the otsu’s method. In this blog, we will discuss how to handle these limitations so as to produce satisfactory thresholding results. So, let’s get started.

Case-1: When the noise is present in the image

If the noise is present in the image, then this tends to change the modality of the histogram. The sharp valleys between the peaks of the bimodal histogram start degrading. In that case, the otsu’s method or any other global thresholding method will fail. So, in order to find the global threshold, one should first remove the noise using any smoothing filters like Gaussian, etc. and then apply any automatic thresholding method like otsu, etc.

Case-2: When the object area is small compared to the background area

In this case, the image histogram will be dominated by a large background area. This will increase the probability of any pixel belonging to the background. So, the histogram will no longer exhibit bimodality and thus otsu will result in segmentation error. To prevent this, one should only consider pixels that lie on or near the edges between the objects and the background. Doing so will result in an image histogram with peaks of approximately the same size. Then we can apply any automatic thresholding method like otsu, etc. Below are the steps to implement the above procedure.

Calculate the edge image using any high pass filter like Sobel, Laplacian, etc.
Select any threshold value (T).
Threshold the above edge image to produce a binary mask.
Apply the mask image on the input image using any bitwise operations or any other method.
This results in only those pixels where the mask image was white.
Compute the histogram of only those pixels
Finally, apply any automatic global thresholding method like otsu, etc.

Case-3: When the image is taken under non-uniform illumination conditions

In this case, the histogram no longer remains bimodal and thus we will not be able to segment the image satisfactorily. One of the simplest approaches is to subdivide the image into non-overlapping images/rectangles. The size of these rectangles is chosen such that the illumination is nearly constant in each of these rectangles. Then we will apply any global thresholding technique like otsu for each of these rectangles.

The above procedure only works when the size of the object and the background are comparable in the rectangle. This is quite intuitive as only then we will have a bimodal histogram. Taking care of the background and the object sizes in each rectangle is a tedious task.

So, in the next blog, we will discuss adaptive thresholding that works pretty well for the above conditions. That’s all for this blog. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Global Thresholding

Leave a reply

In the previous blog, we discussed image thresholding and when to use this for image segmentation. We also learned that thresholding can be global or adaptive depending upon how the threshold value is selected.

In this blog, we will discuss

global thresholding
OpenCV function for global thresholding
How to choose threshold value using the iterative algorithm

In global thresholding, each pixel value in the image is compared with a single (global) threshold value. Below is the code for this.

def global_threshold(image, thres_value, val_high, val_low):
    img = image.copy()
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            if image[i,j] > thres_value:
                img[i,j] = val_high
            else:
                img[i,j] = val_low
    return img

def global_threshold(image, thres_value, val_high, val_low):

img = image.copy()

for i in range(image.shape[0]):

for j in range(image.shape[1]):

if image[i,j] > thres_value:

img[i,j] = val_high

else:

img[i,j] = val_low

return img

Here, we assign a value of “val_high” to all the pixels greater than the threshold otherwise “val_low“. OpenCV also provides a builtin function for thresholding the image. So, let’s take a look at that function.

OpenCV

cv2.threshold(src, thresh, maxval, type) → retval, dst

1	cv2.threshold(src, thresh, maxval, type) → retval, dst

This function returns the thresholded image(dst) and the threshold value(retval). Its arguments are

src: input greyscale image (8-bit or 32-bit floating point)
thresh: global threshold value
type: Different types that decide “val_high” and “val_low“. In other words, these types decide what value to assign for pixels greater than and less than the threshold. Below figure shows different thresholding types available.
maxval: maximum value to be used with THRESH_BINARY and THRESH_BINARY_INV. Check the below image.

To specify the thresholding type, write “cv2.” as the prefix. For instance, write cv2.THRESH_BINARY if you want to use this type. Let’s take an example

import cv2
# Load an image in the greyscale
img = cv2.imread('D:/downloads/opencv_logo1.PNG',cv2.IMREAD_GRAYSCALE)
# threshold the image
ret, thresh = cv2.threshold(img,75,255,cv2.THRESH_BINARY)

import cv2

# Load an image in the greyscale

img = cv2.imread('D:/downloads/opencv_logo1.PNG',cv2.IMREAD_GRAYSCALE)

# threshold the image

ret, thresh = cv2.threshold(img,75,255,cv2.THRESH_BINARY)

Similarly, you can apply other thresholding types to check how they work. Till now we discussed how to threshold an image using a global threshold value. But we didn’t discuss how to get this threshold value. So, in the next section, let’s discuss this.

How to choose the threshold value?

As already discussed, that global thresholding is a suitable approach only when intensity distributions of the background and the ROI are sufficiently distinct. In other words, there is a clear valley between the peaks of the histogram. We can easily select the threshold value in that situation. But what if we have a number of images. In that case, we don’t manually want to first check the image histogram and then deciding the threshold value. We want something that can automatically estimate the threshold value for each image. Below is the algorithm that can be used for this purpose.

Below is the code for the above algorithm.

def thres_finder(img, thres=20,delta_T=1.0):
    
    # Step-2: Divide the images in two parts
    x_low, y_low = np.where(img<=thres)
    x_high, y_high = np.where(img>thres)
    
    # Step-3: Find the mean of two parts
    mean_low = np.mean(img[x_low,y_low])
    mean_high = np.mean(img[x_high,y_high])
    
    # Step-4: Calculate the new threshold
    new_thres = (mean_low + mean_high)/2
    
    # Step-5: Stopping criteria, otherwise iterate
    if abs(new_thres-thres)< delta_T:
        return new_thres
    else:
        return thres_finder(img, thres=new_thres,delta_T=1.0)

def thres_finder(img, thres=20,delta_T=1.0):

# Step-2: Divide the images in two parts

x_low, y_low = np.where(img<=thres)

x_high, y_high = np.where(img>thres)

# Step-3: Find the mean of two parts

mean_low = np.mean(img[x_low,y_low])

mean_high = np.mean(img[x_high,y_high])

# Step-4: Calculate the new threshold

new_thres = (mean_low + mean_high)/2

# Step-5: Stopping criteria, otherwise iterate

if abs(new_thres-thres)< delta_T:

return new_thres

else:

return thres_finder(img, thres=new_thres,delta_T=1.0)

Now, let’s take an example to check how’s this working.

import cv2
# Load an image in the greyscale
img = cv2.imread('D:/downloads/opencv_logo1.PNG',cv2.IMREAD_GRAYSCALE)
# apply threshold finder
vv1 = thres_finder1(img, thres=30,delta_T=1.0)
# threshold the image
ret, thresh = cv2.threshold(img,vv1,255,cv2.THRESH_BINARY)
# Display the image side by side
out = cv2.hconcat([img,thresh])
cv2.imshow('threshold',out)
cv2.waitKey(0)

import cv2

# Load an image in the greyscale

img = cv2.imread('D:/downloads/opencv_logo1.PNG',cv2.IMREAD_GRAYSCALE)

# apply threshold finder

vv1 = thres_finder1(img, thres=30,delta_T=1.0)

# threshold the image

ret, thresh = cv2.threshold(img,vv1,255,cv2.THRESH_BINARY)

# Display the image side by side

out = cv2.hconcat([img,thresh])

cv2.imshow('threshold',out)

cv2.waitKey(0)

That’s all for this blog. In the next blog, we will discuss how to perform optimum global thresholding using Otsu’s method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Thresholding

Leave a reply

Image Segmentation is the process of subdividing an image into its constituent regions or objects. In many computer vision applications, image segmentation is very useful to detect the region of interest. For instance, in medical imaging where we have to locate tumors, or in object detection like self-driving cars have to detect pedestrians, traffic signals, etc or for video surveillance, etc. There are a number of methods available to perform image segmentation. For instance, thresholding, clustering methods, graph partitioning methods, and convolutional methods to mention a few.

In this blog, we will discuss Image Thresholding which is one of the simplest methods for image segmentation. In this, we partition the images directly into regions based on the intensity values. So, let’s discuss image thresholding in greater detail.

Concept

If the pixel value is greater than a threshold value, it is assigned one value (maybe white), else it is assigned another value (maybe black).

In other words, if f(x,y) is the input image then the segmented image g(x,y) is given by

If the threshold value T remains constant over the entire image, then this is known as global thresholding. When the value of T changes over the entire image or depends upon the pixel neighborhood, then this is known as adaptive thresholding. We will cover both these types in greater detail in the following blogs.

Applicability Condition

Thresholding is only guaranteed to work when a good contrast ratio between the region of interest and the background exists. Otherwise, the thresholding will not be able to fully detect the region of interest. Let’s understand this by an example.

Suppose we have two images from which we want to segment the square region (our region of interest) from the background.

Let’s plot the histogram of these two images.

Clearly as expected for “A“, the histogram is showing two peaks corresponding to the square and the background. The separation between the peaks shows that the background and ROI have a good contrast ratio. By choosing a threshold value between the peaks, we will be able to segment out the ROI. While for “B”, the intensity distribution of the ROI and the background is not that distinct. Thus we may not be able to fully segment the ROI.

Thresholded images are shown below (How to choose a threshold value will be discussed in the next blog).

So, always plot the image histogram to check the contrast ratio between the background and the ROI. Only if the contrast ratio is good, choose the thresholding method for image segmentation. Otherwise, look for other methods.

In the next blog, we will discuss global thresholding and how to choose the threshold value using the iterative method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Laplacian of Gaussian (LoG)

2 Replies

In the previous blog, we discuss various first-order derivative filters. In this blog, we will discuss the Laplacian of Gaussian (LoG), a second-order derivative filter. So, let’s get started

Mathematically, the Laplacian is defined as

Unlike first-order filters that detect the edges based on local maxima or minima, Laplacian detects the edges at zero crossings i.e. where the value changes from negative to positive and vice-versa.

Let’s obtain kernels for Laplacian similar to how we obtained kernels using finite difference approximations for the first-order derivative.

Adding these two kernels together we obtain the Laplacian kernel as shown below

This is called a negative Laplacian because the central peak is negative. Other variants of Laplacian can be obtained by weighing the pixels in the diagonal directions also. Make sure that the sum of all kernel elements is zero so that the filter gives zero response in the homogeneous regions.

Let’s now discuss some properties of the Laplacian

Unlike first-order that requires two masks for finding edges, Laplacian uses 1 mask but the edge orientation information is lost in Laplacian.
Laplacian gives better edge localization as compared to first-order.
Unlike first-order, Laplacian is an isotropic filter i.e. it produces a uniform edge magnitude for all directions.
Similar to first-order, Laplacian is also very sensitive to noise

To reduce the noise effect, image is first smoothed with a Gaussian filter and then we find the zero crossings using Laplacian. This two-step process is called the Laplacian of Gaussian (LoG) operation.

But this can also be performed in one step. Instead of first smoothing an image with a Gaussian kernel and then taking its Laplace, we can obtain the Laplacian of the Gaussian kernel and then convolve it with the image. This is shown below where f is the image and g is the Gaussian kernel.

Now, let’s see how to obtain LoG kernel. Mathematically, LoG can be written as

The LoG kernel weights can be sampled from the above equation for a given standard deviation, just as we did in Gaussian Blurring. Just convolve the kernel with the image to obtain the desired result, as easy as that.

Select the size of the Gaussian kernel carefully. If LoG is used with small Gaussian kernel, the result can be noisy. If you use a large Gaussian kernel, you may get poor edge localization.

Now, let’s see how to do this using OpenCV-Python

OpenCV-Python

OpenCV provides a builtin function that calculates the Laplacian of an image. You can find it here. Below is the basic syntax of what this function looks like

cv2.Laplacian(src, ddepth[, ksize[, scale[, delta[, borderType]]]])
# src - input image
# ddepth - Desired depth of the destination image.
# ksize - kernel size

cv2.Laplacian(src, ddepth[, ksize[, scale[, delta[, borderType]]]])

# src - input image

# ddepth - Desired depth of the destination image.

# ksize - kernel size

Steps for LoG:

Apply LoG on the image. This can be done in two ways:
- First, apply Gaussian and then Laplacian or
- Convolve the image with LoG kernel directly
Find the zero crossings in the image
Threshold the zero crossings to extract only the strong edges.

Let’s understand each step through code

import cv2
import numpy as np

# Load the image in greyscale
img = cv2.imread('D:/downloads/clouds.jpg',0)

# Apply Gaussian Blur
blur = cv2.GaussianBlur(img,(3,3),0)

# Apply Laplacian operator in some higher datatype
laplacian = cv2.Laplacian(blur,cv2.CV_64F)

import cv2

import numpy as np

# Load the image in greyscale

img = cv2.imread('D:/downloads/clouds.jpg',0)

# Apply Gaussian Blur

blur = cv2.GaussianBlur(img,(3,3),0)

# Apply Laplacian operator in some higher datatype

laplacian = cv2.Laplacian(blur,cv2.CV_64F)

Since zero crossings is a change from negative to positive and vice-versa, so an approximate way is to clip the negative values to find the zero crossings.

# But this tends to localize the edge towards the brighter side.
laplacian1 = laplacian/laplacian.max()

cv2.imshow('a7',laplacian1)
cv2.waitKey(0)

# But this tends to localize the edge towards the brighter side.

laplacian1 = laplacian/laplacian.max()

cv2.imshow('a7',laplacian1)

cv2.waitKey(0)

Another way is to check each pixel for zero crossing as shown below

def Zero_crossing(image):
    z_c_image = np.zeros(image.shape)
    
    # For each pixel, count the number of positive
    # and negative pixels in the neighborhood
    
    for i in range(1, image.shape[0] - 1):
        for j in range(1, image.shape[1] - 1):
            negative_count = 0
            positive_count = 0
            neighbour = [image[i+1, j-1],image[i+1, j],image[i+1, j+1],image[i, j-1],image[i, j+1],image[i-1, j-1],image[i-1, j],image[i-1, j+1]]
            d = max(neighbour)
            e = min(neighbour)
            for h in neighbour:
                if h>0:
                    positive_count += 1
                elif h<0:
                    negative_count += 1


            # If both negative and positive values exist in 
            # the pixel neighborhood, then that pixel is a 
            # potential zero crossing
            
            z_c = ((negative_count > 0) and (positive_count > 0))
            
            # Change the pixel value with the maximum neighborhood
            # difference with the pixel

            if z_c:
                if image[i,j]>0:
                    z_c_image[i, j] = image[i,j] + np.abs(e)
                elif image[i,j]<0:
                    z_c_image[i, j] = np.abs(image[i,j]) + d
                
    # Normalize and change datatype to 'uint8' (optional)
    z_c_norm = z_c_image/z_c_image.max()*255
    z_c_image = np.uint8(z_c_norm)

    return z_c_image

def Zero_crossing(image):

z_c_image = np.zeros(image.shape)

# For each pixel, count the number of positive

# and negative pixels in the neighborhood

for i in range(1, image.shape[0] - 1):

for j in range(1, image.shape[1] - 1):

negative_count = 0

positive_count = 0

neighbour = [image[i+1, j-1],image[i+1, j],image[i+1, j+1],image[i, j-1],image[i, j+1],image[i-1, j-1],image[i-1, j],image[i-1, j+1]]

d = max(neighbour)

e = min(neighbour)

for h in neighbour:

if h>0:

positive_count += 1

elif h<0:

negative_count += 1

# If both negative and positive values exist in

# the pixel neighborhood, then that pixel is a

# potential zero crossing

z_c = ((negative_count > 0) and (positive_count > 0))

# Change the pixel value with the maximum neighborhood

# difference with the pixel

if z_c:

if image[i,j]>0:

z_c_image[i, j] = image[i,j] + np.abs(e)

elif image[i,j]<0:

z_c_image[i, j] = np.abs(image[i,j]) + d

# Normalize and change datatype to 'uint8' (optional)

z_c_norm = z_c_image/z_c_image.max()*255

z_c_image = np.uint8(z_c_norm)

return z_c_image

Depending upon the image you may need to apply thresholding and median blurring to suppress the noise.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

First-order Derivative kernels for Edge Detection

Leave a reply

In the previous blog, we briefly discussed that an edge can be detected by

First derivative (local maximum or minimum)
Second derivative (zero crossings)

In this blog, let’s discuss in detail how we can detect edges using the first order derivative.

Remember that derivatives only exists for continuous functions but the image is a discrete 2D light intensity function. Thus in the last blog, we approximated the image gradients using finite approximation as

For the edge detection case, we will prefer the central difference as shown above. Using this central difference, we can obtain the derivative filter in x and y directions as shown below

Here, we have assumed that the x-coordinate is increasing in the “right”-direction, and y-coordinate in the “down”-direction. By weighting these x and y derivatives, we can obtain different edge detection filters. Let’s see how

1. Sobel Operator

This is obtained by multiplying the x, and y-derivative filters obtained above with some smoothing filter(1D) in the other direction. For example, a 3×3 Sobel-x and Sobel-y filter can be obtained as

As we know that the Gaussian filter is used for blurring thus, the Sobel operator computes the gradient with smoothing. Thus this is less sensitive to noise. Because of separability property of the kernel, the Sobel operator is computationally efficient.

Note: Using larger Sobel kernels leads to more edge blurring, thus some form of edge thinning must be applied to counter this.

When we convolve these Sobel operators with the image, they estimate the gradients in the x, and y-directions(say G_x and G_y). For each point, we can calculate the gradient magnitude and direction as

We can easily infer that the edge direction or the angle will be positive for the transition from dark to white and negative otherwise. Now, let’s see how to do this using OpenCV-Python

OpenCV has a builtin function that calculates the image derivatives using the Sobel operator. Its basic syntax is shown below. You can read more about it here.

cv2.Sobel(src, ddepth, dx, dy[, ksize[, scale[, delta[, borderType]]]]])
# src - input image
# ddepth - depth of the output image
# dx and dy specify whether Sobel-x or Sobel -y is to be used
# ksize - kernel size

cv2.Sobel(src, ddepth, dx, dy[, ksize[, scale[, delta[, borderType]]]]])

# src - input image

# ddepth - depth of the output image

# dx and dy specify whether Sobel-x or Sobel -y is to be used

# ksize - kernel size

Note: Earlier we have assumed that white to black transition yields negative values. Thus if our output datatype is cv2.CV_8U or np.uint8, this will make all negative values 0. To prevent this, we specify the output datatype to some higher forms, like cv2.CV_16S, cv2.CV_64F etc, take its absolute value and then convert back to cv2.CV_8U.

import cv2
import numpy as np

# Create a image with black background and white circle
img = np.zeros((500,500),dtype='uint8') 
cv2.circle(img,(250,250), 150, (255,255,255), -1)

# Output dtype = cv2.CV_8U
sob_8u = cv2.Sobel(img,cv2.CV_8U,1,0,ksize=3)

# Output dtype = cv2.CV_64F. Then take its absolute and convert to cv2.CV_8U
sobel_64 = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=3)
abs_64 = np.absolute(sobel_64)
sobel_8u = np.uint8(abs_64)

cv2.imshow('a',sob_8u)
cv2.imshow('a1',sobel_8u)
cv2.waitKey(0)

import cv2

import numpy as np

# Create a image with black background and white circle

img = np.zeros((500,500),dtype='uint8')

cv2.circle(img,(250,250), 150, (255,255,255), -1)

# Output dtype = cv2.CV_8U

sob_8u = cv2.Sobel(img,cv2.CV_8U,1,0,ksize=3)

# Output dtype = cv2.CV_64F. Then take its absolute and convert to cv2.CV_8U

sobel_64 = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=3)

abs_64 = np.absolute(sobel_64)

sobel_8u = np.uint8(abs_64)

cv2.imshow('a',sob_8u)

cv2.imshow('a1',sobel_8u)

cv2.waitKey(0)

The output is shown below

2. Scharr Operator

This operator tries to achieve the perfect rotational symmetry. The 3×3 Scharr filter is shown below

OpenCV provides a builtin function for this

cv2.Scharr(src, ddepth, dx, dy[, scale[, delta[, borderType]]]])

1	cv2.Scharr(src, ddepth, dx, dy[, scale[, delta[, borderType]]]])

3. Prewitt Operator

In this, the x, and y-derivative filters are weighted with the standard averaging filter as shown below

Here, we discussed only the most common filters. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Canny Edge Detector

1 Reply

In this blog, we will discuss one of the most popular algorithms for edge detection known as Canny Edge detection. It was developed by John F. Canny in 1986. It is a multi-stage algorithm that provides good and reliable detection. So, let’s discuss the main steps used in the Canny Edge detection algorithm using OpenCV-Python.

1. Noise Reduction

An edge detector is a high pass filter that enhances the high-frequency component and suppresses the low ones. Since both edges and noise are high-frequency components, the edge detectors tend to amplify the noise. To prevent this, we smooth the image with a low-pass filter. Canny uses a Gaussian filter for this.

Below is the code for this using OpenCV-Python

import cv2
import numpy as np

img = cv2.imread('D:/downloads/child.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Gaussian Blurring
blur = cv2.GaussianBlur(gray,(5,5),0)

import cv2

import numpy as np

img = cv2.imread('D:/downloads/child.jpg')

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Gaussian Blurring

blur = cv2.GaussianBlur(gray,(5,5),0)

A larger filter reduces noise but worsens edge localization and vice-versa. Generally, 5×5 is a good choice but this may vary from image to image.

2. Finding Intensity Gradient of the Image

Next step is to find the edges using a Sobel operator. Sobel finds the gradients in both horizontal(Gx) and vertical(Gy) direction. Since edges are perpendicular to the gradient direction, using these gradients we can find the edge gradient and direction for each pixel as:

Below is the code for this using OpenCV-Python (Here, I’ve converted everything to 8-bit, it’s optional you can use any output datatype)

# Apply Sobelx in high output datatype 'float32'
# and then converting back to 8-bit to prevent overflow
sobelx_64 = cv2.Sobel(blur,cv2.CV_32F,1,0,ksize=3)
absx_64 = np.absolute(sobelx_64)
sobelx_8u1 = absx_64/absx_64.max()*255
sobelx_8u = np.uint8(sobelx_8u1)

# Similarly for Sobely
sobely_64 = cv2.Sobel(blur,cv2.CV_32F,0,1,ksize=3)
absy_64 = np.absolute(sobely_64)
sobely_8u1 = absy_64/absy_64.max()*255
sobely_8u = np.uint8(sobely_8u1)

# From gradients calculate the magnitude and changing
# it to 8-bit (Optional)
mag = np.hypot(sobelx_8u, sobely_8u)
mag = mag/mag.max()*255
mag = np.uint8(mag)

# Find the direction and change it to degree
theta = np.arctan2(sobely_64, sobelx_64)
angle = np.rad2deg(theta)

# Apply Sobelx in high output datatype 'float32'

# and then converting back to 8-bit to prevent overflow

sobelx_64 = cv2.Sobel(blur,cv2.CV_32F,1,0,ksize=3)

absx_64 = np.absolute(sobelx_64)

sobelx_8u1 = absx_64/absx_64.max()*255

sobelx_8u = np.uint8(sobelx_8u1)

# Similarly for Sobely

sobely_64 = cv2.Sobel(blur,cv2.CV_32F,0,1,ksize=3)

absy_64 = np.absolute(sobely_64)

sobely_8u1 = absy_64/absy_64.max()*255

sobely_8u = np.uint8(sobely_8u1)

# From gradients calculate the magnitude and changing

# it to 8-bit (Optional)

mag = np.hypot(sobelx_8u, sobely_8u)

mag = mag/mag.max()*255

mag = np.uint8(mag)

# Find the direction and change it to degree

theta = np.arctan2(sobely_64, sobelx_64)

angle = np.rad2deg(theta)

Clearly, we can see that the edges are still quite blurred or thick. Remember that an edge detector should output only one accurate response corresponding to the edge. Thus we need to thin the edges or in other words find the largest edge. This is done using Non-max Suppression.

3. Non-Max Suppression

This is an edge thinning technique. In this, for each pixel, we check if it is a local maximum in its neighborhood in the direction of gradient or not. If it is a local maximum it is retained as an edge pixel, otherwise suppressed.

For each pixel, the neighboring pixels are located in horizontal, vertical, and diagonal directions (0°, 45°, 90°, and 135°). Thus we need to round off the gradient direction at every pixel to one of these directions as shown below.

After rounding, we will compare every pixel value against the two neighboring pixels in the gradient direction. If that pixel is a local maximum, it is retained as an edge pixel otherwise suppressed. This way only the largest responses will be left.

Let’s see an example

Suppose for a pixel ‘A’, the gradient direction comes out to be 17 degrees. Since 17 is nearer to 0, we will round it to 0 degrees. Then we select neighboring pixels in the rounded gradient direction (See B and C in below figure). If the intensity value of A is greater than that of B and C, it is retained as an edge pixel otherwise suppressed.

Let’s see how to do this using OpenCV-Python

# Find the neighbouring pixels (b,c) in the rounded gradient direction
# and then apply non-max suppression
M, N = mag.shape
Non_max = np.zeros((M,N), dtype= np.uint8)

for i in range(1,M-1):
    for j in range(1,N-1):
       # Horizontal 0
        if (0 <= angle[i,j] < 22.5) or (157.5 <= angle[i,j] <= 180) or (-22.5 <= angle[i,j] < 0) or (-180 <= angle[i,j] < -157.5):
            b = mag[i, j+1]
            c = mag[i, j-1]
        # Diagonal 45
        elif (22.5 <= angle[i,j] < 67.5) or (-157.5 <= angle[i,j] < -112.5):
            b = mag[i+1, j+1]
            c = mag[i-1, j-1]
        # Vertical 90
        elif (67.5 <= angle[i,j] < 112.5) or (-112.5 <= angle[i,j] < -67.5):
            b = mag[i+1, j]
            c = mag[i-1, j]
        # Diagonal 135
        elif (112.5 <= angle[i,j] < 157.5) or (-67.5 <= angle[i,j] < -22.5):
            b = mag[i+1, j-1]
            c = mag[i-1, j+1]           
            
        # Non-max Suppression
        if (mag[i,j] >= b) and (mag[i,j] >= c):
            Non_max[i,j] = mag[i,j]
        else:
            Non_max[i,j] = 0

# Find the neighbouring pixels (b,c) in the rounded gradient direction

# and then apply non-max suppression

M, N = mag.shape

Non_max = np.zeros((M,N), dtype= np.uint8)

for i in range(1,M-1):

for j in range(1,N-1):

# Horizontal 0

if (0 <= angle[i,j] < 22.5) or (157.5 <= angle[i,j] <= 180) or (-22.5 <= angle[i,j] < 0) or (-180 <= angle[i,j] < -157.5):

b = mag[i, j+1]

c = mag[i, j-1]

# Diagonal 45

elif (22.5 <= angle[i,j] < 67.5) or (-157.5 <= angle[i,j] < -112.5):

b = mag[i+1, j+1]

c = mag[i-1, j-1]

# Vertical 90

elif (67.5 <= angle[i,j] < 112.5) or (-112.5 <= angle[i,j] < -67.5):

b = mag[i+1, j]

c = mag[i-1, j]

# Diagonal 135

elif (112.5 <= angle[i,j] < 157.5) or (-67.5 <= angle[i,j] < -22.5):

b = mag[i+1, j-1]

c = mag[i-1, j+1]

# Non-max Suppression

if (mag[i,j] >= b) and (mag[i,j] >= c):

Non_max[i,j] = mag[i,j]

else:

Non_max[i,j] = 0

Clearly, we can see that the edges are thinned but some edges are more bright than others. The brighter ones can be considered as strong edges but the lighter ones can actually be edges or they can be because of noise.

4. Hysteresis Thresholding

Non-max suppression outputs a more accurate representation of real edges in an image. But you can see that some edges are more bright than others. The brighter ones can be considered as strong edges but the lighter ones can actually be edges or they can be because of noise. To solve the problem of “which edges are really edges and which are not” Canny uses the Hysteresis thresholding. In this, we set two thresholds ‘High’ and ‘Low’.

Any edges with intensity greater than ‘High’ are the sure edges.
Any edges with intensity less than ‘Low’ are sure to be non-edges.
The edges between ‘High’ and ‘Low’ thresholds are classified as edges only if they are connected to a sure edge otherwise discarded.

Let’s take an example to understand

Here, A and B are sure-edges as they are above ‘High’ threshold. Similarly, D is a sure non-edge. Both ‘E’ and ‘C’ are weak edges but since ‘C’ is connected to ‘B’ which is a sure edge, ‘C’ is also considered as a strong edge. Using the same logic ‘E’ is discarded. This way we will get only the strong edges in the image.

This is based on the assumption that the edges are long lines.

Below is the code using OpenCV-Python.

First set the thresholds and classify edges into strong, weak or non-edges.

# Set high and low threshold
highThreshold = 21
lowThreshold = 15

M, N = Non_max.shape
out = np.zeros((M,N), dtype= np.uint8)

# If edge intensity is greater than 'High' it is a sure-edge
# below 'low' threshold, it is a sure non-edge
strong_i, strong_j = np.where(Non_max >= highThreshold)
zeros_i, zeros_j = np.where(Non_max < lowThreshold)

# weak edges
weak_i, weak_j = np.where((Non_max <= highThreshold) & (Non_max >= lowThreshold))

# Set same intensity value for all edge pixels
out[strong_i, strong_j] = 255
out[zeros_i, zeros_j ] = 0
out[weak_i, weak_j] = 75

# Set high and low threshold

highThreshold = 21

lowThreshold = 15

M, N = Non_max.shape

out = np.zeros((M,N), dtype= np.uint8)

# If edge intensity is greater than 'High' it is a sure-edge

# below 'low' threshold, it is a sure non-edge

strong_i, strong_j = np.where(Non_max >= highThreshold)

zeros_i, zeros_j = np.where(Non_max < lowThreshold)

# weak edges

weak_i, weak_j = np.where((Non_max <= highThreshold) & (Non_max >= lowThreshold))

# Set same intensity value for all edge pixels

out[strong_i, strong_j] = 255

out[zeros_i, zeros_j ] = 0

out[weak_i, weak_j] = 75

For weak edges, if it is connected to a sure edge it will be considered as an edge otherwise suppressed.

M, N = out.shape
for i in range(1, M-1):
    for j in range(1, N-1):
        if (out[i,j] == 75):
            if 255 in [out[i+1, j-1],out[i+1, j],out[i+1, j+1],out[i, j-1],out[i, j+1],out[i-1, j-1],out[i-1, j],out[i-1, j+1]]:
                out[i, j] = 255
            else:
                out[i, j] = 0

M, N = out.shape

for i in range(1, M-1):

for j in range(1, N-1):

if (out[i,j] == 75):

if 255 in [out[i+1, j-1],out[i+1, j],out[i+1, j+1],out[i, j-1],out[i, j+1],out[i-1, j-1],out[i-1, j],out[i-1, j+1]]:

out[i, j] = 255

else:

out[i, j] = 0

OpenCV-Python

OpenCV provides a builtin function for performing Canny Edge detection

cv2.Canny(image, threshold1, threshold2[, apertureSize[, L2gradient]]])
# threshold1 and threshold2 are the High and Low threshold values
# apertureSize - Kernel size for the Sobel operator (Default is 3x3)
# L2gradient - whether to use L2norm for gradient magnitude calculation or not. Default is False that uses L1 norm.

cv2.Canny(image, threshold1, threshold2[, apertureSize[, L2gradient]]])

# threshold1 and threshold2 are the High and Low threshold values

# apertureSize - Kernel size for the Sobel operator (Default is 3x3)

# L2gradient - whether to use L2norm for gradient magnitude calculation or not. Default is False that uses L1 norm.

Let’s take an example

import cv2
img = cv2.imread('D:/downloads/child.jpg',0)
edges = cv2.Canny(img,100,200,L2gradient=True)

import cv2

img = cv2.imread('D:/downloads/child.jpg',0)

edges = cv2.Canny(img,100,200,L2gradient=True)

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Unsharp Masking and Highboost filtering

1 Reply

In this blog, we will learn how we can sharpen an image or perform edge enhancement using a smoothing filter. Let’s see how this is done

First, we blur the image. We know by smoothing an image we suppress most of the high-frequency components.
Then, we subtract this smoothed image from the original image(the resulting difference is known as a mask). Thus, the output image will have most of the high-frequency components that are blocked by the smoothing filter.
Adding this mask back to the original will enhance the high-frequency components.

Because we are using a blurred or unsharp image to create a mask this technique is known as Unsharp Masking.

Thus, unsharp masking first produces a mask m(x,y) as

where, f(x,y) is the original image and f_b(x,y) is the blurred version of the original image.

Then this mask is added back to the original image which results in enhancing the high-frequency components.

where k specifies what portion of the mask to be added. When k= 1 this is known as Unsharp masking. For k>1 we call this as high-boost filtering because we are boosting the high-frequency components by giving more weight to the masked (edge) image.

We can also write the above two equations into one as the weighted average of the original and the blurred image.

Note: Instead of subtracting the blurred image from the original, we can directly use a negative Laplacian filter to obtain the mask.

If the image contains noise, this method will not produce satisfactory results, like most of the other sharpening filters.

Let’s see how to do this using OpenCV-Python

OpenCV-Python

Since in the last equation we described unsharp masking as the weighted average of the original and the input image, we will simply use OpenCV cv2.addWeighted() function.

import cv2

# Load the image
image = cv2.imread("D:/downloads/kang.jpg")
# Blur the image
gauss = cv2.GaussianBlur(image, (7,7), 0)
# Apply Unsharp masking
unsharp_image = cv2.addWeighted(image, 2, gauss, -1, 0)

import cv2

# Load the image

image = cv2.imread("D:/downloads/kang.jpg")

# Blur the image

gauss = cv2.GaussianBlur(image, (7,7), 0)

# Apply Unsharp masking

unsharp_image = cv2.addWeighted(image, 2, gauss, -1, 0)

The output is shown below

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Difference of Gaussians (DoG)

Leave a reply

In the previous blog, we discussed Gaussian Blurring that uses Gaussian kernels for image smoothing. This is a low pass filtering technique that blocks high frequencies (like edges, noise, etc.). In this blog, we will see how we can use this Gaussian Blurring to highlight certain high-frequency parts in an image. Isn’t that interesting? So, let’s get started.

In Gaussian Blurring, we discussed how the standard deviation of the Gaussian affects the degree of smoothing. Roughly speaking, larger the standard deviation more will be the blurring or in other words more high frequency components will be suppressed.

Thus if we take 2 Gaussian kernels with different standard deviations, apply separately on the same image and subtract their corresponding responses, we will get an output that highlights certain high-frequency components based on the standard deviations used.

The logic is by blurring we remove some high-frequency components that represent noise, and by subtracting we remove some low-frequency components that correspond to the homogeneous areas in the image. All the remaining frequency components are assumed to be associated with the edges in the images. Thus, the Difference of Gaussian acts like a bandpass filter. Let’s take an example to understand this.

Suppose we have an image as shown below

Suppose we have 2 Gaussian kernels with standard deviation (σ₁ > σ₂). The kernel (with σ₁), when convolved with an image, will blur the high-frequency components more as compared to the other kernel. Subtracting these, we can recover the information that lies between the frequency range which is not suppressed or blurred.

Now, that we saw how this works, let’s also discuss where this is useful(its pros and cons) as compared to other edge detection methods we have discussed.

All the edge detection kernels which we discussed till now are quite good in edge detection but one downside is that they are highly susceptible to noise. Thus if the image contains a high degree of noise, Difference of Gaussian is the way to go. This is because we are actually doing blurring which reduces the effect of noise to a great extent.

One downside of this method is that the edges are not enhanced much as compared to other methods. The output image formed has lower contrast.

OpenCV-Python

Let’s see how to do this using OpenCV-Python

import cv2
img = cv2.imread('D:/downloads/opencv_logo.PNG')

# Apply 3x3 and 7x7 Gaussian blur
low_sigma = cv2.GaussianBlur(img,(3,3),0)
high_sigma = cv2.GaussianBlur(img,(5,5),0)

# Calculate the DoG by subtracting
dog = low_sigma - high_sigma

import cv2

img = cv2.imread('D:/downloads/opencv_logo.PNG')

# Apply 3x3 and 7x7 Gaussian blur

low_sigma = cv2.GaussianBlur(img,(3,3),0)

high_sigma = cv2.GaussianBlur(img,(5,5),0)

# Calculate the DoG by subtracting

dog = low_sigma - high_sigma

Note: Using the linear separable property of the Gaussian kernel we can speed up the entire algorithm.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Image Gradients

Leave a reply

In the previous blogs, we discussed different smoothing filters. Before moving forward, let’s first discuss Image Gradients which will be useful in edge detection, robust feature and texture matching. So, let’s first recall what a gradient is.

In mathematics, the term gradient of a function means how a function is changing wrt. its arguments or independent variables. The gradient term is more frequently used for multi-variable functions. For a single variable function, we refer to this as the slope.

The gradient of an N-variable function at each point is an N-D vector with the components given by the derivatives in the N-directions. e.g. for a 3-variable function (f(x,y,z)), the gradient, if it exists, is given by

Thus, the gradient provides two pieces of information – magnitude and direction. The direction of the gradient tells us the direction of greatest increase while the magnitude represents the rate of increase in that direction.

Because gradients are defined only for continuous functions and Image is a 2-d discrete function (F(x,y)). Thus we need to approximate the gradients and we do this using Finite differences. In this instead of h approaching 0 we assume h to be a fixed (non-zero) value.

Three forms of finite differences are commonly used: forward, backward and central.

But for calculating Image Gradients, we use the central difference to approximate gradients in x and y directions. Below example shows how to calculate the central difference in the x-direction for 200.

In the next blog, we will discuss how to derive different kernels such as Sobel, Prewitt, etc from this central difference formulae and then using convolution to approximate the image gradients.

Thus, at each image point, the gradient vector points in the direction of largest possible intensity increase, and the magnitude corresponds to the rate of change in that direction. Thus for an image f(x,y), the gradient direction and magnitude is given by

Thus in simple words, image gradient in x-direction measures the horizontal change in intensity while the gradient in y measures the vertical change in intensity.

Since edges are an abrupt change in the intensity values thus the largest gradient values will occur across an edge (neglecting noise) in an image. Thus, the x-gradient will find the vertical edges while y-gradient will highlight the horizontal edges as shown below.

Thus, we may conclude that edges are perpendicular to the gradient direction(largest). That’s why gradients are used in edge detection.

In the next blog, we will discuss different edge detection filters. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

TheAILearner

Mastering Artificial Intelligence

Tag Archives: image processing

Optimum Global Thresholding using Otsu’s Method

OpenCV

A Faster Approach

Limitations

Improving Global Thresholding

Case-1: When the noise is present in the image

Case-2: When the object area is small compared to the background area

Case-3: When the image is taken under non-uniform illumination conditions

Global Thresholding

OpenCV

How to choose the threshold value?

Image Thresholding

Concept

Applicability Condition

Laplacian of Gaussian (LoG)

OpenCV-Python

Steps for LoG:

First-order Derivative kernels for Edge Detection

1. Sobel Operator

2. Scharr Operator

3. Prewitt Operator

Canny Edge Detector

1. Noise Reduction

2. Finding Intensity Gradient of the Image

3. Non-Max Suppression

4. Hysteresis Thresholding

OpenCV-Python

Unsharp Masking and Highboost filtering

OpenCV-Python

Difference of Gaussians (DoG)

OpenCV-Python

Understanding Image Gradients