Tag Archives: opencv python

Understanding Geometric Transformation: Translation using OpenCV-Python

In this blog, we will discuss image translation one of the most basic geometric transformations, that is performed on images. So, let’s get started.

Translation is simply the shifting of object location. Suppose we have a point P(x,y) which is translated by (t_x, t_y), then the coordinates after translation denoted by P'(x’,y’) are given by

So, we just need to create the transformation matrix (M) and then we can translate any point as shown above. That’s the basic idea behind translation. So, let’s first discuss how to do image translation using numpy for better understanding, and then we will see a more sophisticated implementation using OpenCV.

Numpy

First, let’s create the transformation matrix (M). This can be easily done using numpy as shown below. Here, the image is translated by (100, 50)

M = np.float32([[1,0,100],[0,1,50]])

1	M = np.float32([[1,0,100],[0,1,50]])

Next, let’s convert the image coordinates to the form [x,y,1]. This can be done as

# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2,-1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))

# get the coordinates in the form of (0,0),(0,1)...

# the shape is (2, rows*cols)

orig_coord = np.indices((cols, rows)).reshape(2,-1)

# stack the rows of 1 to form [x,y,1]

orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))

Now apply the transformation by multiplying the transformation matrix with coordinates.

transform_coord = np.dot(M, orig_coord_f)
# Change into int type
transform_coord = transform_coord.astype(np.int)

transform_coord = np.dot(M, orig_coord_f)

# Change into int type

transform_coord = transform_coord.astype(np.int)

Keep only the coordinates that fall within the image boundary.

indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)

1	indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)

Now, create a zeros image similar to the original image and project all the points onto the new image.

img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]

1 2	img1 = np.zeros_like(img) img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]

Display the final image.

out = cv2.hconcat([img,img1])
cv2.imshow('a1',out)
cv2.waitKey(0)

out = cv2.hconcat([img,img1])

cv2.imshow('a1',out)

cv2.waitKey(0)

The full code can be found below

import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows,cols,_ = img.shape

# Create the transformation matrix
M = np.float32([[1,0,100],[0,1,50]])
# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2,-1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))
transform_coord = np.dot(M, orig_coord_f)
# Change into int type
transform_coord = transform_coord.astype(np.int)
# Keep only the coordinates that fall within the image boundary.
indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)
# Create a zeros image and project the points
img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]
# Display the image
out = cv2.hconcat([img,img1])
cv2.imshow('a2',out)
cv2.waitKey(0)

import numpy as np

import cv2

# Read an image

img = cv2.imread('D:/downloads/opencv_logo.PNG')

rows,cols,_ = img.shape

# Create the transformation matrix

M = np.float32([[1,0,100],[0,1,50]])

# get the coordinates in the form of (0,0),(0,1)...

# the shape is (2, rows*cols)

orig_coord = np.indices((cols, rows)).reshape(2,-1)

# stack the rows of 1 to form [x,y,1]

orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))

transform_coord = np.dot(M, orig_coord_f)

# Change into int type

transform_coord = transform_coord.astype(np.int)

# Keep only the coordinates that fall within the image boundary.

indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)

# Create a zeros image and project the points

img1 = np.zeros_like(img)

img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]

# Display the image

out = cv2.hconcat([img,img1])

cv2.imshow('a2',out)

cv2.waitKey(0)

Below is the output. Here, left image represents the original image while the right one is the translated image.

OpenCV-Python

Now, let’s discuss how to translate images using OpenCV-Python.

OpenCV provides a function cv2.warpAffine() that applies an affine transformation to an image. You just need to provide the transformation matrix (M). The basic syntax for the function is given below.

dst = cv.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]	)

# src: input image
# M: Transformation matrix
# dsize: size of the output image
# flags: interpolation method to be used

dst = cv.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]] )

# src: input image

# M: Transformation matrix

# dsize: size of the output image

# flags: interpolation method to be used

Below is a sample code where the image is translated by (100, 50).

import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows,cols,_ = img.shape

# Create the transformation matrix
M = np.float32([[1,0,100],[0,1,50]])

# Pass it to warpAffine function
dst = cv2.warpAffine(img,M,(cols,rows))

# Display the concatenated image
out = cv2.hconcat([img, dst])
cv2.imshow('img',out)
cv2.waitKey(0)

import numpy as np

import cv2

# Read an image

img = cv2.imread('D:/downloads/opencv_logo.PNG')

rows,cols,_ = img.shape

# Create the transformation matrix

M = np.float32([[1,0,100],[0,1,50]])

# Pass it to warpAffine function

dst = cv2.warpAffine(img,M,(cols,rows))

# Display the concatenated image

out = cv2.hconcat([img, dst])

cv2.imshow('img',out)

cv2.waitKey(0)

Below is the output. Here, left image represents the original image while the right one is the translated image.

Compare the outputs of both implementations. That’s all for image translation. In the next blog, we will discuss another geometric transformation known as rotation in detail. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Moments

3 Replies

In this blog, we will discuss how to find different features of contours such as area, centroid, orientation, etc. With the help of these features/statistics, we can do some sort of recognition. So, in this blog, we will refer to a very old fundamental work in computer vision known as Image moments that helps us to calculate these statistics. So, let’s first discuss what are image moments and how to calculate them.

In simple terms, image moments are a set of statistical parameters to measure the distribution of where the pixels are and their intensities. Mathematically, the image moment M_ij of order (i,j) for a greyscale image with pixel intensities I(x,y) is calculated as

Here, x, y refers to the row and column index and I(x,y) refers to the intensity at that location (x,y). Now, let’s discuss how simple image properties are calculated from image moments.

Area:

For a binary image, the zeroth order moment corresponds to the area. Let’s discuss how?

Using the above formulae, the zeroth order moment (M₀₀) is given by

For a binary image, this corresponds to counting all the non-zero pixels and that is equivalent to the area. For greyscale image, this corresponds to the sum of pixel intensity values.

Centroid:

Centroid simply is the arithmetic mean position of all the points. In terms of image moments, centroid is given by the relation

This is simple to understand. For instance, for a binary image M₁₀ corresponds to the sum of all non-zero pixels (x-coordinate) and M₀₀ is the total number of non-zero pixels and that is what the centroid is.

Let’s take a simple example to understand how to calculate image moments for a given image.

Below are the area and centroid calculation for the above image

OpenCV-Python

OpenCV provides a function cv2.moments() that outputs a dictionary containing all the moment values up to 3^rd order.

output = cv2.moments(input[,binaryImage])

# input: image(single channel) or array of 2D points. Should be either np.int32 or np.float32.
# binaryImage: Only used if input is image. If True all the non-zero pixels are treated as 1's.

output = cv2.moments(input[,binaryImage])

# input: image(single channel) or array of 2D points. Should be either np.int32 or np.float32.

# binaryImage: Only used if input is image. If True all the non-zero pixels are treated as 1's.

Below is the sample code that shows how to use cv2.moments().

import cv2
# read the image
img = cv2.imread('star.jpg',0)
# Binarize the image
ret,thresh = cv2.threshold(img,127,255,0)
# Find the contours
contours,hierarchy = cv2.findContours(thresh, 1, 2)
cnt = contours[0]
# Calculate the moments
M = cv2.moments(cnt)

import cv2

# read the image

img = cv2.imread('star.jpg',0)

# Binarize the image

ret,thresh = cv2.threshold(img,127,255,0)

# Find the contours

contours,hierarchy = cv2.findContours(thresh, 1, 2)

cnt = contours[0]

# Calculate the moments

M = cv2.moments(cnt)

From this moments dictionary, we can easily extract the useful features such as area, centroid etc. as shown below.

# Calculate area
area = M['m00']
# Calculate centroid
cx = int(M['m10']/M['m00'])
cy = int(M['m01']/M['m00'])

# Calculate area

area = M['m00']

# Calculate centroid

cx = int(M['m10']/M['m00'])

cy = int(M['m01']/M['m00'])

That’s all about image moments. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Removing Text highlighter using Colorspace OpenCV-Python

Leave a reply

Have you ever thought why a number of color models are available in OpenCV? Obviously, they might have some pros and cons. So, in this blog, we will discuss one such application of color models where we will learn to remove the highlighted area from the text.

Use Case:

This pre-processing step (removing text highlighter) can be quite useful before feeding the image to an OCR system. Otherwise, the OCR system will output erroneous results.

Problem Overview

Suppose we are given an image as shown on the left and we want to pre-process it to remove the highlighter from the text as shown by the right image below.

Approach:

Since we know that there are some color models available (such as HSV) where it is easy to represent the color as compared to the RGB model. So, we will convert the image from RGB to that colorspace and then remove the color information. For instance, in the HSV color model, H and S tell us about the chromaticity (color information) of the light while V carries the greyscale information. So in HSV, if we remove the H and S channel and only keep the V channel we can obtain the desired results.

Steps:

Read the highlighted text image
Convert from BGR to HSV colorspace using cv2.cvtColor()
Extract the V channel

Code:

import cv2
# read the image
img = cv2.imread('D:/downloads/highlighted_text.JPG')
# Convert from BGR to HSV
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Extract the V channel
out_img = img_hsv[:,:,2]
# Display the image
cv2.imshow('output_image', out_img)
cv2.waitKey(0)

import cv2

# read the image

img = cv2.imread('D:/downloads/highlighted_text.JPG')

# Convert from BGR to HSV

img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Extract the V channel

out_img = img_hsv[:,:,2]

# Display the image

cv2.imshow('output_image', out_img)

cv2.waitKey(0)

So, you saw that just by changing the colorspace and extracting channels we obtained satisfactory results. We can further improve the results by applying other operations such as thresholding or morphological operations etc. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Finding Convex Hull OpenCV Python

1 Reply

In the previous blog, we discussed how to perform simple shape detection using contour approximation. In this blog, we will discuss how to find the convex hull of a given shape/curve. So, let’s first discuss what is a convex hull?

What is a Convex Hull?

Any region/shape is said to be convex if the line joining any two points (selected from the region) is contained entirely in that region. Another way of saying this is, for a shape to be convex, all of its interior angles must be less than 180 degrees or all the vertices should open towards the center. Let’s understand this with the help of the image below.

Now, for a given shape or set of points, we can have many convex curves/boundaries. The smallest or the tight-fitting convex boundary is known as a convex hull.

Now, the next question that comes to our mind is how to find the convex hull for a given shape or set of points? There are so many algorithms for finding the convex hull. Some of the most common algorithms with their associated time complexities are shown below. Here, n is the no. of input points and h is the number of points on the hull.

Gift wrapping, a.k.a. Jarvis march — O(nh)
Graham scan — O(nlogn)
Chan’s algorithm — O(nlogh)
Sklansky (1982) — O(nlogn) ( OpenCV uses this algorithm)

OpenCV provides a builtin function for finding the convex hull of a point set as shown below

hull = cv2.convexHull(points [,clockwise [,returnPoints]])

1	hull = cv2.convexHull(points [,clockwise [,returnPoints]])

points: any contour or Input 2D point set whose convex hull we want to find.
clockwise: If it is True, the output convex hull is oriented clockwise. Otherwise, counter-clockwise.
returnPoints: If True (default) then returns the coordinates of the hull points. Otherwise, returns the indices of contour points corresponding to the hull points. Thus to find the actual hull coordinates in the second(False) case, we need to do contour[indices].

Now, let’s take an example and understand how to find the convex hull for a given image using OpenCV-Python.

Steps:

Load the image
Convert it to greyscale
Threshold the image
Find the contours
For each contour, find the convex hull and draw it.

import cv2
# Load the image
img1 = cv2.imread('D:/downloads/cars.JPG')
# Convert it to greyscale
img = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
# Threshold the image
ret, thresh = cv2.threshold(img,50,255,0)
# Find the contours
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# For each contour, find the convex hull and draw it
# on the original image.
for i in range(len(contours)):
    hull = cv2.convexHull(contours[i])
    cv2.drawContours(img1, [hull], -1, (255, 0, 0), 2)
# Display the final convex hull image
cv2.imshow('ConvexHull', img1)
cv2.waitKey(0)

import cv2

# Load the image

img1 = cv2.imread('D:/downloads/cars.JPG')

# Convert it to greyscale

img = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)

# Threshold the image

ret, thresh = cv2.threshold(img,50,255,0)

# Find the contours

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# For each contour, find the convex hull and draw it

# on the original image.

for i in range(len(contours)):

hull = cv2.convexHull(contours[i])

cv2.drawContours(img1, [hull], -1, (255, 0, 0), 2)

# Display the final convex hull image

cv2.imshow('ConvexHull', img1)

cv2.waitKey(0)

Below is the output of the above code.

Applications:

Collision detection or avoidance.
Face Swap
Shape analysis and many more.

Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Simple Shape Detection using Contour approximation

Leave a reply

In the previous blog, we learned how to find and draw contours using OpenCV. In this blog, we will discuss how to detect simple geometric shapes by approximating the contours. So, let’s first discuss what is meant by contour approximation.

This means approximating a contour shape to another shape with less number of vertices so that the distance between both the shapes is less or equal to the specified precision. The below figure shows the curve approximation for different precisions (epsilon). See how the shape is approximated to a rectangle with epsilon =10% in the below image.

Contour approximation for different epsilon — **Source: OpenCV**

This is widely used in robotics for pattern classification and scene analysis. OpenCV provides a builtin function that approximates the polygonal curves with the specified precision. Its implementation is based on the Douglas-Peucker algorithm.

approxCurve = cv2.approxPolyDP(curve, epsilon, closed)

1	approxCurve = cv2.approxPolyDP(curve, epsilon, closed)

“curve“: contour/polygon we want to approximate.
“epsilon“: This is the maximum distance between the original curve and its approximation.
“closed“: If true, the approximated curve is closed otherwise, not.

This function returns the approximated contour with the same type as that of the input curve. Now, let’s detect simple shapes using this concept. Let’s take the below image to perform shape detection.

Steps

Load the image and convert to greyscale
Apply thresholding and find contours
For each contour
- First, approximate its shape using cv2.approxPolyDP()
- if len(shape) == 3; shape is Triangle
- else if len(shape) == 4; shape is Rectangle
- else if len(shape) == 5; shape is Pentagon
- else if 6< len(shape) <15; shape is Ellipse
- else; shape is circle

Code

import cv2
import numpy as np

# Load the image
img = cv2.imread("D:/downloads/cont_new.png")
# Convert to greyscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Convert to binary image by thresholding
_, threshold = cv2.threshold(img_gray, 245, 255, cv2.THRESH_BINARY_INV)
# Find the contours
contours, _ = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# For each contour approximate the curve and
# detect the shapes.
for cnt in contours:
    epsilon = 0.01*cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, epsilon, True)
    cv2.drawContours(img, [approx], 0, (0), 3)
    # Position for writing text
    x,y = approx[0][0]

    if len(approx) == 3:
        cv2.putText(img, "Triangle", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)
    elif len(approx) == 4:
        cv2.putText(img, "Rectangle", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)
    elif len(approx) == 5:
        cv2.putText(img, "Pentagon", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)
    elif 6 < len(approx) < 15:
        cv2.putText(img, "Ellipse", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)
    else:
        cv2.putText(img, "Circle", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)
cv2.imshow("final", img)
cv2.waitKey(0)

import cv2

import numpy as np

# Load the image

img = cv2.imread("D:/downloads/cont_new.png")

# Convert to greyscale

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Convert to binary image by thresholding

_, threshold = cv2.threshold(img_gray, 245, 255, cv2.THRESH_BINARY_INV)

# Find the contours

contours, _ = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# For each contour approximate the curve and

# detect the shapes.

for cnt in contours:

epsilon = 0.01*cv2.arcLength(cnt, True)

approx = cv2.approxPolyDP(cnt, epsilon, True)

cv2.drawContours(img, [approx], 0, (0), 3)

# Position for writing text

x,y = approx[0][0]

if len(approx) == 3:

cv2.putText(img, "Triangle", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)

elif len(approx) == 4:

cv2.putText(img, "Rectangle", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)

elif len(approx) == 5:

cv2.putText(img, "Pentagon", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)

elif 6 < len(approx) < 15:

cv2.putText(img, "Ellipse", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)

else:

cv2.putText(img, "Circle", (x, y), cv2.FONT_HERSHEY_COMPLEX, 1, 0,2)

cv2.imshow("final", img)

cv2.waitKey(0)

Below is the final result.

Contour approximation for shape detection

Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Find and Draw Contours using OpenCV-Python

Leave a reply

In the previous blog, we discussed various contour tracing algorithms like Moore, radial sweep, etc. and then we also covered Suzuki’s algorithm, the one that OpenCV uses for border following or contour tracing. In this blog, we will discuss the builtin functions provided by OpenCV for finding and drawing contours. So, let’s get started.

Finding the contours

OpenCV provides the following builtin function for finding the contour

contours, hierarchy = cv2.findContours(image, mode, method)

1	contours, hierarchy = cv2.findContours(image, mode, method)

Here, the first argument “image” should be an 8-bit single-channel image. For better accuracy use a binary image. If you didn’t provide a binary image, then this method will convert it into a binary image by treating all the nonzero pixels as ‘1’ and zero remains ‘0’.

The second argument “mode” specifies how you want to retrieve the contours. This means whether you want to extract the outer contours only or retrieve contours without establishing any hierarchical relationships. Below are the several options available

cv2.RETR_EXTERNAL – retrieves only the extreme outer contours.
cv2.RETR_LIST – retrieves contours without establishing any hierarchical relationships.
cv2.RETR_TREE – constructs a full hierarchy of nested contours.
cv2.RETR_CCOMP – arranges all the contours into a 2-level hierarchy – outer contours and hole contours.

The third argument “method” denotes the contour approximation method. We don’t need to store all the points of a contour as the same thing can also be represented in a compact manner. For instance, a straight line can be represented by the endpoints. There is no need to store all the points as that would be redundant. OpenCV provides various options for this.

cv2.CHAIN_APPROX_NONE – stores all the boundary points.
cv2.CHAIN_APPROX_SIMPLE – removes all the redundant points and thus saves memory.
cv2.CHAIN_APPROX_TC89_L1 – applies one of the variants of the Teh-Chin chain approximation algorithm

The first output “contours” is a Python list of all the contours in the image. Each individual contour is a Numpy array of (x,y) coordinates of boundary points of the object.

The second output “hierarchy” represents the relationship among the contours like, is it a child of some other contour, or is it a parent, etc. OpenCV represents it as an array of four values : [Next, Previous, First_Child, Parent]

“Next denotes the next contour at the same hierarchical level.”
“Previous denotes the previous contour at the same hierarchical level.”
“First_Child denotes its first child contour.”
“Parent denotes index of its parent contour.”

Depending upon the contour retrieval mode argument hierarchy array can take different values. You can read more about it here.

Drawing the Contours

OpenCV provides the following builtin function for drawing the contour.

image = cv.drawContours(image, contours, Id, color[, thickness[, lineType[, hierarchy[, maxLevel[, offset]]]]])

1	image = cv.drawContours(image, contours, Id, color[, thickness[, lineType[, hierarchy[, maxLevel[, offset]]]]])

The first argument is the destination image on which to draw the contours, the second argument is the contours which should be passed as a Python list, the third argument is the index of contours that we want to draw(To draw all contours, pass -1). If the thickness ≥ 0, it draws contour outlines in the image otherwise, fills the area bounded by the contours. The optional argument hierarchy and the max-level specify up to which hierarchy level to draw the contours.

Now, let’s take an example to understand the above two functions.

import cv2
# Load the image
img = cv2.imread('D:/downloads/opencv_logo1.PNG')
# Convert to grayscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Threshold the image to produce a binary image
ret, thresh = cv2.threshold(img_gray,20,255,0)
# Find the contours
img2, contours, heirarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Draw the contours 
cv2.drawContours(img, contours, -1, (127,127,0), 2)
# Display the image
cv2.imshow('a1',img)
cv2.waitKey(0)

import cv2

# Load the image

img = cv2.imread('D:/downloads/opencv_logo1.PNG')

# Convert to grayscale

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold the image to produce a binary image

ret, thresh = cv2.threshold(img_gray,20,255,0)

# Find the contours

img2, contours, heirarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours

cv2.drawContours(img, contours, -1, (127,127,0), 2)

# Display the image

cv2.imshow('a1',img)

cv2.waitKey(0)

Below is the output of the above code.

Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Suzuki’s Contour tracing algorithm OpenCV-Python

2 Replies

In the previous blog, we discussed various contour tracing algorithms like radial sweep, Moore’s, etc. In this blog, we will discuss another famous contour tracing algorithm known as Suzuki’s algorithm. Many of the image processing libraries such as OpenCV uses this border following algorithm for the topological structural analysis of the image. This was one of the first algorithms that define the hierarchical relationships among the borders. This algorithm also differentiates between the outer boundary or the hole boundary. Before discussing this algorithm, let’s understand what is the outer and hole border. The below figure explains this pretty well. (Here we will be dealing with binary images (0 or 1)).

Now, let’s understand the algorithm using the following image.

Let’s say f_ij denotes the value of the pixel at location (i,j). The uppermost row, the lowermost row, the leftmost column, and the rightmost column of a picture compose its frame. In this, we assign a unique number to every new border found and we denote it by NBD. We assume the NBD of the frame as 1. Rest borders are numbered sequentially. We save the information of the parent of any border in LNBD or last NBD.

Steps:

Start scanning the image from left to right until you find the object pixel. Decide whether it is an outer border or hole border. The criteria for checking the outer or hole border is shown in the image below. Thus while scanning if we found the situation as shown in the image below, we can easily tell whether it is the starting point of the outer or the hole border.

Perform the following steps only for pixels >0. Every time we begin to scan a new row, reset LNBD to 1.

Step-1

If it’s an outer border (i.e. f_ij = 1 and f_i,j-1 = 0) then increment the NBD and set (i2, j2) as (i, j-1).
Else if it is a hole border, increment NBD. Set (i2, j2) as (i, j+1) and LNBD = f_ij in case f_ij > 1.
Otherwise, go to step 3.

Step-2

Now, from this starting point, we will trace the border. This can be done as

Starting from (i2, j2) look around clockwise the pixels in the neighborhood of (i, j) and find a nonzero pixel and denote it as (i1, j1). If no nonzero pixels are found, set f_ij = -NBD and go to step 4.
Set (i2, j2) = (i1, j1) and (i3, j3) = (i,j).
Starting from the next element of the pixel (i2, j2) in the counterclockwise order, again traverse the neighborhood of the (i3, j3) in the counterclockwise direction to find the first nonzero pixel and set it to (i4, j4).
Change the value of the current pixel (i3, j3) as
1. if the pixel at (i3, j3 +1) is a 0-pixel belonging to the region outside the boundary, set the current pixel value to -NBD.
2. if the pixel at (i3, j3 +1) is not a 0-pixel and the current pixel value is 1, set the current pixel value to NBD.
3. Otherwise, do not change the current pixel value.
if in step 2.3, we return to the starting point again i.e (i4, j4) = (i, j) and (i3, j3) = (i1, j1) go to step 3. Otherwise, set (i2, j2) = (i3, j3) and (i3, j3) = (i4, j4) and go back to step 2.3.

Step-3

If f_ij != 1 then set LNBD = |f_ij| and start scanning from the next pixel (i, j+1). Stopping criteria is when we reached the bottom right corner of the image.

The below images shows step by step the result of one iteration of Suzuki’s algorithm on the above image.

Similarly repeating the above steps, we will get the following output. The hierarchy relationship among borders is also shown below.

They also proposed another algorithm that only extracts the outermost border. OpenCV supports both hierarchical and plane variants of the Suzuki algorithm. You can find the full code here.

References Paper: Topological structural analysis of digitized binary images by border following

So, that’s it for Suzuki’s algorithm. Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Contour Tracing

Leave a reply

In the previous blogs, we discussed various image segmentation methods which result in partitioning the image into sub-regions. Now, the next task is to represent and describe these regions in a form suitable further image processing tasks such as pattern classification or recognition, etc. One can represent these regions either in terms of the boundary (external feature) or in terms of the pixels comprising the regions (internal feature). So, in this blog, we will discuss one such representation known as Contours.

Contours in simple terms is a curve joining all the continuous points (along the boundary), having some similar property such as intensity. Once the contours are extracted, we can use them for shape analysis, and various object detection and recognition tasks, etc. So, let’s discuss different contour tracing (i.e. detecting the boundary of a region) algorithms. Some of the most common algorithms are

Square Tracing algorithm
Moore Boundary Tracing algorithm
Radial Sweep
Theo Pavlidis’ algorithm
Dr. Kovalevsky algorithm
Suzuki’s Algorithm (OpenCV)
Fast contour tracing algorithm

Square Tracing algorithm

This was one of the first approaches to extract contours and is quite simple. Suppose background is black (0’s) and object is white (1’s). Start iterating over the binary or segmented image row by row starting from left to right. If you detect white pixel (i.e. 1) go left otherwise go right. Here, left and right direction is subjective to how you entered that pixel. Stopping condition is if you entered the starting pixel a second time in the same manner you entered it initially. This works best with 4-connectivity as it only checks left and right and misses diagonal directions.

Moore Boundary Tracing algorithm

Start iterating row by row from left to right. Then traverse the 8-connected components of the object pixel found in the clockwise direction from the background pixel just before the object pixel. Stopping criteria is same as above. This removes the above method limitations.

Radial Sweep

This is similar to the Moore algorithm. After performing the first step of Moore algorithm, draw a line segment connecting the two object pixels found. Rotate this line segment in the clockwise direction until an object pixel is found in the 8-connectivity. Again draw the line segment and rotate. Stopping criteria is when you encounter the starting pixel, a second time, with the same next pixel. For a demonstration, please refer to this.

These are some of the few algorithms for contour tracing. In the next blog, we will discuss the Suzuki’s Algorithm one that OpenCV uses for finding and drawing contours. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

References: Wikipedia, Imageprocessingplace

Integral images

Leave a reply

In this blog, we will discuss the concept of integral images (or summed-area table, in general) that lets us efficiently compute the statistics like mean, standard deviation, etc in any rectangular window. This was introduced in 1984 by Frank Crow but this became popular due to its use in template matching and object detection (Source). So, let’s first discuss what is an integral image then discuss why it is efficient and how to compute the statistics from the integral image.

Integral image is obtained by summing all the pixels before each pixel (Naively you can think of this as similar to the cumulative distribution function where a particular value is obtained by summing all the values before). Let’s take an example to understand this.

Suppose we have a 5×5 binary image as shown below. The integral image is shown on the right.

All the pixels in the integral image are obtained by summing all the previous pixels. Previous here means all the pixels above and to the left of that pixel (inclusive of that pixel). For instance, the 3 (blue circle) is obtained by adding that pixel with the above and left pixels in the input image i.e. 1+0+0+1+0+0+0+1 = 3.

Finding the sum of pixels

Once the integral image is obtained, the sum of pixels in any rectangular region can be obtained in constant time (O(1) time complexity) by the following expression:

Sum = Bottom right + top left – top right – bottom left

For instance, the sum of all the pixels in the rectangular window can be obtained easily from the integral image using the above expression as shown below.

Here, top right (denoted by B) is 2, not 3. Be careful as we are finding the integral sum up to that point. For the ease of visualization, we can take a 4×4 window in the integral image and then perform the sum. For boundary pixels, pad with 0’s.

Now the mean can be calculated easily by dividing the sum by total pixels in that window. The standard deviation for any window can be obtained by the following formulae. This is obtained by simply expanding the variance formulae (See Wikipedia).

Here, S1 is the sum of the rectangular region in the input image and S2 is the sum of the square of that region in the input image and n is the no. of pixels in that region. Both S1 and S2 can be found out easily using the integral image. Now, let’s discuss how to implement this using OpenCV-Python. Let’s first discuss the builtin functions provided by OpenCV to calculate the integral image.

cv2.integral(src[, sdepth])

1	cv2.integral(src[, sdepth])

Here, src is the input image and sdepth is the optional argument denoting the depth of the integral image (must be of type CV_32S, CV_32F, or CV_64F). This returns an integral image which is of size (W+1)x(H+1) i.e. one more than the input image. Here, the first row and column of the integral image are all 0’s to deal with the boundary pixels as explained above. Rest all the pixels are obtained by summing all the previous pixels.

OpenCV also provides a function that returns the integral image of both the input image and its square. This can be done by the following function.

cv2.integral2(src[, sdepth[, sqdepth]])

1	cv2.integral2(src[, sdepth[, sqdepth]])

Here, sqdepth is the depth of the integral of the squared image (must be of type CV_32F, or CV_64F). This returns 2 arrays representing the integral of the input image and its square.

Calculate Standard deviation

Let’s verify that the standard deviation calculated by the above formulae yields correct results. For this, we will calculate the standard deviation using the builtin cv2.meanStdDev() function and then compare the results. Below is the code for this.

import cv2
import numpy as np
img = np.array([[0, 0, 0, 0, 0],
               [0, 0, 0, 0, 0],
               [0, 1, 1, 1, 0],
               [0, 1, 1, 1, 0],
               [0, 1, 1, 1, 0],
               [0, 0, 0, 0, 0]], dtype='uint8')

# Calculate the standard deviation
# Here I'm taking the full image, you can take any rectangular region
# Method-1: using cv2.meanStdDev()
mean, std_1 = cv2.meanStdDev(img, mask=None)

# Method-2: using the formulae 1/n(S2 - (S1**2)/n)
sum_1, sqsum_2 = cv2.integral2(img)
n = img.size
# sum of the region can be easily found out using the integral image as
#  Sum = Bottom right + top left - top right - bottom left
s1 = sum_1[-1,-1]
s2 = sqsum_2[-1,-1]
std_2 = np.sqrt((s2 - (s1**2)/n)/n)
print(std_1, std_2)  # [[0.45825757]] 0.4582575694

import cv2

import numpy as np

img = np.array([[0, 0, 0, 0, 0],

[0, 0, 0, 0, 0],

[0, 1, 1, 1, 0],

[0, 0, 0, 0, 0]], dtype='uint8')

# Calculate the standard deviation

# Here I'm taking the full image, you can take any rectangular region

# Method-1: using cv2.meanStdDev()

mean, std_1 = cv2.meanStdDev(img, mask=None)

# Method-2: using the formulae 1/n(S2 - (S1**2)/n)

sum_1, sqsum_2 = cv2.integral2(img)

n = img.size

# sum of the region can be easily found out using the integral image as

# Sum = Bottom right + top left - top right - bottom left

s1 = sum_1[-1,-1]

s2 = sqsum_2[-1,-1]

std_2 = np.sqrt((s2 - (s1**2)/n)/n)

print(std_1, std_2) # [[0.45825757]] 0.4582575694

Thus, calculating the integral image is a simple operation that lets us calculate the image statistics super-fast. Later we will learn how this can be very useful in template matching, face detection, etc. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Pyramids

Leave a reply

Image pyramid refers to the way of representing an image at multiple resolutions. The idea behind this is that features that may go undetected at one resolution can be easily detected at some other resolution. For instance, if the region of interest is large in size, a low-resolution image or coarse view is sufficient. While for small objects, it’s beneficial to examine them at high resolution. Now, if both large and small objects are present in an image, analyzing the image at several resolutions can prove beneficial. This is the main concept behind image pyramids. The name “pyramid” because if you place the high-resolution image at the bottom and stack subsequent low-resolution images on top, the appearance resembles that of a pyramid.

Thus constructing an image pyramid is equivalent to performing repeated smoothing and subsampling (reducing the size to half) an image. This is illustrated in the image below

Why blurring? Because this reduces the aliasing or ringing effects that may arise if we downsample directly. Now depending upon the type of blurring applied the pyramid is named. For instance, if we apply a mean filter, the pyramid is known as the mean pyramid, Gaussian filter – Gaussian pyramid and if we don’t apply any filtering, this is known as subsampling pyramid, etc. For subsampling, we can use any interpolation algorithm such as the nearest neighbor, bilinear, bicubic, etc. In this blog, we will discuss only two kinds of image pyramids

Gaussian Pyramid
Laplacian Pyramid

Gaussian pyramid involves applying repeated Gaussian blurring and downsampling an image until some stopping criteria are met. For instance, one of the stopping criteria can be the minimum image size. OpenCV provides a builtin function to perform blurring and downsampling as shown below

cv2.pyrDown(src[, dstsize[, borderType]])

1	cv2.pyrDown(src[, dstsize[, borderType]])

Here, src is the source image and rest are optional arguments which includes the output size (dstsize) and the border type. By default, the size of the output image is computed as Size((src.cols+1)/2, (src.rows+1)/2) i.e. the size is reduced to one-fourth at each step.

This function first convolves the input image with a 5×5 Gaussian kernel and then downsamples the image by rejecting even rows and columns. Below is an example of how to implement the above function.

import cv2
img = cv2.imread('D:/downloads/child.jpg')
img_level_1 = cv2.pyrDown(img)
img_level_2 = cv2.pyrDown(img_level_1)

import cv2

img = cv2.imread('D:/downloads/child.jpg')

img_level_1 = cv2.pyrDown(img)

img_level_2 = cv2.pyrDown(img_level_1)

Now, let’s discuss the Laplace pyramid. Since Laplacian is a high pass filter, so at each level of this pyramid, we will get an edge image as an output. As we have already discussed in the edge detection blog that the Laplacian can be approximated using the difference of Gaussian. So, here we will take advantage of this fact and obtain the Laplacian pyramid by subtracting the Gaussian pyramid levels. Thus the Laplacian of a level is obtained by subtracting that level in Gaussian Pyramid and expanded version of its upper level in Gaussian Pyramid. This is illustrated in the figure below.

OpenCV also provides a function to go down the image pyramid or expand a particular level as shown in the figure above.

cv2.pyrUp(src[, dstsize[, borderType]])

1	cv2.pyrUp(src[, dstsize[, borderType]])

This upsamples the input image by injecting even zero rows and columns and then convolves the result with the 5×5 Gaussian kernel multiplied by 4. By default, output image size is computed as Size(src.cols*2, (src.rows*2). Let’s take an example to illustrate the Laplacian pyramid.

Steps:

First load the image
Then construct the Gaussian pyramid with 3 levels.
For the Laplacian pyramid, the topmost level remains the same as in Gaussian. The remaining levels are constructed from top to bottom by subtracting that Gaussian level from its upper expanded level.

import cv2
# Load the image
img = cv2.imread('D:/downloads/child.jpg')
lower = img.copy()

# Create a Gaussian Pyramid
gaussian_pyr = [lower]
for i in range(3):
    lower = cv2.pyrDown(lower)
    gaussian_pyr.append(lower)

# Last level of Gaussian remains same in Laplacian
laplacian_top = gaussian_pyr[-1]

# Create a Laplacian Pyramid
laplacian_pyr = [laplacian_top]
for i in range(3,0,-1):
    size = (gaussian_pyr[i - 1].shape[1], gaussian_pyr[i - 1].shape[0])
    gaussian_expanded = cv2.pyrUp(gaussian_pyr[i], dstsize=size)
    laplacian = cv2.subtract(gaussian_pyr[i-1], gaussian_expanded)
    laplacian_pyr.append(laplacian)
    cv2.imshow('lap-{}'.format(i-1),laplacian)
    cv2.waitKey(0)

import cv2

# Load the image

img = cv2.imread('D:/downloads/child.jpg')

lower = img.copy()

# Create a Gaussian Pyramid

gaussian_pyr = [lower]

for i in range(3):

lower = cv2.pyrDown(lower)

gaussian_pyr.append(lower)

# Last level of Gaussian remains same in Laplacian

laplacian_top = gaussian_pyr[-1]

# Create a Laplacian Pyramid

laplacian_pyr = [laplacian_top]

for i in range(3,0,-1):

size = (gaussian_pyr[i - 1].shape[1], gaussian_pyr[i - 1].shape[0])

gaussian_expanded = cv2.pyrUp(gaussian_pyr[i], dstsize=size)

laplacian = cv2.subtract(gaussian_pyr[i-1], gaussian_expanded)

laplacian_pyr.append(laplacian)

cv2.imshow('lap-{}'.format(i-1),laplacian)

cv2.waitKey(0)

The Laplacian pyramid is mainly used for image compression. Image pyramids can also be used for image blending and for image enhancement which we will discuss in the next blog. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.