Category Archives: Image Processing

Understanding Geometric Transformation: Translation using OpenCV-Python

In this blog, we will discuss image translation, one of the most basic geometric transformations performed on images. So, let’s get started.

Translation is simply the shifting of an object’s location. Suppose we have a point P(x,y) which is translated by (tx, ty); then the coordinates after translation, denoted by P'(x',y'), are given by

x' = x + tx
y' = y + ty

In matrix form, with the transformation matrix M = [[1, 0, tx], [0, 1, ty]], this can be written as [x', y']ᵀ = M · [x, y, 1]ᵀ.

So, we just need to create the transformation matrix (M) and then we can translate any point as shown above. That’s the basic idea behind translation. So, let’s first discuss how to do image translation using numpy for better understanding, and then we will see a more sophisticated implementation using OpenCV.

Numpy

First, let’s create the transformation matrix (M). This can be easily done using numpy as shown below. Here, the image is translated by (100, 50).
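A minimal sketch (a 3×3 matrix is used so that we can multiply the homogeneous coordinates directly):

```python
import numpy as np

# Transformation matrix for a translation of (tx, ty) = (100, 50)
M = np.float32([[1, 0, 100],
                [0, 1, 50],
                [0, 0, 1]])
```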

Next, let’s convert the image coordinates to the form [x,y,1]. This can be done as
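Continuing from the snippet above; 'test.jpg' is a hypothetical path:

```python
import cv2
import numpy as np

img = cv2.imread('test.jpg')   # hypothetical path, any image will do
h, w = img.shape[:2]

# Homogeneous coordinates [x, y, 1] for every pixel
ys, xs = np.indices((h, w)).reshape(2, -1)
coords = np.vstack((xs, ys, np.ones_like(xs)))   # shape (3, h*w)
```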

Now apply the transformation by multiplying the transformation matrix with coordinates.
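Each translated coordinate is just M · [x, y, 1]ᵀ, so we can transform all the points at once:

```python
# Apply the transformation to every coordinate in one matrix product
new_xs, new_ys, _ = (M @ coords).astype(int)
```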

Keep only the coordinates that fall within the image boundary.
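A boolean mask does the job:

```python
# Keep only the translated points that still lie inside the image
mask = (new_xs >= 0) & (new_xs < w) & (new_ys >= 0) & (new_ys < h)
```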

Now, create a zeros image similar to the original image and project all the points onto the new image.
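Something like:

```python
# Create a blank image and copy the surviving pixels to their new locations
out = np.zeros_like(img)
out[new_ys[mask], new_xs[mask]] = img[ys[mask], xs[mask]]
```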

Display the final image.
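Using the usual OpenCV display pattern:

```python
cv2.imshow('Translated', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
```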

The full code can be found below
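(Assembled from the snippets above; 'test.jpg' is a placeholder.)

```python
import cv2
import numpy as np

img = cv2.imread('test.jpg')               # hypothetical path
h, w = img.shape[:2]

# Transformation matrix for a translation of (100, 50)
M = np.float32([[1, 0, 100],
                [0, 1, 50],
                [0, 0, 1]])

# Homogeneous coordinates [x, y, 1] of every pixel
ys, xs = np.indices((h, w)).reshape(2, -1)
coords = np.vstack((xs, ys, np.ones_like(xs)))

# Apply the transformation and keep only the in-bounds points
new_xs, new_ys, _ = (M @ coords).astype(int)
mask = (new_xs >= 0) & (new_xs < w) & (new_ys >= 0) & (new_ys < h)

# Project the points onto a blank image and display it
out = np.zeros_like(img)
out[new_ys[mask], new_xs[mask]] = img[ys[mask], xs[mask]]
cv2.imshow('Translated', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
```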

Below is the output. Here, the left image is the original while the right one is the translated image.

OpenCV-Python

Now, let’s discuss how to translate images using OpenCV-Python.

OpenCV provides a function cv2.warpAffine() that applies an affine transformation to an image. You just need to provide the transformation matrix (M). The basic syntax for the function is given below.
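In its basic form (optional arguments such as flags, borderMode and borderValue are omitted):

```python
dst = cv2.warpAffine(src, M, dsize)   # M: 2x3 float matrix, dsize: (width, height)
```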

Below is a sample code where the image is translated by (100, 50).
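(A sketch; 'test.jpg' is a hypothetical path.)

```python
import cv2
import numpy as np

img = cv2.imread('test.jpg')   # hypothetical path
h, w = img.shape[:2]

# 2x3 affine matrix for a translation of (100, 50)
M = np.float32([[1, 0, 100],
                [0, 1, 50]])

translated = cv2.warpAffine(img, M, (w, h))
cv2.imshow('Translated', translated)
cv2.waitKey(0)
cv2.destroyAllWindows()
```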

Below is the output. Here, the left image is the original while the right one is the translated image.

Compare the outputs of both implementations. That’s all for image translation. In the next blog, we will discuss another geometric transformation known as rotation in detail. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Moments

In this blog, we will discuss how to find different features of contours such as area, centroid, orientation, etc. With the help of these features/statistics, we can do some sort of recognition. For this, we will turn to a classical concept in computer vision known as image moments that helps us calculate these statistics. So, let’s first discuss what image moments are and how to calculate them.

In simple terms, image moments are a set of statistical parameters that measure the distribution of where the pixels are and their intensities. Mathematically, the image moment Mij of order (i,j) for a greyscale image with pixel intensities I(x,y) is calculated as

Mij = Σx Σy x^i · y^j · I(x,y)

Here, x and y refer to the row and column indices and I(x,y) refers to the intensity at location (x,y). Now, let’s discuss how simple image properties are calculated from image moments.

Area:

For a binary image, the zeroth order moment corresponds to the area. Let’s discuss how.

Using the above formula, the zeroth order moment (M00) is given by

M00 = Σx Σy I(x,y)

For a binary image, this corresponds to counting all the non-zero pixels, which is equivalent to the area. For a greyscale image, this corresponds to the sum of pixel intensity values.

Centroid:

The centroid is simply the arithmetic mean position of all the points. In terms of image moments, the centroid is given by the relations

cx = M10 / M00,  cy = M01 / M00

This is simple to understand. For instance, for a binary image, M10 is the sum of the x-coordinates of all non-zero pixels and M00 is the total number of non-zero pixels, so their ratio is the mean x-position, which is exactly the centroid’s x-coordinate (and similarly for M01 and the y-coordinate).

Let’s take a simple example to understand how to calculate image moments for a given image.

Below are the area and centroid calculations for the above image.

OpenCV-Python

OpenCV provides a function cv2.moments() that outputs a dictionary containing all the moment values up to 3rd order.
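The call itself is a one-liner (the optional binaryImage flag is omitted):

```python
M = cv2.moments(cnt)   # cnt: a contour or a single-channel image; returns a Python dict
```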

Below is the sample code that shows how to use cv2.moments().
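A minimal sketch, assuming OpenCV 4.x (in 3.x, cv2.findContours returns three values); 'shape.jpg' is a hypothetical path:

```python
import cv2

img = cv2.imread('shape.jpg', cv2.IMREAD_GRAYSCALE)
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# Find the contours and compute the moments of the first one
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
M = cv2.moments(contours[0])
print(M)
```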

From this moments dictionary, we can easily extract the useful features such as area, centroid etc. as shown below.
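Using the dictionary keys 'm00', 'm10' and 'm01':

```python
area = M['m00']
cx = int(M['m10'] / M['m00'])   # centroid x
cy = int(M['m01'] / M['m00'])   # centroid y
```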

That’s all about image moments. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Removing Text highlighter using Colorspace OpenCV-Python

Have you ever wondered why a number of color models are available in OpenCV? Obviously, each has its pros and cons. So, in this blog, we will discuss one such application of color models where we will learn to remove the highlighted area from text.

Use Case:

This pre-processing step (removing text highlighter) can be quite useful before feeding the image to an OCR system. Otherwise, the OCR system will output erroneous results.

Problem Overview

Suppose we are given an image as shown on the left and we want to pre-process it to remove the highlighter from the text as shown by the right image below.

Approach:

We know that there are color models available (such as HSV) in which it is easier to separate the color information from the intensity as compared to the RGB model. So, we will convert the image from RGB to such a colorspace and then discard the color information. For instance, in the HSV color model, H and S tell us about the chromaticity (color information) of the light while V carries the greyscale information. So in HSV, if we drop the H and S channels and keep only the V channel, we can obtain the desired result.

Steps:

  • Read the highlighted text image
  • Convert from BGR to HSV colorspace using cv2.cvtColor()
  • Extract the V channel

Code:
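A minimal sketch; 'highlighted_text.jpg' is a hypothetical path:

```python
import cv2

img = cv2.imread('highlighted_text.jpg')   # hypothetical path

# Convert from BGR to HSV and keep only the V (greyscale) channel
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h, s, v = cv2.split(hsv)

cv2.imshow('Original', img)
cv2.imshow('Highlighter removed', v)
cv2.waitKey(0)
cv2.destroyAllWindows()
```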

So, you saw that just by changing the colorspace and extracting a channel, we obtained satisfactory results. We can further improve the results by applying other operations such as thresholding or morphological operations. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Finding Convex Hull OpenCV Python

In the previous blog, we discussed how to perform simple shape detection using contour approximation. In this blog, we will discuss how to find the convex hull of a given shape/curve. So, let’s first discuss what a convex hull is.

What is a Convex Hull?

Any region/shape is said to be convex if the line joining any two points (selected from the region) is contained entirely in that region. Another way of saying this is that, for a shape to be convex, all of its interior angles must be less than 180 degrees, or all the vertices should open towards the center. Let’s understand this with the help of the image below.

Convex vs concave

Now, for a given shape or set of points, we can have many enclosing convex curves/boundaries. The smallest, or tightest-fitting, convex boundary is known as the convex hull.

Convex Hull

Now, the next question that comes to mind is how to find the convex hull for a given shape or set of points. There are many algorithms for finding the convex hull. Some of the most common ones, with their associated time complexities, are listed below. Here, n is the number of input points and h is the number of points on the hull.

  • Gift wrapping (Jarvis march) – O(nh)
  • Graham scan – O(n log n)
  • Quickhull – O(n log n) on average, O(n²) in the worst case
  • Chan’s algorithm – O(n log h)

OpenCV provides a builtin function for finding the convex hull of a point set as shown below
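The call, with its default arguments spelled out:

```python
hull = cv2.convexHull(points, clockwise=False, returnPoints=True)
```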

  • points: any contour or input 2D point set whose convex hull we want to find.
  • clockwise: If it is True, the output convex hull is oriented clockwise. Otherwise, it is oriented counter-clockwise.
  • returnPoints: If True (default), returns the coordinates of the hull points. Otherwise, returns the indices of the contour points corresponding to the hull points. Thus, to find the actual hull coordinates in the second (False) case, we need to do contour[indices].

Now, let’s take an example and understand how to find the convex hull for a given image using OpenCV-Python.

sample image for finding Convex Hull

Steps:

  • Load the image
  • Convert it to greyscale
  • Threshold the image
  • Find the contours
  • For each contour, find the convex hull and draw it, as shown in the code below.
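A minimal sketch of these steps, assuming OpenCV 4.x and a hypothetical image 'shapes.jpg':

```python
import cv2

img = cv2.imread('shapes.jpg')                 # hypothetical path
grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(grey, 127, 255, cv2.THRESH_BINARY)

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    hull = cv2.convexHull(cnt)
    cv2.drawContours(img, [hull], -1, (0, 255, 0), 2)

cv2.imshow('Convex Hull', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```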

Below is the output of the above code.

Convex Hull output

Applications:

  • Collision detection or avoidance.
  • Face Swap
  • Shape analysis and many more.

Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Simple Shape Detection using Contour approximation

In the previous blog, we learned how to find and draw contours using OpenCV. In this blog, we will discuss how to detect simple geometric shapes by approximating the contours. So, let’s first discuss what is meant by contour approximation.

Contour approximation means approximating a contour shape by another shape with fewer vertices such that the distance between the two shapes is less than or equal to the specified precision. The figure below shows the curve approximation for different precisions (epsilon). See how the shape is approximated to a rectangle with epsilon = 10% in the image below.

Contour approximation for different epsilon
Source: OpenCV

This is widely used in robotics for pattern classification and scene analysis. OpenCV provides a builtin function that approximates a polygonal curve with the specified precision. Its implementation is based on the Douglas-Peucker algorithm.
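The call looks like this:

```python
approx = cv2.approxPolyDP(curve, epsilon, closed)
```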

  • curve: the contour/polygon we want to approximate.
  • epsilon: the maximum distance between the original curve and its approximation.
  • closed: If true, the approximated curve is closed; otherwise, it is not.

This function returns the approximated contour with the same type as that of the input curve. Now, let’s detect simple shapes using this concept. Let’s take the below image to perform shape detection.

Steps

  • Load the image and convert to greyscale
  • Apply thresholding and find contours
  • For each contour
    • First, approximate its shape using cv2.approxPolyDP()
    • if len(shape) == 3: shape is Triangle
    • else if len(shape) == 4: shape is Rectangle
    • else if len(shape) == 5: shape is Pentagon
    • else if 6 < len(shape) < 15: shape is Ellipse
    • else: shape is Circle

Code
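A minimal sketch of the above steps, assuming OpenCV 4.x; 'shapes.jpg', the threshold value and the label positions are illustrative choices (the inverse threshold assumes dark shapes on a light background):

```python
import cv2

img = cv2.imread('shapes.jpg')                 # hypothetical path
grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(grey, 240, 255, cv2.THRESH_BINARY_INV)

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    # Approximate the contour; epsilon is taken as 1% of the perimeter
    approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    if len(approx) == 3:
        name = 'Triangle'
    elif len(approx) == 4:
        name = 'Rectangle'
    elif len(approx) == 5:
        name = 'Pentagon'
    elif 6 < len(approx) < 15:
        name = 'Ellipse'
    else:
        name = 'Circle'
    x, y = approx[0, 0]
    cv2.putText(img, name, (int(x), int(y) - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1)

cv2.imshow('Shapes', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```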

Below is the final result.

Contour approximation for shape detection

Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Find and Draw Contours using OpenCV-Python

In the previous blog, we discussed various contour tracing algorithms like Moore, radial sweep, etc. and then we also covered Suzuki’s algorithm, the one that OpenCV uses for border following or contour tracing. In this blog, we will discuss the builtin functions provided by OpenCV for finding and drawing contours. So, let’s get started.

Finding the contours

OpenCV provides the following builtin function for finding the contour
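In OpenCV 4.x, the call is (OpenCV 3.x returns the modified image as an extra first output):

```python
contours, hierarchy = cv2.findContours(image, mode, method)
```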

Here, the first argument “image” should be an 8-bit single-channel image. For better accuracy, use a binary image. If you don’t provide a binary image, this method will treat all the nonzero pixels as ‘1’ while zero pixels remain ‘0’.

The second argument “mode” specifies how you want to retrieve the contours, for example, whether you want to extract only the outer contours or retrieve all contours without establishing any hierarchical relationships. The available options are listed below.

  • cv2.RETR_EXTERNAL – retrieves only the extreme outer contours.
  • cv2.RETR_LIST – retrieves contours without establishing any hierarchical relationships.
  • cv2.RETR_TREE – constructs a full hierarchy of nested contours.
  • cv2.RETR_CCOMP – arranges all the contours into a 2-level hierarchy – outer contours and hole contours.

The third argument “method” denotes the contour approximation method. We don’t need to store all the points of a contour as the same thing can also be represented in a compact manner. For instance, a straight line can be represented by the endpoints. There is no need to store all the points as that would be redundant. OpenCV provides various options for this.

  • cv2.CHAIN_APPROX_NONE – stores all the boundary points.
  • cv2.CHAIN_APPROX_SIMPLE – removes all the redundant points and thus saves memory.
  • cv2.CHAIN_APPROX_TC89_L1 – applies one of the variants of the Teh-Chin chain approximation algorithm

The first output “contours” is a Python list of all the contours in the image. Each individual contour is a Numpy array of (x,y) coordinates of boundary points of the object.

The second output “hierarchy” represents the relationship among the contours, e.g. whether a contour is a child of some other contour, or a parent, etc. OpenCV represents it as an array of four values: [Next, Previous, First_Child, Parent]

  • Next denotes the next contour at the same hierarchical level.
  • Previous denotes the previous contour at the same hierarchical level.
  • First_Child denotes its first child contour.
  • Parent denotes the index of its parent contour.

Depending upon the contour retrieval mode argument, the hierarchy array can take different values. You can read more about it here.

Drawing the Contours

OpenCV provides the following builtin function for drawing the contour.
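In its basic form (further optional arguments such as lineType, hierarchy and maxLevel are omitted):

```python
cv2.drawContours(image, contours, contourIdx, color, thickness)
```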

The first argument is the destination image on which to draw the contours. The second argument is the contours, which should be passed as a Python list. The third argument is the index of the contour that we want to draw (to draw all the contours, pass -1). If the thickness ≥ 0, it draws contour outlines in the image; otherwise, it fills the area bounded by the contours. The optional arguments hierarchy and max-level specify up to which hierarchy level to draw the contours.

Now, let’s take an example to understand the above two functions.
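A minimal sketch; 'shapes.jpg' is a hypothetical path:

```python
import cv2

img = cv2.imread('shapes.jpg')     # hypothetical path
grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(grey, 127, 255, cv2.THRESH_BINARY)

# Find all the contours and draw them on the original image
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cv2.drawContours(img, contours, -1, (0, 255, 0), 2)

cv2.imshow('Contours', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```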

Below is the output of the above code.

Contours OpenCV

Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Suzuki’s Contour tracing algorithm OpenCV-Python

In the previous blog, we discussed various contour tracing algorithms like radial sweep, Moore’s, etc. In this blog, we will discuss another famous contour tracing algorithm known as Suzuki’s algorithm. Many image processing libraries, such as OpenCV, use this border following algorithm for the topological structural analysis of an image. This was one of the first algorithms to define hierarchical relationships among borders. The algorithm also differentiates between outer borders and hole borders. Before discussing the algorithm, let’s understand what outer and hole borders are. The figure below explains this pretty well. (Here we will be dealing with binary images (0 or 1).)

outer and hole border

Now, let’s understand the algorithm using the following image.

image to perform suzuki algorithm

Let’s say fij denotes the value of the pixel at location (i,j). The uppermost row, the lowermost row, the leftmost column, and the rightmost column of a picture compose its frame. We assign a unique number to every new border found and denote it by NBD. We assume the NBD of the frame is 1. The remaining borders are numbered sequentially. We save the information about the parent of any border in LNBD (the last NBD).

Steps:

  • Start scanning the image from left to right until you find an object pixel. Decide whether it is an outer border or a hole border. The criteria for this check are shown in the image below. Thus, while scanning, if we encounter either situation shown in the image below, we can easily tell whether it is the starting point of an outer or a hole border.
Criteria for outer or hole border

Perform the following steps only for pixels >0. Every time we begin to scan a new row, reset LNBD to 1.

Step-1

  1. If it’s an outer border (i.e. fij = 1 and fi,j-1 = 0) then increment the NBD and set (i2, j2) as (i, j-1).
  2. Else if it is a hole border, increment NBD. Set (i2, j2) as (i, j+1) and LNBD = fij in case fij > 1.
  3. Otherwise, go to step 3.

Step-2

Now, from this starting point, we will trace the border. This can be done as

  1. Starting from (i2, j2), look clockwise around the pixels in the neighborhood of (i, j) to find a nonzero pixel and denote it as (i1, j1). If no nonzero pixel is found, set fij = -NBD and go to step 3.
  2. Set (i2, j2) = (i1, j1) and (i3, j3) = (i,j).
  3. Starting from the next element after the pixel (i2, j2) in the counterclockwise order, traverse the neighborhood of (i3, j3) in the counterclockwise direction to find the first nonzero pixel and set it to (i4, j4).
  4. Change the value of the current pixel (i3, j3) as
    1. if the pixel at (i3, j3 +1) is a 0-pixel belonging to the region outside the boundary, set the current pixel value to -NBD.
    2. if the pixel at (i3, j3 +1) is not a 0-pixel and the current pixel value is 1, set the current pixel value to NBD.
    3. Otherwise, do not change the current pixel value.
  5. If, in step 2.3, we return to the starting point again, i.e. (i4, j4) = (i, j) and (i3, j3) = (i1, j1), go to step 3. Otherwise, set (i2, j2) = (i3, j3) and (i3, j3) = (i4, j4) and go back to step 2.3.

Step-3

If fij != 1, then set LNBD = |fij| and resume scanning from the next pixel (i, j+1). The stopping criterion is when we reach the bottom right corner of the image.

The images below show, step by step, the result of one iteration of Suzuki’s algorithm on the above image.

Step1 Suzuki algorithm
Step 2.1 Suzuki algorithm
Step 2.2 Suzuki algorithm
Step 2.3 Suzuki algorithm
Step 2.4.2 Suzuki algorithm
Step 2.5 Suzuki algorithm

Similarly, repeating the above steps, we will get the following output. The hierarchy relationship among the borders is also shown below.

Final output Suzuki algorithm

They also proposed another algorithm that extracts only the outermost borders. OpenCV supports both the hierarchical and the plane variants of Suzuki’s algorithm. You can find the full code here.
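As a quick illustration of the hierarchy this border following produces, here is a small sketch using OpenCV’s builtin function ('nested_shapes.jpg' is a hypothetical image containing nested blobs; assumes OpenCV 4.x):

```python
import cv2

img = cv2.imread('nested_shapes.jpg', cv2.IMREAD_GRAYSCALE)   # hypothetical path
_, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)

# RETR_CCOMP arranges the borders into the 2-level outer/hole hierarchy
# described above; RETR_TREE gives the full nesting instead.
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
print(hierarchy)   # each row: [Next, Previous, First_Child, Parent]
```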

Reference Paper: Topological structural analysis of digitized binary images by border following

So, that’s it for Suzuki’s algorithm. Hope you enjoy reading.

If you have any doubts/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Contour Tracing

In the previous blogs, we discussed various image segmentation methods which result in partitioning the image into sub-regions. Now, the next task is to represent and describe these regions in a form suitable for further image processing tasks such as pattern classification or recognition. One can represent these regions either in terms of the boundary (external features) or in terms of the pixels comprising the region (internal features). So, in this blog, we will discuss one such representation known as contours.

A contour, in simple terms, is a curve joining all the continuous points (along the boundary) having some similar property such as intensity. Once the contours are extracted, we can use them for shape analysis and various object detection and recognition tasks. So, let’s discuss different contour tracing (i.e. detecting the boundary of a region) algorithms. Some of the most common ones are

Square Tracing algorithm

This was one of the first approaches to extract contours and is quite simple. Suppose the background is black (0’s) and the object is white (1’s). Start iterating over the binary or segmented image row by row, from left to right. If you detect a white pixel (i.e. 1), turn left; otherwise, turn right. Here, left and right are relative to the direction in which you entered the pixel. The stopping condition is entering the starting pixel a second time in the same manner you entered it initially. This works best with 4-connectivity as it only checks left and right and misses diagonal directions. A rough sketch of this procedure is given below.
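A minimal sketch of the idea, assuming a binary numpy array with a single blob (the initial entry direction is a convention, and the stopping rule is the same-re-entry criterion described above):

```python
import numpy as np

def square_trace(img):
    """Boundary of a single white blob via square tracing.
    img: 2D numpy array of 0s and 1s."""
    ys, xs = np.nonzero(img)
    start = (int(ys[0]), int(xs[0]))           # first white pixel in a row-wise scan
    dirs = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left (clockwise)
    boundary = [start]
    enter_dir = 0                  # convention: treat the start pixel as entered heading up
    d = (enter_dir - 1) % 4        # on a white pixel we turn left ...
    y, x = start[0] + dirs[d][0], start[1] + dirs[d][1]   # ... and step
    while not ((y, x) == start and d == enter_dir):       # stop on same re-entry direction
        if 0 <= y < img.shape[0] and 0 <= x < img.shape[1] and img[y, x]:
            if boundary[-1] != (y, x):
                boundary.append((y, x))
            d = (d - 1) % 4        # white pixel: turn left
        else:
            d = (d + 1) % 4        # black pixel (or outside the image): turn right
        y, x = y + dirs[d][0], x + dirs[d][1]
    return boundary

blob = np.zeros((5, 5), dtype=np.uint8)
blob[1:4, 1:4] = 1                 # a 3x3 white square
print(square_trace(blob))
```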

Moore Boundary Tracing algorithm

Start iterating row by row from left to right. Then traverse the 8-connected neighborhood of the object pixel found, in the clockwise direction, starting from the background pixel visited just before the object pixel. The stopping criterion is the same as above. This removes the limitations of the previous method.

Radial Sweep

This is similar to the Moore algorithm. After performing the first step of the Moore algorithm, draw a line segment connecting the two object pixels found. Rotate this line segment in the clockwise direction until an object pixel is found in the 8-connectivity. Again draw the line segment and rotate. The stopping criterion is when you encounter the starting pixel a second time, with the same next pixel. For a demonstration, please refer to this.

These are some of the algorithms for contour tracing. In the next blog, we will discuss Suzuki’s algorithm, the one that OpenCV uses for finding and drawing contours. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

References: Wikipedia, Imageprocessingplace

Integral images

In this blog, we will discuss the concept of integral images (or summed-area tables, in general) that lets us efficiently compute statistics like mean, standard deviation, etc. in any rectangular window. This was introduced in 1984 by Frank Crow but became popular due to its use in template matching and object detection (Source). So, let’s first discuss what an integral image is, then why it is efficient and how to compute statistics from it.

The integral image is obtained by summing all the pixels before each pixel (naively, you can think of this as similar to the cumulative distribution function, where a particular value is obtained by summing all the values before it). Let’s take an example to understand this.

Suppose we have a 5×5 binary image as shown below. The integral image is shown on the right.

All the pixels in the integral image are obtained by summing all the previous pixels. Previous here means all the pixels above and to the left of that pixel (inclusive of that pixel). For instance, the 3 (blue circle) is obtained by adding that pixel with the above and left pixels in the input image i.e. 1+0+0+1+0+0+0+1 = 3.

Finding the sum of pixels

Once the integral image is obtained, the sum of pixels in any rectangular region can be obtained in constant time (O(1) time complexity) by the following expression:

Sum = Bottom right + top left – top right – bottom left

For instance, the sum of all the pixels in the rectangular window can be obtained easily from the integral image using the above expression as shown below.

Here, the top right (denoted by B) is 2, not 3. Be careful, as we are finding the integral sum up to that point. For ease of visualization, we can take a 4×4 window in the integral image and then perform the sum. For boundary pixels, pad with 0’s.

Now, the mean can be calculated easily by dividing the sum by the total number of pixels in that window. The standard deviation for any window can be obtained by the following formula, which comes from simply expanding the variance formula (see Wikipedia):

σ = sqrt(S2/n - (S1/n)^2)

Here, S1 is the sum of the rectangular region in the input image, S2 is the sum of the squared pixel values in that region, and n is the number of pixels in the region. Both S1 and S2 can be found easily using integral images. Now, let’s discuss how to implement this using OpenCV-Python. Let’s first discuss the builtin functions provided by OpenCV to calculate the integral image.
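The call is simply (sdepth can be passed as an optional argument):

```python
int_img = cv2.integral(src)   # optionally: cv2.integral(src, sdepth=cv2.CV_64F)
```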

Here, src is the input image and sdepth is an optional argument denoting the depth of the integral image (must be of type CV_32S, CV_32F, or CV_64F). This returns an integral image of size (W+1)x(H+1), i.e. one more than the input image in each dimension. The first row and column of the integral image are all 0’s, to deal with the boundary pixels as explained above. The rest of the pixels are obtained by summing all the previous pixels.

OpenCV also provides a function that returns the integral image of both the input image and its square. This can be done by the following function.
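The call returns both integral images at once:

```python
int_img, int_sq = cv2.integral2(src)   # optional: sdepth, sqdepth
```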

Here, sqdepth is the depth of the integral of the squared image (must be of type CV_32F, or CV_64F). This returns 2 arrays representing the integral of the input image and its square.

Calculate Standard deviation

Let’s verify that the standard deviation calculated by the above formula yields correct results. For this, we will also calculate the standard deviation using the builtin cv2.meanStdDev() function and then compare the results. Below is the code for this.
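A sketch of the check; 'test.jpg' and the window coordinates are illustrative:

```python
import cv2
import numpy as np

img = cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)

# Integral images of the image and of its square
# (their first row and column are all zeros)
int_img, int_sq = cv2.integral2(img)

def window_sum(ii, y0, y1, x0, x1):
    # bottom right + top left - top right - bottom left
    return ii[y1, x1] + ii[y0, x0] - ii[y0, x1] - ii[y1, x0]

# A rectangular window: rows y0..y1-1, columns x0..x1-1
y0, y1, x0, x1 = 10, 60, 20, 80
n = (y1 - y0) * (x1 - x0)

S1 = window_sum(int_img, y0, y1, x0, x1)
S2 = window_sum(int_sq, y0, y1, x0, x1)
mean = S1 / n
std = np.sqrt(S2 / n - mean ** 2)

# Compare with OpenCV's builtin applied directly to the window
m, s = cv2.meanStdDev(img[y0:y1, x0:x1])
print(mean, m[0][0])   # should be equal
print(std, s[0][0])    # should be equal
```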

Thus, calculating the integral image is a simple operation that lets us calculate the image statistics super-fast. Later we will learn how this can be very useful in template matching, face detection, etc. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Pyramids

An image pyramid refers to a way of representing an image at multiple resolutions. The idea behind this is that features that may go undetected at one resolution can be easily detected at some other resolution. For instance, if the region of interest is large in size, a low-resolution image or coarse view is sufficient, while small objects benefit from being examined at high resolution. Now, if both large and small objects are present in an image, analyzing the image at several resolutions can prove beneficial. This is the main concept behind image pyramids. The name “pyramid” comes from the fact that if you place the high-resolution image at the bottom and stack the subsequent lower-resolution images on top, the appearance resembles that of a pyramid.

Thus, constructing an image pyramid is equivalent to performing repeated smoothing and subsampling (reducing the size to half) of an image. This is illustrated in the image below.

Source: Wikipedia

Why blurring? Because it reduces the aliasing or ringing effects that may arise if we downsample directly. The pyramid is named after the type of blurring applied. For instance, if we apply a mean filter, the pyramid is known as a mean pyramid; for a Gaussian filter, a Gaussian pyramid; and if we don’t apply any filtering, it is known as a subsampling pyramid, etc. For subsampling, we can use any interpolation algorithm such as nearest neighbor, bilinear, bicubic, etc. In this blog, we will discuss only two kinds of image pyramids:

  • Gaussian Pyramid
  • Laplacian Pyramid

The Gaussian pyramid involves applying repeated Gaussian blurring and downsampling to an image until some stopping criterion is met. For instance, one stopping criterion can be a minimum image size. OpenCV provides a builtin function to perform the blurring and downsampling, as shown below.
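The call is (dstsize and borderType are optional):

```python
dst = cv2.pyrDown(src)
```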

Here, src is the source image and the rest are optional arguments, which include the output size (dstsize) and the border type. By default, the size of the output image is computed as Size((src.cols+1)/2, (src.rows+1)/2), i.e. each dimension is halved, so the number of pixels is reduced to one-fourth at each step.

This function first convolves the input image with a 5×5 Gaussian kernel and then downsamples the image by rejecting even rows and columns. Below is an example of how to implement the above function.
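(A sketch; 'test.jpg' is a hypothetical path.)

```python
import cv2

img = cv2.imread('test.jpg')       # hypothetical path
lower1 = cv2.pyrDown(img)          # half the width and height
lower2 = cv2.pyrDown(lower1)       # one-fourth of the original resolution

cv2.imshow('Level 0', img)
cv2.imshow('Level 1', lower1)
cv2.imshow('Level 2', lower2)
cv2.waitKey(0)
cv2.destroyAllWindows()
```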

Now, let’s discuss the Laplacian pyramid. Since the Laplacian is a high pass filter, at each level of this pyramid we will get an edge image as output. As we have already discussed in the edge detection blog, the Laplacian can be approximated using the difference of Gaussians. So, here we will take advantage of this fact and obtain the Laplacian pyramid by subtracting Gaussian pyramid levels. Thus, the Laplacian of a level is obtained by subtracting the expanded version of its upper level in the Gaussian pyramid from that level in the Gaussian pyramid. This is illustrated in the figure below.

OpenCV also provides a function to go down the image pyramid or expand a particular level as shown in the figure above.
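The call mirrors cv2.pyrDown (dstsize and borderType are optional):

```python
dst = cv2.pyrUp(src)
```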

This upsamples the input image by injecting even zero rows and columns and then convolves the result with the same 5×5 Gaussian kernel multiplied by 4. By default, the output image size is computed as Size(src.cols*2, src.rows*2). Let’s take an example to illustrate the Laplacian pyramid.

Steps:

  • First load the image
  • Then construct the Gaussian pyramid with 3 levels.
  • For the Laplacian pyramid, the topmost level remains the same as in the Gaussian pyramid. The remaining levels are constructed from top to bottom by subtracting the expanded version of the upper level from the corresponding Gaussian level, as shown in the sketch below.
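A minimal sketch of these steps ('test.jpg' is a hypothetical path; passing dstsize to cv2.pyrUp handles odd-sized levels):

```python
import cv2

img = cv2.imread('test.jpg')       # hypothetical path

# Gaussian pyramid with 3 levels (level 0 is the original image)
gaussian = [img]
for _ in range(2):
    gaussian.append(cv2.pyrDown(gaussian[-1]))

# Laplacian pyramid: the topmost level is copied from the Gaussian pyramid;
# every other level is a Gaussian level minus the expanded level above it.
laplacian = [gaussian[-1]]
for i in range(len(gaussian) - 1, 0, -1):
    h, w = gaussian[i - 1].shape[:2]
    expanded = cv2.pyrUp(gaussian[i], dstsize=(w, h))
    laplacian.append(cv2.subtract(gaussian[i - 1], expanded))
```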

The Laplacian pyramid is mainly used for image compression. Image pyramids can also be used for image blending and for image enhancement which we will discuss in the next blog. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.