Shape detection is an important area in the field of image processing. In the previous blog, we discussed how to perform simple shape detection using contours. Since that approach relied on edge detection as a pre-processing step, it was more susceptible to noise and missing edge points. To overcome this, in this blog we will discuss the Hough Transform, a popular method for detecting simple shapes that gives robust results under noise and partial occlusion. We will take line detection as an example and see how the Hough transform can be used to detect lines in an image. This method of line detection is also known as the Hough Line Transform. So, let’s get started.
Hough Line Transform
Before going into detail, let’s first refresh some high school maths concepts that will be useful for understanding this.
We know that a line in the (x, y) space, y = mx + c, corresponds to a single point (m, c) in the parameter space, as shown below. Why? Because a line has a fixed slope (m) and intercept (c).
So, how will this help? By changing the space, we have reduced the line (consisting of many points) to a single point, thus reducing storage and further computation.
Similarly, a point in the (x, y) space corresponds to a line in the parameter space, as shown below. Why? Because infinitely many lines can pass through a point, and each of these lines gives one point in the parameter space. For a fixed point (x0, y0), these points all satisfy c = -x0*m + y0, which is itself a line in the (m, c) space.
Now, let’s combine the above two. Suppose you are given 2 points. Where does the line joining these 2 points map in the parameter space? It maps to the intersection point of the two parameter-space lines, as shown below.
This is what the Hough transform does. For each edge point, we draw the corresponding line in the parameter space and then find the points of intersection (if any). The intersection points give us the parameters (slope and intercept) of the lines in the image.
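To make the point-to-line mapping concrete, here is a minimal sketch in plain Python. The helper names `point_to_parameter_line` and `intersect` are made up for this illustration and are not part of any library. Each image point (x, y) becomes the parameter-space line c = -x*m + y, and intersecting two such lines recovers the slope and intercept of the image line through both points.

```python
def point_to_parameter_line(x, y):
    """Map an image point (x, y) to its parameter-space line c = -x*m + y.

    The line is returned as (slope, intercept) in the (m, c) plane.
    """
    return (-x, y)


def intersect(line_a, line_b):
    """Intersect two parameter-space lines c = s*m + t.

    Returns (m, c), or None when the lines are parallel.
    """
    s1, t1 = line_a
    s2, t2 = line_b
    if s1 == s2:
        # Both image points share the same x: the image line is vertical
        return None
    m = (t2 - t1) / (s1 - s2)
    c = s1 * m + t1
    return (m, c)


# Two points taken from the image line y = 2x + 1
p1, p2 = (1, 3), (4, 9)
m, c = intersect(point_to_parameter_line(*p1), point_to_parameter_line(*p2))
print(m, c)  # 2.0 1.0
```

Note that `intersect` has no answer when the two points share the same x, since the two parameter-space lines are then parallel and never meet.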
Problem
But there is one big problem with this parameter space representation. As you might have guessed, we can’t represent vertical lines, as this would require an infinite m.
Solution
We use the polar or normal form of a line:
r = x*cosΘ + y*sinΘ
Here, r is the perpendicular distance from the origin to the line and Θ is the angle that this perpendicular makes with the x-axis, as shown below.
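You might ask: isn’t r unbounded, since it can take any value from 0 to ∞? But since images are of finite size, the magnitude of r can never exceed the diagonal length of the image. (If Θ is restricted to the range 0 to 180 degrees, as in the algorithm below, r can also be negative, so it ranges from -d to +d for a diagonal of length d.) Both r and Θ are finite and thus the above problem is resolved.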
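As a quick sanity check of the normal form, the sketch below evaluates r = x*cosΘ + y*sinΘ for points on a vertical and a horizontal line. The helper `r_for` is a hypothetical name used only for this illustration. Every point on the vertical line x = 5 gives the same finite pair (r = 5, Θ = 0°), which the slope-intercept form could not represent.

```python
import math

def r_for(x, y, theta_deg):
    """Evaluate r = x*cosΘ + y*sinΘ for a point (x, y) and an angle in degrees."""
    theta = math.radians(theta_deg)
    return x * math.cos(theta) + y * math.sin(theta)

# Vertical line x = 5: infinite slope, but in normal form it is just Θ = 0°, r = 5
for y in (0, 10, 250):
    print(r_for(5, y, 0))

# Horizontal line y = 3: Θ = 90°, r = 3 (up to floating-point rounding)
for x in (0, 10, 250):
    print(round(r_for(x, 3, 90), 6))
```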
Now, let’s see what a line and a point correspond to in the (r, Θ) space.
A line in the (x,y) space still corresponds to a point in the (r, Θ) space. But a point in the (x,y) space is now equivalent to a sinusoidal curve in the (r, Θ) space as shown below.
Thus, for each edge pixel, we now draw its sinusoidal curve in the (r, Θ) space and then find the points where the curves intersect (if any).
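The following sketch illustrates this with three collinear points. The points, the 1° sampling step and the helper name `curve` are all assumptions chosen for the example. Each point is turned into its sinusoid r(Θ), and we look for the angle at which the three curves agree.

```python
import math

def curve(x, y):
    """Sample r(Θ) = x*cosΘ + y*sinΘ at Θ = 0°, 1°, ..., 179°."""
    return [x * math.cos(math.radians(t)) + y * math.sin(math.radians(t))
            for t in range(180)]

points = [(0, 4), (2, 2), (4, 0)]  # all on the line x + y = 4
curves = [curve(x, y) for x, y in points]

# Spread of the three r values at each Θ; it drops to (almost) zero
# at the angle where the three sinusoids cross
spread = [max(rs) - min(rs) for rs in zip(*curves)]
theta_star = min(range(180), key=spread.__getitem__)
r_star = curves[0][theta_star]
print(theta_star, round(r_star, 3))  # 45 2.828
```

The crossing at Θ = 45° and r ≈ 2.828 matches the normal form of x + y = 4, whose perpendicular from the origin has length 4/√2 and makes a 45° angle with the x-axis.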
Implementation
To find the points of intersection, the Hough transform uses a voting method. First, the parameter space (r, Θ) is discretized into bins as shown below. We call this the accumulator. Then, for each edge pixel, we calculate the r value corresponding to each Θ value, and for each (r, Θ) pair obtained, we increase the count of that accumulator cell. Finally, we find the bins with the highest counts; these correspond to the detected lines. Below is the pseudo-code of the Hough line transform algorithm.
Algorithm
- Initialize the accumulator (H) to all zeros
- For each edge pixel (x,y) in the image
  - For Θ = 0 to 180
    - Calculate r (r = x*cosΘ + y*sinΘ)
    - H(Θ,r) = H(Θ,r) + 1
  - endFor
- endFor
- Find the (Θ,r) value(s), where H(Θ,r) is above a suitable threshold value.
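The pseudo-code above can be sketched in plain Python roughly as follows. This is only an illustration under some assumptions: bins are 1° wide in Θ and 1 pixel wide in r, Θ runs over 0° to 179°, the input is a list of edge-pixel coordinates rather than an image, and the function name `hough_lines` is made up here. It is not how OpenCV implements it.

```python
import math

def hough_lines(edge_points, width, height, threshold):
    """Vote in a (Θ, r) accumulator and return the cells with enough votes.

    Each returned tuple is (theta_in_degrees, r_in_pixels, votes).
    """
    diag = math.ceil(math.hypot(width, height))
    # r can be negative for Θ in [0°, 180°), so the r axis covers -diag..+diag
    H = [[0] * (2 * diag + 1) for _ in range(180)]
    for x, y in edge_points:
        for t in range(180):
            theta = math.radians(t)
            r = round(x * math.cos(theta) + y * math.sin(theta))
            H[t][r + diag] += 1  # shift r so that it is a valid list index
    return [(t, r_idx - diag, votes)
            for t, row in enumerate(H)
            for r_idx, votes in enumerate(row)
            if votes >= threshold]

# Synthetic "edge image": 20 points on the vertical line x = 10,
# plus two stray noise points
points = [(10, y) for y in range(0, 100, 5)] + [(3, 7), (25, 1)]
peaks = hough_lines(points, width=128, height=128, threshold=20)
print(peaks)  # [(0, 10, 20)]
```

With these inputs, the only cell that collects 20 votes is Θ = 0°, r = 10, which is the vertical line x = 10. The two noise points never gather enough votes to cross the threshold.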
OpenCV
OpenCV provides a built-in function, cv2.HoughLines(), that finds the lines in a binary image using the above algorithm. It takes as input the binary image, the resolution of the accumulator (in r and Θ) and the threshold value, and outputs an array of (r, Θ) values. The syntax is given below.
```python
lines = cv2.HoughLines(image, rho, theta, threshold)
# image: 8-bit, single-channel binary source image
# rho: Distance resolution of the accumulator in pixels.
# theta: Angle resolution of the accumulator in radians.
# threshold: Only those lines are returned that get enough votes
```
Below is the code for this.
```python
import cv2
import numpy as np

# Read the image
img_orig = cv2.imread('D:/downloads/accumlator.JPG')
img = img_orig.copy()

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Find the edges using Canny detector
edges = cv2.Canny(gray, 50, 200)

# Apply the hough transform
lines = cv2.HoughLines(edges, 1, np.pi/180, 200)

# Draw the lines (HoughLines returns None when no line gets enough votes)
if lines is not None:
    for line in lines:
        rho, theta = line[0]
        a = np.cos(theta)
        b = np.sin(theta)
        # (x0, y0) is the point on the line closest to the origin
        x0 = a*rho
        y0 = b*rho
        # Extend 1000 pixels in both directions along the line
        x1 = int(x0 + 1000*(-b))
        y1 = int(y0 + 1000*(a))
        x2 = int(x0 - 1000*(-b))
        y2 = int(y0 - 1000*(a))
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)

cv2.imshow('a', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
Below is the result.
Limitations
- Choosing a good accumulator size is difficult.
  - Too coarse: different lines will fall into a single bin
  - Too fine: votes will fall into neighboring bins for points which are not exactly collinear
- Time complexity increases exponentially with the number of parameters
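To get a feel for the last point: the accumulator needs one axis per shape parameter, so its size grows as a power of the number of parameters. The short sketch below assumes, purely for illustration, 100 bins along every axis, and uses a hypothetical helper `accumulator_cells`.

```python
def accumulator_cells(bins_per_parameter, n_parameters):
    """Number of cells in an accumulator with one axis per shape parameter."""
    return bins_per_parameter ** n_parameters

# A line needs 2 parameters (r, Θ), a circle 3 (center x, center y, radius),
# and a general ellipse 5 (center x, center y, two axes, orientation)
for shape, n_params in [("line", 2), ("circle", 3), ("ellipse", 5)]:
    print(shape, accumulator_cells(100, n_params))
```

Going from lines to circles already multiplies the number of cells by 100 under this assumption, and every edge pixel has to vote across the extra dimension.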
That’s all for this blog. In the next blog, we will discuss how to detect some other shapes such as circles, ellipses, etc. Hope you enjoyed reading.
If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.