In this blog, we will discuss the concept of integral images (also known as summed-area tables), which let us efficiently compute statistics such as the mean and standard deviation over any rectangular window. The idea was introduced by Frank Crow in 1984 and became popular through its use in template matching and object detection (Source). So, let’s first discuss what an integral image is, then why it is efficient, and finally how to compute the statistics from it.
An integral image is obtained by summing all the pixels before each pixel (naively, you can think of this as similar to a cumulative distribution function, where each value is obtained by summing all the values before it). Let’s take an example to understand this.
Suppose we have a 5×5 binary image as shown below. The integral image is shown on the right.
Each pixel in the integral image is obtained by summing all the previous pixels of the input image. "Previous" here means all the pixels above and to the left of that pixel (inclusive of that pixel). For instance, the 3 (blue circle) is obtained by adding that pixel to the pixels above and to the left of it in the input image, i.e. 1+0+0+1+0+0+0+1 = 3.
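The definition above can be sketched with two cumulative sums in NumPy. The 5×5 array below is a hypothetical binary image chosen for illustration, not the one from the figure.

```python
import numpy as np

# Hypothetical 5x5 binary image (an assumption, not the figure's image).
img = np.array([[1, 0, 0, 1, 0],
                [0, 0, 1, 0, 0],
                [0, 1, 0, 1, 0],
                [1, 0, 0, 0, 1],
                [0, 0, 1, 0, 0]], dtype=np.int64)

# Cumulative sum down the rows, then across the columns.
integral = img.cumsum(axis=0).cumsum(axis=1)

# Each entry equals the sum of the input over everything above and to
# the left of it, inclusive of the pixel itself.
r, c = 2, 3
assert integral[r, c] == img[:r + 1, :c + 1].sum()

print(integral)
```

Note that this plain cumulative sum has the same size as the input; it does not yet include the extra row and column of zeros used for boundary handling.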
Finding the sum of pixels
Once the integral image is obtained, the sum of pixels in any rectangular region can be obtained in constant time (O(1) time complexity) by the following expression:
Sum = Bottom right + top left – top right – bottom left
For instance, the sum of all the pixels in the rectangular window can be obtained easily from the integral image using the above expression as shown below.
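Here, top right (denoted by B) is 2, not 3. Be careful: each value in the integral image is the sum up to and including that point, so the top right, top left and bottom left values must be read just outside the window (one row above and/or one column to the left of it), not from the window's own corner pixels. For ease of visualization, we can take a 4×4 window in the integral image and then apply the expression to its corners. For boundary pixels, pad with 0’s.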
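As a minimal sketch of the four-corner lookup, assuming a zero-padded integral image (one extra row on top and one extra column on the left), the helpers below are hypothetical names introduced only for illustration.

```python
import numpy as np

def padded_integral(img):
    """Integral image with an extra leading row and column of zeros."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0, dtype=np.int64).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom+1, left:right+1] using four lookups."""
    # Sum = bottom right + top left - top right - bottom left
    return (ii[bottom + 1, right + 1] + ii[top, left]
            - ii[top, right + 1] - ii[bottom + 1, left])

img = np.arange(20).reshape(4, 5)   # arbitrary test data
ii = padded_integral(img)

# Rows 1..2 and columns 1..3 of the input image.
fast = rect_sum(ii, 1, 1, 2, 3)
slow = img[1:3, 1:4].sum()
print(fast, slow)
```

Because of the padding, a window touching the top or left border needs no special case: the corresponding corner lookups simply return 0.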
Now the mean can be calculated easily by dividing the sum by the total number of pixels in that window. The standard deviation for any window can be obtained by the following formula, which comes from simply expanding the variance formula (see Wikipedia).
Here, S1 is the sum of the rectangular region in the input image, S2 is the sum of the squares of that region in the input image, and n is the number of pixels in that region. Both S1 and S2 can be found easily using integral images. Now, let’s discuss how to implement this using OpenCV-Python. Let’s first discuss the built-in functions provided by OpenCV to calculate the integral image.
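Written out, expanding the population variance and taking the square root gives the following. This form is inferred from the expression used in the code further down, so treat it as a reconstruction rather than the original figure.

```latex
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2
         = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \mu^2
         = \frac{S_2}{n} - \frac{S_1^2}{n^2}
         = \frac{1}{n}\left(S_2 - \frac{S_1^2}{n}\right),
\qquad
\sigma = \sqrt{\frac{1}{n}\left(S_2 - \frac{S_1^2}{n}\right)}
```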
```python
cv2.integral(src[, sdepth])
```
Here, src is the input image and sdepth is the optional argument denoting the depth of the integral image (must be of type CV_32S, CV_32F, or CV_64F). This returns an integral image of size (W+1)x(H+1), i.e. one more than the input image in each dimension. The first row and first column of the integral image are all 0’s to deal with the boundary pixels, as explained above. All the remaining pixels are obtained by summing all the previous pixels.
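To make the layout concrete, here is a small NumPy sketch that mimics the zero-padded layout described above. It is an illustration of the layout only, not OpenCV's implementation.

```python
import numpy as np

img = np.ones((3, 4), dtype=np.uint8)   # H = 3 rows, W = 4 columns

# Plain cumulative sums, then one row of zeros on top and one column
# of zeros on the left.
cumulative = img.cumsum(axis=0, dtype=np.int64).cumsum(axis=1)
ii = np.pad(cumulative, ((1, 0), (1, 0)), mode="constant")

print(ii.shape)   # (4, 5), i.e. (H + 1, W + 1) as a NumPy shape
print(ii)
```

NumPy reports shapes as (rows, columns), so the (W+1)x(H+1) size quoted above appears as (H+1, W+1) when you inspect the array's shape.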
OpenCV also provides a function that returns the integral image of both the input image and its square. This can be done by the following function.
```python
cv2.integral2(src[, sdepth[, sqdepth]])
```
Here, sqdepth is the depth of the integral of the squared image (must be of type CV_32F or CV_64F). This returns two arrays: the integral of the input image and the integral of its square.
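With both tables available, the standard deviation of any window takes eight lookups. The sketch below builds the two zero-padded tables in NumPy (an assumption made so the example runs without OpenCV) and uses hypothetical helper names.

```python
import numpy as np

def integral_tables(img):
    """Zero-padded integral images of img and of img**2, as float64."""
    img = img.astype(np.float64)
    pad = ((1, 0), (1, 0))
    s = np.pad(img.cumsum(axis=0).cumsum(axis=1), pad, mode="constant")
    sq = np.pad((img ** 2).cumsum(axis=0).cumsum(axis=1), pad, mode="constant")
    return s, sq

def window_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom and columns left..right (inclusive)."""
    return (ii[bottom + 1, right + 1] + ii[top, left]
            - ii[top, right + 1] - ii[bottom + 1, left])

def window_std(s, sq, top, left, bottom, right):
    """Population standard deviation of a window from the two tables."""
    n = (bottom - top + 1) * (right - left + 1)
    s1 = window_sum(s, top, left, bottom, right)
    s2 = window_sum(sq, top, left, bottom, right)
    var = (s2 - s1 ** 2 / n) / n
    return np.sqrt(max(var, 0.0))   # clamp tiny negatives from round-off

img = np.array([[0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 0, 0, 0]], dtype=np.uint8)

s, sq = integral_tables(img)
std_fast = window_std(s, sq, 1, 0, 4, 3)      # rows 1..4, columns 0..3
std_ref = img[1:5, 0:4].std()                 # direct computation
print(std_fast, std_ref)
```

The tables are built once; after that, each window costs the same regardless of its size, which is what makes sliding-window statistics cheap.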
Calculate Standard deviation
Let’s verify that the standard deviation calculated by the above formula yields correct results. For this, we will also calculate the standard deviation using the built-in cv2.meanStdDev() function and then compare the two results. Below is the code for this.
```python
import cv2
import numpy as np

img = np.array([[0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 1, 1, 1, 0],
                [0, 0, 0, 0, 0]], dtype='uint8')

# Calculate the standard deviation
# Here I'm taking the full image, you can take any rectangular region

# Method-1: using cv2.meanStdDev()
mean, std_1 = cv2.meanStdDev(img, mask=None)

# Method-2: using the formula sqrt((S2 - (S1**2)/n)/n)
sum_1, sqsum_2 = cv2.integral2(img)
n = img.size

# The sum of the region can be found from the integral image as
# Sum = Bottom right + top left - top right - bottom left
# For the full image the other three corners lie in the zero padding,
# so only the bottom-right value is needed.
s1 = sum_1[-1, -1]
s2 = sqsum_2[-1, -1]
std_2 = np.sqrt((s2 - (s1**2)/n)/n)

print(std_1, std_2)  # [[0.45825757]] 0.4582575694
```
Thus, calculating the integral image is a simple operation that lets us calculate the image statistics super-fast. Later we will learn how this can be very useful in template matching, face detection, etc. Hope you enjoy reading.
If you have any doubts or suggestions, please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.