Tag Archives: image pyramids opencv python

Image Pyramids

Image pyramid refers to the way of representing an image at multiple resolutions. The idea behind this is that features that may go undetected at one resolution can be easily detected at some other resolution. For instance, if the region of interest is large in size, a low-resolution image or coarse view is sufficient. While for small objects, it’s beneficial to examine them at high resolution. Now, if both large and small objects are present in an image, analyzing the image at several resolutions can prove beneficial. This is the main concept behind image pyramids. The name “pyramid” because if you place the high-resolution image at the bottom and stack subsequent low-resolution images on top, the appearance resembles that of a pyramid.

Thus constructing an image pyramid is equivalent to performing repeated smoothing and subsampling (reducing the size to half) an image. This is illustrated in the image below

Source: Wikipedia

Why blurring? Because this reduces the aliasing or ringing effects that may arise if we downsample directly. Now depending upon the type of blurring applied the pyramid is named. For instance, if we apply a mean filter, the pyramid is known as the mean pyramid, Gaussian filter – Gaussian pyramid and if we don’t apply any filtering, this is known as subsampling pyramid, etc. For subsampling, we can use any interpolation algorithm such as the nearest neighbor, bilinear, bicubic, etc. In this blog, we will discuss only two kinds of image pyramids

  • Gaussian Pyramid
  • Laplacian Pyramid

Gaussian pyramid involves applying repeated Gaussian blurring and downsampling an image until some stopping criteria are met. For instance, one of the stopping criteria can be the minimum image size. OpenCV provides a builtin function to perform blurring and downsampling as shown below

Here, src is the source image and rest are optional arguments which includes the output size (dstsize) and the border type. By default, the size of the output image is computed as Size((src.cols+1)/2, (src.rows+1)/2) i.e. the size is reduced to one-fourth at each step.

This function first convolves the input image with a 5×5 Gaussian kernel and then downsamples the image by rejecting even rows and columns. Below is an example of how to implement the above function.

Now, let’s discuss the Laplace pyramid. Since Laplacian is a high pass filter, so at each level of this pyramid, we will get an edge image as an output. As we have already discussed in the edge detection blog that the Laplacian can be approximated using the difference of Gaussian. So, here we will take advantage of this fact and obtain the Laplacian pyramid by subtracting the Gaussian pyramid levels. Thus the Laplacian of a level is obtained by subtracting that level in Gaussian Pyramid and expanded version of its upper level in Gaussian Pyramid. This is illustrated in the figure below.

OpenCV also provides a function to go down the image pyramid or expand a particular level as shown in the figure above.

This upsamples the input image by injecting even zero rows and columns and then convolves the result with the 5×5 Gaussian kernel multiplied by 4. By default, output image size is computed as Size(src.cols*2, (src.rows*2). Let’s take an example to illustrate the Laplacian pyramid.

Steps:

  • First load the image
  • Then construct the Gaussian pyramid with 3 levels.
  • For the Laplacian pyramid, the topmost level remains the same as in Gaussian. The remaining levels are constructed from top to bottom by subtracting that Gaussian level from its upper expanded level.

The Laplacian pyramid is mainly used for image compression. Image pyramids can also be used for image blending and for image enhancement which we will discuss in the next blog. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Blending using Image Pyramids

In the previous blog, we discussed image pyramids, and how to construct a Laplacian pyramid from the Gaussian. In this blog, we will discuss how image pyramids can be used for image blending. This produces more visually appealing results as compared to different blending methods we discussed until now. Below are the steps for image blending using image pyramids.

Steps:

  1. Load the two images and the mask.
  2. Find the Gaussian pyramid for the two images and the mask.
  3. From the Gaussian pyramid, calculate the Laplacian pyramid for the two images as explained in the previous blog.
  4. Now, blend each level of the Laplacian pyramid according to the mask image of the corresponding Gaussian level.
  5. From this blended Laplacian pyramid, reconstruct the original image. This is done by expanding the level and adding it to the below level as shown in the figure below. Here LS0, LS1, LS2, and LS3 are the levels of the blended Laplacian pyramid obtained in step 4.

Now, let’s implement the above steps using OpenCV-Python. Suppose we want to blend the two images corresponding to the mask as shown below.

Mask Image

So, we will clip the jet image from the second image and blend it to the first image. Below is the code for the steps explained above.

The blended output is shown below

Still, there is some amount of white gaze around the jet. Later, we will discuss gradient-domain blending methods which improve the result even more. Now, compare this image with a simple copy and paste operation and see the difference.

You can do a side-by-side blending also. In the next blog, we will discuss how to perform image enhancement and image compression using the Laplacian pyramids. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.