Introduction to SIFT (Scale-Invariant Feature Transform)

In the previous blogs, we discussed some corner detectors such as Harris Corner, Shi-Tomasi, etc. If you remember, these corner detectors are rotation invariant, which basically means that even if the image is rotated, we would still be able to detect the same corners. This is obvious because corners remain corners in the rotated image as well. But when it comes to scaling, these algorithms suffer and don’t give satisfactory results, because if we scale the image, a corner may not remain a corner. Let’s understand this with the help of the following image (Source: OpenCV)

See, on the left we have a corner inside the small green window. But when this corner is zoomed in (see on the right), it no longer remains a corner within a window of the same size. So, this is the issue that scaling poses. I hope you understood this.

So, to solve this, in 2004, D. Lowe of the University of British Columbia, in his paper Distinctive Image Features from Scale-Invariant Keypoints, came up with a new algorithm, the Scale-Invariant Feature Transform (SIFT). This algorithm not only detects features but also describes them. And the best thing about these features is that they are invariant to changes in

  • Scale
  • Rotation
  • Illumination (partially)
  • Viewpoint (partially)
  • Minor image artifacts/ Noise/ Blur

That’s why this was a breakthrough in the field at that time. So, you can use these features to perform different tasks such as object recognition, tracking, image stitching, etc., and don’t need to worry about scale, rotation, etc. Isn’t this cool, and that too back in 2004!

There are mainly four steps involved in the SIFT algorithm to generate the set of image features:

  • Scale-space extrema detection: As is clear from the name, we first search over all scales and image locations (space) and determine the approximate location and scale of feature points (also known as keypoints). In the next blog, we will discuss how this is done, but for now just remember that the first step simply finds the approximate location and scale of the keypoints.
  • Keypoint localization: In this step, we take the keypoints detected in the previous step and refine their location and scale to subpixel accuracy. For instance, if the approximate location is 17, then after refinement this may become 17.35 (more precise). Don’t worry, we will discuss how this is done in the next blogs. After the refinement step, we discard bad keypoints such as edge points and low-contrast keypoints. So, after this step we get a robust set of keypoints.
  • Orientation assignment: Then we calculate the orientation for each keypoint using its local neighborhood. All future operations are performed on image data that has been transformed relative to the assigned orientation, scale, and location for each feature, thereby providing invariance to these transformations.
  • Keypoint descriptor: All the previous steps ensured invariance to image location, scale, and rotation. Finally, we create a descriptor vector for each keypoint such that the descriptor is highly distinctive and partially invariant to the remaining variations such as illumination, 3D viewpoint, etc. This helps in uniquely identifying features. Once we have obtained these features along with their descriptors, we can do whatever we want, such as object recognition, tracking, stitching, etc. This sums up the SIFT algorithm at a coarse level.

Because SIFT is an extensive algorithm, we won’t be covering it in a single blog. We will understand each of these 4 steps in separate blogs and finally we will implement this using OpenCV-Python (a quick preview is sketched below). And as we proceed, we will also understand how this algorithm achieves the scale, rotation, illumination, and viewpoint invariance discussed above.
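
Just to give a feel for where we are headed, here is a minimal OpenCV-Python sketch for detecting and drawing SIFT keypoints. Treat it as a preview under some assumptions: OpenCV 4.4+ (on older builds SIFT lives in the contrib module as cv2.xfeatures2d.SIFT_create()), and 'home.jpg' is just a placeholder file name.

```python
import cv2

# Read the input image in grayscale ('home.jpg' is a placeholder name)
img = cv2.imread('home.jpg', cv2.IMREAD_GRAYSCALE)

# Create the SIFT detector (cv2.xfeatures2d.SIFT_create() on older builds)
sift = cv2.SIFT_create()

# Detect keypoints; each one carries a location, scale (size), and orientation (angle)
keypoints = sift.detect(img, None)

# Draw the keypoints with circles showing their scale and orientation
out = cv2.drawKeypoints(img, keypoints, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('sift_keypoints.jpg', out)
```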

So, let’s start with scale-space extrema detection in the next blog and understand it in detail. See you there. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Goodbye until next time.

Feature Detection, Description, and Matching

In the previous blogs, we discussed different segmentation algorithms such as watershed, grabcut, etc. From this blog, we will start another interesting topic known as Feature Detection, Description, and Matching. This has many applications in the field of computer vision, such as image stitching and object tracking, and it serves as the first step for many other computer vision applications. Over the past few decades, a number of algorithms have been proposed, but before diving into these algorithms, let’s first understand what features are in general and why they are important. So, let’s get started.

What is a Feature?

According to Wikipedia, a feature is any piece of information that is relevant for solving a given task. For instance, let’s say we have the task of identifying an apple in an image. The features useful in this case can be shape, color, texture, etc.

Now that you know what features are, let’s try to understand which features are more important than others. For this, let’s take the example of image matching. Suppose you are given two images (see below) and your task is to match the rectangle present in the first image with the one in the second. And let’s say you are given 3 feature points: A, a flat area; B, an edge; and C, a corner. So now the question is, which of these is a better feature for matching the rectangle?

Clearly, A is a flat area, so it’s difficult to find the exact location of this point in the other image; thus, it is not a good feature point for matching. For B (an edge), we can find the approximate location but not the accurate one, because the patch looks the same everywhere along the edge. An edge is, therefore, a better feature than a flat area, but still not good enough. But we can easily and accurately locate C (a corner) in the other image, so it is considered a good feature. This is why corners are considered to be good features in an image. These feature points are also known as interest points.
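
As a quick illustration of corners as interest points, here is a small sketch using the Shi-Tomasi detector from the earlier blogs (cv2.goodFeaturesToTrack); the file name and parameter values are placeholders you would tune for your own image.

```python
import cv2

# Read the image in grayscale ('box.jpg' is a placeholder name)
img = cv2.imread('box.jpg', cv2.IMREAD_GRAYSCALE)

# Shi-Tomasi corner detection: up to 25 strong corners, at least 10 px apart
corners = cv2.goodFeaturesToTrack(img, maxCorners=25,
                                  qualityLevel=0.01, minDistance=10)

# Mark each detected corner on a color copy of the image
vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
for c in corners:
    x, y = map(int, c.ravel())
    cv2.circle(vis, (x, y), 3, (0, 0, 255), -1)
cv2.imwrite('corners.jpg', vis)
```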

What is a good feature or interest point?

A good feature or interest point is one that is robust to changes in illumination or brightness and to scale, and that can be reliably computed with a high degree of repeatability. It should also give us enough knowledge about the task (see the corner feature points for matching above). In addition, a good feature should be unique, distinctive, and local, so that it remains usable even when only part of an object is visible.

So, I hope now you have some idea about the features. Now, let’s take a look at some of the applications of Feature Detection, Description, and Matching.

Applications

  • Object tracking
  • Image matching
  • Object recognition
  • 3D object reconstruction
  • Image stitching
  • Motion-based segmentation

All these applications follow the same general steps, i.e., Feature Detection, Feature Description, and Feature Matching. All these steps are discussed below.

Steps

First, we detect all the feature points. This is known as Feature Detection. There are several algorithms developed for this, such as

  • Harris Corner
  • SIFT (Scale-Invariant Feature Transform)
  • SURF (Speeded-Up Robust Features)
  • FAST (Features from Accelerated Segment Test)
  • ORB (Oriented FAST and Rotated BRIEF)

We will discuss each of these algorithms in detail in the next blogs.
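
Just as a small taste of the detection step, below is a sketch using FAST from the list above; the file name and the threshold value are placeholder choices.

```python
import cv2

# Read the image in grayscale ('scene.jpg' is a placeholder name)
img = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)

# Create a FAST detector and detect the keypoints
fast = cv2.FastFeatureDetector_create(threshold=25)
keypoints = fast.detect(img, None)
print('Detected', len(keypoints), 'keypoints')

# Draw the detected keypoints in green
out = cv2.drawKeypoints(img, keypoints, None, color=(0, 255, 0))
cv2.imwrite('fast_keypoints.jpg', out)
```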

Then we describe each of these feature points. This is known as Feature Description. Suppose we have 2 images as shown below, both of which contain corners. So, the question is: are they the same or different?

Obviously, both are different, as the first one contains a green area to the lower right while the other one has a green area to the upper right. So, basically, what you did is describe both these features, and that is what led us to the answer. Similarly, a computer should also describe the region around a feature so that it can find it in other images. This is feature description. There are also several algorithms for this, such as

  • SIFT (Scale-Invariant Feature Transform)
  • SURF (Speeded-Up Robust Features)
  • BRISK (Binary Robust Invariant Scalable Keypoints)
  • BRIEF (Binary Robust Independent Elementary Features)
  • ORB (Oriented FAST and Rotated BRIEF)

As you might have noticed, some of the above algorithms also appeared under feature detection. These algorithms perform both feature detection and description. We will discuss each of these algorithms in detail in the next blogs.
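
To make the detect-then-describe flow concrete, here is a sketch with ORB, which (as noted above) does both steps in one call; the file name is a placeholder.

```python
import cv2

# Read the image in grayscale ('scene.jpg' is a placeholder name)
img = cv2.imread('scene.jpg', cv2.IMREAD_GRAYSCALE)

# ORB detects keypoints and computes a binary descriptor for each one
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)

# Each row of 'descriptors' is a 32-byte binary descriptor for one keypoint
print(descriptors.shape)  # e.g. (500, 32)
```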

Once we have the features and their descriptors, the next task is to match these features across different images. This is known as Feature Matching. Below are some of the algorithms for this:

  • Brute-Force Matcher
  • FLANN (Fast Library for Approximate Nearest Neighbors) Matcher

We will discuss each of these algorithms in detail in the next blogs. Hope you enjoy reading.
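
And to close the loop, here is a Brute-Force Matcher sketch that matches ORB descriptors between two images; the file names are placeholders, and FLANN would slot in similarly via cv2.FlannBasedMatcher.

```python
import cv2

# Read the two images to match ('query.jpg'/'train.jpg' are placeholder names)
img1 = cv2.imread('query.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('train.jpg', cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors with ORB
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching with Hamming distance (appropriate for binary descriptors)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Draw the 20 best matches side by side
out = cv2.drawMatches(img1, kp1, img2, kp2, matches[:20], None,
                      flags=cv2.DRAW_MATCHES_FLAGS_NOT_DRAW_SINGLE_POINTS)
cv2.imwrite('matches.jpg', out)
```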

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Goodbye until next time.