
Comparing Histograms using OpenCV-Python

In the previous blogs, we discussed a lot about histograms. We learned histogram equalization, histogram matching (making an image's histogram match a specified histogram), back-projecting a histogram to find regions of interest, and we even used a histogram to perform image thresholding. In this blog, we will learn how to compare histograms as a measure of similarity. This comparison is possible because we can classify many things around us based on color. We will learn various single-number evaluation metrics that tell how well two histograms match each other. So, let's get started.

The histogram comparison methods can be classified into two categories:

  • Bin-to-Bin comparison
  • Cross-bin comparison

Bin-to-bin comparison methods include the L1 and L2 norms for calculating bin distances, bin intersection, etc. These methods assume that the histogram domains are aligned, but this condition is easily violated in most cases due to changes in lighting conditions, quantization, etc. Cross-bin comparison methods are more robust and discriminative, but they can be computationally expensive. To circumvent this, one can reduce a cross-bin comparison to a bin-to-bin one. Cross-bin comparison methods include the Earth Mover's Distance (EMD), quadratic-form distances (which take the bin similarity matrix into account), etc.

OpenCV provides a built-in function, cv2.compareHist(), for comparing histograms, as shown below.
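A minimal illustration of the call is given below; the toy histograms are placeholders, and in practice H1 and H2 come from cv2.calcHist().

import cv2
import numpy as np

# Toy histograms (CV_32F, one column) just to illustrate the call signature
H1 = np.array([[1], [2], [3], [4]], dtype=np.float32)
H2 = np.array([[1], [2], [3], [5]], dtype=np.float32)

# "method" is one of the cv2.HISTCMP_* constants listed below
metric = cv2.compareHist(H1, H2, cv2.HISTCMP_CORREL)
print(metric)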

Here, H1 and H2 are the histograms we want to compare, and the "method" argument specifies the comparison method. OpenCV provides several built-in methods for histogram comparison, as listed below.

  • HISTCMP_CORREL: Correlation
  • HISTCMP_CHISQR: Chi-Square
  • HISTCMP_CHISQR_ALT: Alternative Chi-Square
  • HISTCMP_INTERSECT: Intersection
  • HISTCMP_BHATTACHARYYA: Bhattacharyya distance
  • HISTCMP_HELLINGER: Synonym for HISTCMP_BHATTACHARYYA
  • HISTCMP_KL_DIV: Kullback-Leibler divergence

For the correlation and intersection methods, the higher the metric, the better the match, while for the chi-square and Bhattacharyya methods, a lower value indicates a better match. Now, let's take an example to understand how to use this function. Here, we will compare the two images shown below.

Steps:

  • Load the images
  • Convert them into a suitable color model (e.g. HSV)
  • Calculate the image histograms (2-D or 3-D histograms work better) and normalize them
  • Compare the histograms using the above function
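A minimal sketch following these steps is shown below; the file names img1.jpg and img2.jpg are placeholders, and a 2-D Hue-Saturation histogram in HSV is just one reasonable choice.

import cv2

# Load the two images (placeholder file names)
img1 = cv2.imread('img1.jpg')
img2 = cv2.imread('img2.jpg')

# Convert to a suitable color model, here HSV
hsv1 = cv2.cvtColor(img1, cv2.COLOR_BGR2HSV)
hsv2 = cv2.cvtColor(img2, cv2.COLOR_BGR2HSV)

# 2-D Hue-Saturation histograms, then normalize them
hist1 = cv2.calcHist([hsv1], [0, 1], None, [180, 256], [0, 180, 0, 256])
hist2 = cv2.calcHist([hsv2], [0, 1], None, [180, 256], [0, 180, 0, 256])
cv2.normalize(hist1, hist1, 0, 1, cv2.NORM_MINMAX)
cv2.normalize(hist2, hist2, 0, 1, cv2.NORM_MINMAX)

# Compare the histograms
metric = cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL)
print(metric)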

The metric value comes out to be around 0.99, which seems pretty good. Try changing the bin sizes and the comparison methods and observe the change. In the next blog, we will discuss the Earth Mover's Distance (EMD), a cross-bin comparison method that is more robust than these methods. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Histogram Matching (Specification)

In the previous blog, we discussed Histogram Equalization, which tries to produce an output image with a uniform histogram. This approach is good, but in some cases it does not work well. One such case is when we have a skewed image histogram, i.e. a large concentration of pixels at either end of the greyscale.

One reasonable approach is to manually specify a transformation function that preserves the general shape of the original histogram but has a smoother transition of intensity levels in the skewed regions.

So, in this blog, we will learn how to transform an image so that its histogram matches a specified histogram. This is also known as histogram matching or histogram specification.

Histogram Equalization is a special case of histogram matching where the specified histogram is uniformly distributed.

First let’s understand the main idea behind histogram matching.

We will first equalize both the original and the specified histogram using the Histogram Equalization method. Since the equalization transformation function is invertible, we can invert it to obtain a mapping from the original to the specified histogram. The whole operation is shown in the image below.

For example, suppose the pixel value 10 in the original image gets mapped to 20 in the equalized image. Then we check which value in the specified image gets mapped to 20 in its equalized image; let's say this value is 28. So, we can say that 10 in the original image gets mapped to 28 in the specified image.
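To make the chaining concrete, here is a toy sketch using just these hypothetical values.

# Hypothetical single-value transforms, only to illustrate the chaining
T_orig = {10: 20}   # original  -> equalized
T_spec = {28: 20}   # specified -> equalized

# Invert the specified transform: equalized -> specified
T_spec_inv = {v: k for k, v in T_spec.items()}

# 10 -> 20 -> 28
print(T_spec_inv[T_orig[10]])   # prints 28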

Most of you might be wondering why both the original and the specified histogram converge to the same uniform histogram on equalization.

This is true only if we assume continuous intensity values. But in reality, intensity values are discrete, so the original and specified histograms may not map to the same histogram on equalization. That's why histogram matching is not able to perfectly match the specified histogram.

Let's take an example where we want to match the original image to the specified image; both histograms are shown below.

Here, I am taking the original image from the histogram equalization blog. All the steps of equalization are explained in that blog, so here I will only show the final table.

Original Image Histogram Equalization

Specified Image Histogram Equalization

After equalizing both images, we need to perform a mapping from the original, through the equalized, to the specified image. For that, we only need the round columns of the original and specified image tables, as shown below.

Pick the values one by one from the round column of the original image, find each in the round column of the specified image, and note down the index. For example, for 3 in the round original column, we have 3 in the round specified column (at index 1), so we map it to 1.

If the value doesn't exist, find the index of the nearest one. For example, for 0 in the round original column, 1 is the nearest value in the round specified column (at index 0), so we map it to 0.

If multiple equally near values exist, pick the one that is greater than the value. For example, for 2 in the round original column, there are two closest values in the round specified column, i.e. 1 and 3, so we pick 3 (at index 1) and map 2 to 1.

After obtaining the map column, replace the values in the original image with the map values. This gives the final result. A small sketch of this mapping procedure is shown below.
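The round columns in this sketch are hypothetical stand-ins for the ones obtained from the equalization tables; the three rules above are applied as written.

import numpy as np

# Hypothetical "round" columns from the two equalization tables
round_original = np.array([0, 2, 3, 5, 6, 7])
round_specified = np.array([1, 3, 4, 5, 6, 7])

mapping = []
for v in round_original:
    diff = np.abs(round_specified - v)
    nearest = np.where(diff == diff.min())[0]               # all equally near indices
    greater = [i for i in nearest if round_specified[i] > v]
    mapping.append(greater[0] if greater else nearest[0])   # prefer the greater value

print(mapping)   # 0 -> 0, 2 -> 1, 3 -> 1, and so on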

The matched histogram (shown on the left) approximately matches the specified histogram (shown on the right), as shown below.

Now, let's see how to perform histogram matching using OpenCV-Python.

Code
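The listing below is a minimal sketch of the procedure for greyscale images, using the CDF-based mapping described above; the file names original.jpg and specified.jpg are placeholders.

import cv2
import numpy as np

def match_histograms(original, specified):
    # Normalized CDFs of both greyscale images
    orig_hist = cv2.calcHist([original], [0], None, [256], [0, 256]).ravel()
    spec_hist = cv2.calcHist([specified], [0], None, [256], [0, 256]).ravel()
    orig_cdf = orig_hist.cumsum() / orig_hist.sum()
    spec_cdf = spec_hist.cumsum() / spec_hist.sum()

    # For every original grey level, find the specified grey level whose
    # CDF value is closest (the inverse mapping described above)
    mapping = np.zeros(256, dtype=np.uint8)
    for level in range(256):
        mapping[level] = np.argmin(np.abs(spec_cdf - orig_cdf[level]))

    # Replace each pixel with its mapped value
    return mapping[original]

original = cv2.imread('original.jpg', cv2.IMREAD_GRAYSCALE)
specified = cv2.imread('specified.jpg', cv2.IMREAD_GRAYSCALE)
matched = match_histograms(original, specified)
cv2.imwrite('matched.jpg', matched)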

Note: The specified image can have different dimensions than the original image.

The output looks like this

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

2D Histogram

In the last blog, we discussed 1-D histograms, in which we analyze each channel separately. Suppose we want to find the correlation between image channels, say, how many times the (red, green) pair (100, 56) appears in an image. In such a case, a 1-D histogram fails because it does not capture the relationship between the intensities of the two channels at the same position.

To solve this problem, we need multi-dimensional histograms such as 2-D or 3-D histograms. With the help of 2-D histograms, we can analyze the channels together in groups of two (RG, GB, BR), or all together with a 3-D histogram. Let's see what a 2-D histogram is and how to construct one using OpenCV-Python.

A 2-D histogram counts the occurrences of combinations of intensities. The figure below shows a 2-D histogram.

Here, the Y and X axes correspond to the red and green channel ranges (for 8-bit images, [0, 255]), and each point within the histogram shows the frequency of the corresponding (R, G) pair. The frequency is color-coded here; otherwise, another dimension would be needed.

Let’s understand how to construct a 2-D histogram by taking a simple example.

Suppose we have 4×4, 2-bit images of the red and green channels (as shown below) and we want to plot their 2-D histogram.

  • First, we plot the R and G channel ranges (here, [0, 3]) on the X and Y axes respectively. This will be our 2-D histogram.
  • Then, we loop over each position within the channels, find the frequency of the corresponding intensity pair, and plot it in the 2-D histogram. These frequencies are then color-coded for ease of visualization.
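A small sketch of this construction, using hypothetical 4×4, 2-bit channel values:

import numpy as np

# Hypothetical 4x4, 2-bit (values 0-3) red and green channels
R = np.array([[0, 1, 2, 3],
              [1, 1, 2, 0],
              [3, 2, 1, 0],
              [0, 3, 3, 2]])
G = np.array([[0, 1, 1, 3],
              [2, 1, 2, 0],
              [3, 3, 1, 0],
              [0, 2, 3, 2]])

# 4x4 2-D histogram: rows index R values, columns index G values
hist2d = np.zeros((4, 4), dtype=int)
for r, g in zip(R.ravel(), G.ravel()):
    hist2d[r, g] += 1

print(hist2d)   # frequency of every (R, G) pair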

Now, let’s see how to construct a 2-D histogram using OpenCV-Python

We use the same function, cv2.calcHist(), that we used for a 1-D histogram. Just change the following parameters; the rest stays the same.

  • channels: [0,1] for (Blue, Green), [1,2] for (G, R) and [0,2] for (B, R).
  • bins: specify for each channel according to your need, e.g. [256, 256].
  • range: [0,256,0,256] for an 8-bit image.

Below is sample code for this using OpenCV-Python.
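This is a minimal sketch; the file name image.jpg is a placeholder, and the Blue and Green channels are chosen for illustration.

import cv2
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg')   # placeholder file name

# 2-D histogram over the Blue (index 0) and Green (index 1) channels
hist = cv2.calcHist([img], [0, 1], None, [256, 256], [0, 256, 0, 256])

# Display the frequencies color-coded; see the note on interpolation below
plt.imshow(hist, interpolation='nearest')
plt.xlabel('Green')
plt.ylabel('Blue')
plt.colorbar()
plt.show()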

Always use Nearest Neighbour Interpolation when plotting a 2-D histogram.

Plotting a 2-D histogram using RGB channels is not a good choice for extracting color information, as we cannot describe color using only two of those channels. Still, it can be used for finding the correlation between channels, detecting clipping, checking intensity proportions, etc.

To extract color information, we need a color model in which two components/channels can solely represent the chromaticity (color) of the image. One such color model is HSV, where H and S tell us about the color of the light. So, first convert the image from BGR to HSV and then apply the above code, as sketched below.
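A sketch of the HSV version is shown here; the file name is again a placeholder, and note that OpenCV stores Hue in the range [0, 180) for 8-bit images.

import cv2
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg')   # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# 2-D Hue-Saturation histogram; Hue spans [0, 180) for 8-bit images
hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])

plt.imshow(hist, interpolation='nearest')
plt.xlabel('Saturation')
plt.ylabel('Hue')
plt.colorbar()
plt.show()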

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.