Comparing Histograms using OpenCV-Python

In the previous blogs, we discussed histograms at length. We learned histogram equalization, matched a histogram to a specified one, back-projected a histogram to find regions of interest, and even used a histogram to perform image thresholding. In this blog, we will learn how to compare histograms as a measure of similarity. This comparison is useful because many of the things around us can be classified by color. We will look at several single-number evaluation metrics that tell how well two histograms match each other. So, let’s get started.

The histogram comparison methods can be classified into two categories:

  • Bin-to-Bin comparison
  • Cross-bin comparison

Bin-to-bin comparison methods include the L1 and L2 norms of the bin differences, bin intersection, etc. These methods assume that the histogram domains are aligned, but this condition is easily violated in practice by changes in lighting conditions, quantization, etc. Cross-bin comparison methods are more robust and discriminative, but they can be computationally expensive. To circumvent this, one can reduce the cross-bin comparison to a bin-to-bin one. Cross-bin comparison methods include the Earth Mover's Distance (EMD), quadratic-form distances (which take the bin-similarity matrix into account), etc.
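To make the alignment problem concrete, here is a minimal NumPy sketch of three bin-to-bin measures. The helper names (l1_distance, l2_distance, intersection) and the 8-bin toy histogram are made up for illustration and are not part of OpenCV. The same peak shifted by a few bins, roughly what a lighting change might do, is scored as a complete mismatch.

```python
import numpy as np

def l1_distance(h1, h2):
    """Sum of absolute bin differences (lower means more similar)."""
    return float(np.abs(h1 - h2).sum())

def l2_distance(h1, h2):
    """Euclidean distance between the bin vectors (lower means more similar)."""
    return float(np.sqrt(((h1 - h2) ** 2).sum()))

def intersection(h1, h2):
    """Sum of bin-wise minima (higher means more similar)."""
    return float(np.minimum(h1, h2).sum())

# Toy 8-bin histogram, normalized so that the bins sum to 1.
h = np.array([0, 0, 4, 8, 4, 0, 0, 0], dtype=np.float64)
h /= h.sum()

# Same shape, but moved three bins to the right.
shifted = np.roll(h, 3)

print("identical:", l1_distance(h, h), l2_distance(h, h), intersection(h, h))
print("shifted:  ", l1_distance(h, shifted), l2_distance(h, shifted), intersection(h, shifted))
```

Because the two peaks no longer share any bin, the intersection drops to zero even though the shapes are identical. A cross-bin method would instead take into account how far the mass has moved.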

OpenCV provides a built-in function for comparing histograms, cv2.compareHist(H1, H2, method).

Here, H1 and H2 are the histograms we want to compare, and the “method” argument specifies the comparison method. OpenCV provides several built-in methods for histogram comparison, as listed below:

  • HISTCMP_CORREL: Correlation
  • HISTCMP_CHISQR: Chi-Square
  • HISTCMP_CHISQR_ALT: Alternative Chi-Square
  • HISTCMP_INTERSECT: Intersection
  • HISTCMP_BHATTACHARYYA: Bhattacharyya distance
  • HISTCMP_HELLINGER: Synonym for HISTCMP_BHATTACHARYYA
  • HISTCMP_KL_DIV: Kullback-Leibler divergence
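For reference, here are the formulas behind four of these methods as I understand them from the OpenCV documentation. Treat this as a summary to check against the official docs rather than an authoritative statement. The symbols are my notation: H_k(I) is the value of bin I in histogram k, N is the total number of bins, and \bar{H}_k is the mean bin value of histogram k.

```latex
\begin{aligned}
\text{Correlation:}\quad
  d(H_1,H_2) &= \frac{\sum_I \bigl(H_1(I)-\bar{H}_1\bigr)\bigl(H_2(I)-\bar{H}_2\bigr)}
                     {\sqrt{\sum_I \bigl(H_1(I)-\bar{H}_1\bigr)^2 \,\sum_I \bigl(H_2(I)-\bar{H}_2\bigr)^2}},
  \qquad \bar{H}_k = \frac{1}{N}\sum_J H_k(J) \\[4pt]
\text{Chi-Square:}\quad
  d(H_1,H_2) &= \sum_I \frac{\bigl(H_1(I)-H_2(I)\bigr)^2}{H_1(I)} \\[4pt]
\text{Intersection:}\quad
  d(H_1,H_2) &= \sum_I \min\bigl(H_1(I),\,H_2(I)\bigr) \\[4pt]
\text{Bhattacharyya:}\quad
  d(H_1,H_2) &= \sqrt{1-\frac{1}{\sqrt{\bar{H}_1\,\bar{H}_2\,N^2}}\sum_I \sqrt{H_1(I)\,H_2(I)}}
\end{aligned}
```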

For the Correlation and Intersection methods, the higher the metric, the better the match. For Chi-Square and Bhattacharyya, on the other hand, a lower metric value represents a better match. Now, let’s take an example to understand how to use this function. Here, we will compare two images.

Steps:

  • Load the images
  • Convert them into a suitable color model (for example, HSV)
  • Calculate each image's histogram (2D or 3D histograms work better) and normalize it
  • Compare the histograms using the above function

The metric value comes out to be around 0.99, which seems pretty good. Try changing the bin sizes and the comparison methods and observe how the value changes. In the next blog, we will discuss the Earth Mover's Distance (EMD), a cross-bin comparison method that is more robust than these methods. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

1 thought on “Comparing Histograms using OpenCV-Python”

  1. harsha

    How do you calculate the percentage of match between the original and a degraded image?
    And how do you overcome the following scenario when using histograms for comparison:
    – What if the original and the degraded image are different, yet both have the same counts in the R, G, and B bins? How do we compare those two images using histograms?
