Write Text on images at mouse click position using OpenCV-Python

In the previous blog, we discussed how to write text on images in real-time. There, we manually specified the position for text placement, which becomes quite tedious if we want to write text at multiple positions.

So, what if we automate this process? That is, we automatically get the coordinates of the image where we click and then put text at that position using the cv2.putText() function, as we did in the previous blog.

This is what we will do in this blog, i.e. write text on images at the mouse click position. To do this, we will create a mouse callback function and then bind it to the image window.

A mouse callback function is executed whenever a mouse event takes place. A mouse event refers to anything we do with the mouse, like a double click, a left click, etc. All available events can be found using the following code:
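For example:

import cv2

# Print every mouse-event constant that OpenCV defines
events = [i for i in dir(cv2) if 'EVENT' in i]
print(events)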

Below is an example of a simple mouse callback function that draws a circle where we double click.
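Something like this (a minimal sketch; a blank 512×512 canvas stands in for your image):

import cv2
import numpy as np

img = np.zeros((512, 512, 3), np.uint8)

# Draw a filled blue circle wherever the left button is double-clicked
def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDBLCLK:
        cv2.circle(img, (x, y), 20, (255, 0, 0), -1)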

We then need to bind this callback function to the image window. This is done using
cv2.setMouseCallback(window_name, mouse_callback_function) as shown below
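Continuing the sketch above:

cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

# Keep refreshing the window until 'q' is pressed
while True:
    cv2.imshow('image', img)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()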

I hope you understood the mouse callback function. Now, let's get started.

Steps:

  • Create a mouse callback function that puts text on the image at every left double click position.
  • Create or read an image.
  • Create an image window using cv2.namedWindow()
  • Bind the mouse callback function to the image window using cv2.setMouseCallback()
  • Display the new image using an infinite while loop

Code:
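Here is a minimal sketch of such a program (imagined details: a blank 512×512 canvas serves as the image, typed characters are drawn one by one at the clicked position, and Esc exits):

import cv2
import numpy as np

img = np.zeros((512, 512, 3), np.uint8)
pos = None        # position where the next character will be drawn
writing = False   # True after a double click, False after pressing 'q'

def start_writing(event, x, y, flags, param):
    global pos, writing
    if event == cv2.EVENT_LBUTTONDBLCLK:
        pos = [x, y]
        writing = True

cv2.namedWindow('image')
cv2.setMouseCallback('image', start_writing)

while True:
    cv2.imshow('image', img)
    k = cv2.waitKey(20) & 0xFF
    if k == 27:                 # Esc exits
        break
    if k == ord('q'):           # 'q' stops writing until the next double click
        writing = False
    elif writing and k != 255:  # 255 means no key was pressed
        cv2.putText(img, chr(k), tuple(pos), cv2.FONT_HERSHEY_SIMPLEX,
                    1, (255, 255, 255), 2)
        pos[0] += 20            # advance so characters don't overlap
cv2.destroyAllWindows()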

In the above code, press 'q' to stop writing and double click the left button anywhere to start writing again.

You can play with the mouse callback function using other mouse events. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Creating a Snake Game using OpenCV-Python

Isn't it interesting to create a snake game using OpenCV-Python? And what if I tell you that you are only going to need

  • cv2.imshow()
  • cv2.waitKey()
  • cv2.putText()
  • cv2.rectangle()

So, let’s get started.

Import Libraries

For this, we only need four libraries.
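These are likely the four (cv2 for display, numpy for the game window, random for apple placement, and time for the keypress timer):

import cv2
import numpy as np
import random
import time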

Displaying Game Objects

  • Game Window: Here, I have used a 500×500 image as my game window.
  • Snake and Apple: I have used green squares for displaying a snake and a red square for an apple. Each square has a size of 10 units.
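A sketch of drawing these objects (snake_position, an assumed list of [x, y] squares, and apple_position, an assumed [x, y] pair, hold the game state):

# 500x500 black image as the game window
img = np.zeros((500, 500, 3), np.uint8)

# Draw each 10x10 segment of the snake in green
for x, y in snake_position:
    cv2.rectangle(img, (x, y), (x + 10, y + 10), (0, 255, 0), -1)

# Draw the 10x10 apple in red
ax, ay = apple_position
cv2.rectangle(img, (ax, ay), (ax + 10, ay + 10), (0, 0, 255), -1)

cv2.imshow('Snake Game', img)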

Game Rules

Now, let’s define some game rules

  • Collision with boundaries: If the snake collides with the boundaries, it dies.
  • Collision with self: If the snake collides with itself, it should die. For this, we only need to check whether the snake's head is in the snake's body or not.
  • Collision with apple: If the snake collides with the apple, the score is increased and the apple is moved to a new location.

Also, on eating an apple, the snake's length should increase; otherwise, the snake keeps moving as it is. These rules might be sketched as shown below.
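A sketch of the collision checks (snake_position, snake_head and apple_position are assumed names; the window is 500×500 and each square is 10 units):

def collision_with_boundaries(snake_head):
    return (snake_head[0] < 0 or snake_head[0] >= 500 or
            snake_head[1] < 0 or snake_head[1] >= 500)

def collision_with_self(snake_position):
    # The head is the first square; check if it reappears in the body
    return snake_position[0] in snake_position[1:]

def eat_apple(score):
    # Move the apple to a random square and increase the score
    apple_position = [random.randrange(1, 50) * 10,
                      random.randrange(1, 50) * 10]
    return apple_position, score + 1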

  • The snake game has a fixed time window for a keypress. If you press a key within that time, the snake should move in that direction; otherwise, it continues moving in the previous direction. Sadly, with the OpenCV cv2.waitKey() function, if you hold down a direction button, the snake starts moving faster in that direction. So, to make the snake movement uniform, I did something like this (see the sketch after the next paragraph).

Since cv2.waitKey() returns -1 when no key is pressed, 'k' stores the first key pressed within that window. And because the while loop runs for a fixed time, it doesn't matter how quickly you press a key; the loop always waits the same fixed time.
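A sketch of that timing loop (the 0.2-second window is an assumed value):

t_end = time.time() + 0.2
k = -1
while time.time() < t_end:
    if k == -1:
        k = cv2.waitKey(1)  # returns -1 if no key was pressed in 1 ms
    # once a key has been captured, just wait out the remaining time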

  • Snake cannot move backward: Here, I have used the w, a, s, d keys for moving the snake. If the snake was moving right and we press the left key, it will continue moving right; in short, the snake cannot directly move backwards.

After seeing which direction key was pressed, we change our head position, roughly as follows.
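Using the same assumed names (direction and snake_head; each step is 10 units):

# Update direction, disallowing a direct reversal
if k == ord('a') and direction != 'right':
    direction = 'left'
elif k == ord('d') and direction != 'left':
    direction = 'right'
elif k == ord('w') and direction != 'down':
    direction = 'up'
elif k == ord('s') and direction != 'up':
    direction = 'down'

# Move the head one square in the current direction
if direction == 'left':
    snake_head[0] -= 10
elif direction == 'right':
    snake_head[0] += 10
elif direction == 'up':
    snake_head[1] -= 10
elif direction == 'down':
    snake_head[1] += 10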

Displaying the final Score

For displaying the final score, I have used the cv2.putText() function.
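Something along these lines (the placement and font are assumed):

img = np.zeros((500, 500, 3), np.uint8)
cv2.putText(img, 'Your Score is {}'.format(score), (100, 250),
            cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
cv2.imshow('Snake Game', img)
cv2.waitKey(0)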

Finally, our snake game is ready and looks like this

The full code can be found here.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

2D Histogram

In the last blog, we discussed 1-D histograms, in which we analyze each channel separately. Suppose we want to find the correlation between image channels; say we are interested in how many times a (red, green) pair of (100, 56) appears in an image. In such a case, a 1-D histogram fails, as it does not show the relationship between the intensities of the two channels at the same position.

To solve this problem, we need multi-dimensional histograms, like 2-D or 3-D. With the help of 2-D histograms, we can analyze the channels together in groups of 2 (RG, GB, BR), or all together with 3-D histograms. Let's see what a 2-D histogram is and how to construct one using OpenCV-Python.

A 2-D histogram counts the occurrences of combinations of intensities. The below figure shows a 2-D histogram.

Here, the Y and X-axis correspond to the Red and Green channel ranges (for 8-bit, [0,255]), and each point within the histogram shows the frequency of each (R, G) pair. Frequency is color-coded here; otherwise, another dimension would be needed.

Let’s understand how to construct a 2-D histogram by taking a simple example.

Suppose we have 4×4, 2-bit images of the Red and Green channels (as shown below) and we want to plot their 2-D histogram.

  • First, we plot the R and G channel ranges (here, [0,3]) on the X and Y-axis respectively. This will be our 2-D histogram.
  • Then, we loop over each position within the channels, find the frequency of each corresponding intensity pair and plot it in the 2-D histogram. These frequencies are then color-coded for ease of visualization.

Now, let’s see how to construct a 2-D histogram using OpenCV-Python

We use the same function, cv2.calcHist(), that we used for a 1-D histogram. Just change the following parameters, and the rest is the same.

  • channels: [0,1] for (Blue, Green), [1,2] for (G, R) and [0,2] for (B, R).
  • bins: specify for each channel according to your need, e.g. [256,256].
  • range: [0,256,0,256] for an 8-bit image.

Below is the sample code for this using OpenCV-Python
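A minimal sketch ('image.jpg' is a placeholder filename):

import cv2
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg')

# 2-D histogram over the Blue and Green channels of an 8-bit image
hist = cv2.calcHist([img], [0, 1], None, [256, 256], [0, 256, 0, 256])

plt.imshow(hist, interpolation='nearest')
plt.xlabel('Green bins')
plt.ylabel('Blue bins')
plt.show()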

Always use Nearest Neighbour Interpolation when plotting a 2-D histogram.

Plotting a 2-D histogram using RGB channels is not a good choice, as we cannot extract color information using only 2 channels. Still, it can be used for finding the correlation between channels, finding clipping, intensity proportions, etc.

To extract color information, we need a color model in which two components/channels can solely represent the chromaticity (color) of the image. One such color model is HSV where H and S tell us about the color of the light. So, first convert the image from BGR to HSV and then apply the above code.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Image Histograms

In this blog, we will discuss the image histogram, a must-have tool in your pocket. It helps in contrast enhancement, image segmentation, image compression, thresholding, etc. Let's see what an image histogram is and how to plot one using OpenCV and matplotlib.

What is an Image Histogram?

An image histogram tells us how the intensity values are distributed in an image. In it, we plot the intensity values on the x-axis and the number of pixels with each intensity value on the y-axis. See the figure below.

This is called a 1-D histogram because we take only one feature into consideration, i.e. the greyscale intensity value of the pixel. In the next blog, we will discuss 2-D histograms.

Now, let’s understand some terminologies associated with histogram

Tonal range refers to the region where most of the intensity values are present (see the above figure). The left side represents the black and dark areas, known as shadows; the middle represents medium grey, or midtones; and the right side represents the light and pure white areas, known as highlights.

So, for a dark image, the histogram covers mostly the left side and center of the graph, while for a bright image, the histogram mostly rests on the right side and center, as shown in the figure below.

Now, let’s see how to plot the histogram for an image using OpenCV and matplotlib.

OpenCV: To calculate the image histogram, OpenCV provides the following function

cv2.calcHist(image, channel, mask, bins, range) 

  • image : the input image, passed in a list, e.g. [image]
  • channel : index of the channel; for greyscale pass [0], and for a color image pass the desired channel as [0], [1] or [2].
  • mask : provide a mask if you want to calculate the histogram for a specific region; otherwise pass None.
  • bins : the number of bins to use for each channel, passed as e.g. [256]
  • range : the range of intensity values; for an 8-bit image pass [0,256]

This returns a numpy.ndarray of shape (n_bins, 1), which can then be plotted using matplotlib. Below is the code for this.
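A minimal sketch ('image.jpg' is a placeholder filename):

import cv2
from matplotlib import pyplot as plt

# Read the image as greyscale
img = cv2.imread('image.jpg', 0)

hist = cv2.calcHist([img], [0], None, [256], [0, 256])

plt.plot(hist)
plt.xlabel('Intensity value')
plt.ylabel('No. of pixels')
plt.show()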

Matplotlib: Unlike OpenCV, matplotlib finds and plots the histogram directly using plt.hist().
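For the same greyscale image:

plt.hist(img.ravel(), 256, [0, 256])
plt.show()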

For a color image, we can show each channel individually, or we can first convert it into greyscale and then calculate the histogram. So, a color histogram can be expressed as three intensity (greyscale) histograms, each of which shows the brightness distribution of an individual Red/Green/Blue channel. The below figure summarizes this.

Original Color Image
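To plot each channel's histogram in its own color, something like this works (img here is the BGR image read with cv2.imread):

for i, col in enumerate(('b', 'g', 'r')):
    hist = cv2.calcHist([img], [i], None, [256], [0, 256])
    plt.plot(hist, color=col)
plt.show()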

So, always look at the histogram of an image before doing any other pre-processing operation. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Contrast Stretching

In the previous blog, we discussed the meaning of contrast in image processing and how to identify low and high contrast images, and at last we discussed the causes of low contrast in an image. In this blog, we will learn about methods of contrast enhancement.

Below figure summarizes the Contrast Enhancement process pretty well.

Source: OpenCV

Depending upon the transformation function used, Contrast Enhancement methods can be divided into Linear and Non-Linear.

The linear methods include the Contrast-Stretching transformation, which uses piecewise linear functions, while the non-linear methods include Histogram Equalization, Gaussian Stretch, etc., which use non-linear transformation functions obtained automatically from the histogram of the input image.

In this blog, we will discuss only the linear methods. The rest we will discuss in the next blogs.

Contrast stretching, as the name suggests, is an image enhancement technique that tries to improve the contrast by stretching the intensity values of an image to fill the entire dynamic range. The transformation function used is always linear and monotonically increasing.

Below figure shows a typical transformation function used for Contrast Stretching.

By changing the location of points (r1, s1) and (r2, s2), we can control the shape of the transformation function. For example,

  1. When r1 = s1 and r2 = s2, the transformation is a linear function that produces no change in intensity.
  2. When r1 = r2, s1 = 0 and s2 = L-1, the transformation becomes a thresholding function.
  3. When (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1), this is known as Min-Max Stretching.
  4. When (r1, s1) = (rmin + c, 0) and (r2, s2) = (rmax - c, L-1), this is known as Percentile Stretching.

Let’s understand Min-Max and Percentile Stretching in detail.

In Min-Max Stretching, the lower and upper values of the input image are made to span the full dynamic range. In other words, the lowest value of the input image is mapped to 0 and the highest value is mapped to 255. All other intermediate values are reassigned new intensity values according to the following formula.
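For an 8-bit image, this is

X_new = ((X - X_min) / (X_max - X_min)) * 255

where X is an input pixel value and X_min, X_max are the minimum and maximum intensity values present in the input image.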

Sometimes, when Min-Max stretching is performed, the tail ends of the histogram become long, resulting in little improvement in image quality. So, it is better to clip a certain percentage, like 1% or 2%, of the data from the tail ends of the input image histogram. This is known as Percentile Stretching. The formula is the same as Min-Max, but now X_max and X_min are the clipped values.

Let’s understand Min-Max and Percentile Stretching with an example. Suppose we have an image whose histogram looks like this

Clearly, this histogram has a left tail with few values (around 70 to 120). So, when we apply Min-Max stretching, the result looks like this

Clearly, Min-Max stretching doesn’t improve the results much. Now, let’s apply Percentile Stretching

Since we clipped the long tail of the input histogram, Percentile Stretching produces much better results than Min-Max stretching.

Let’s see how to perform Min-Max Stretching using OpenCV-Python
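A minimal sketch ('image.jpg' is a placeholder filename; the image is assumed not to be constant, i.e. max > min):

import cv2
import numpy as np

# Read as greyscale and work in float to avoid integer overflow
img = cv2.imread('image.jpg', 0).astype(float)

# Map [Xmin, Xmax] to the full [0, 255] range
minmax = (img - img.min()) / (img.max() - img.min()) * 255
minmax = minmax.astype(np.uint8)

cv2.imshow('Min-Max Stretching', minmax)
cv2.waitKey(0)
cv2.destroyAllWindows()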

For a color image, either change it into greyscale and then apply contrast stretching, or change it into another color model like HSV and then apply contrast stretching on the V channel. For Percentile Stretching, just replace the min and max values with the clipped values; the rest of the code is the same.

So, always plot the histogram first and then decide which method to follow. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

What is Contrast in Image Processing?

According to Wikipedia, Contrast is the difference in luminance or color that makes an object distinguishable from other objects within the same field of view.

Take a look at the images shown below

Source: OpenCV

Clearly, the left image has a low contrast because it is difficult to identify the details present in the image as compared to the right image.

A real-life example is a sunny versus a foggy day. On a sunny day everything looks clear to us, and thus has high contrast, compared to a foggy day, where everything looks nearly the same intensity (a dull, washed-out grey look).

A more reliable way to check whether an image has low or high contrast is to plot the image histogram. Let's plot the histograms for the above images.

Clearly, from the left image's histogram, we can see that the intensity values are located in a narrow range. Because nearly equal intensity values are hard to distinguish (see the figure below: 150 and 148 are harder to tell apart than 50 and 200), the left image has low contrast.

The right histogram widens the gap between the intensity values, and whoo! the details in the image are now much more perceivable to us, yielding a high contrast image.

So, for high contrast, the image histogram should span the entire dynamic range, as shown above by the right histogram. In the next blogs, we will learn different methods to do this.

There is another, naive approach where we subtract the min intensity value from the max and judge the image contrast based on this difference. I do not recommend this, as it can be affected by outliers (we will discuss this in the next blogs). So, always plot the histogram to check.

Till now, we discussed contrast but we didn’t discuss the cause of low contrast images.

Low contrast images can result from poor illumination, a lack of dynamic range in the imaging sensor, or even a wrong lens aperture setting during image acquisition.

When performing contrast enhancement, you must first decide whether you want global or local contrast enhancement. Global means increasing the contrast of the whole image, while in local we divide the image into small regions and perform contrast enhancement on each region independently. Don't worry, we will discuss these in detail in the next blogs.

This concept is beautifully illustrated by the figure shown below (taken from the OpenCV documentation).

Original Image

Clearly, with global enhancement, the details present on the face of the statue are lost, while they are preserved with local enhancement. So, you need to be careful when selecting between these methods.

In the next blog, we will discuss the methods used to transform a low contrast image into a high contrast image. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Intensity-level Slicing

Intensity level slicing means highlighting a specific range of intensities in an image. In other words, we segment certain gray level regions from the rest of the image.

Suppose that in an image your region of interest always takes values between, say, 80 and 150. Intensity level slicing highlights this range, so instead of looking at the whole image, one can focus on the highlighted region of interest.

Since one can think of it as a piecewise linear transformation function, it can be implemented in several ways. Here, we will discuss the two basic types of slicing that are used most often.

  • In the first type, we display the desired range of intensities in white and suppress all other intensities to black, or vice versa. This results in a binary image. The transformation function for both cases is shown below.
  • In the second type, we brighten or darken the desired range of intensities (a to b, as shown below) and leave the other intensities unchanged, or vice versa. The transformation function for both cases, first where the desired range is changed and second where it is unchanged, is shown below.

Let's see how to do intensity level slicing using OpenCV-Python. The code below is for type 1, as discussed above.
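A minimal sketch ('image.jpg' is a placeholder filename; [80, 150] is the example range used above):

import cv2
import numpy as np

img = cv2.imread('image.jpg', 0)

# Type 1: desired range -> white (255), everything else -> black (0)
sliced = np.where((img >= 80) & (img <= 150), 255, 0).astype(np.uint8)

cv2.imshow('Sliced', sliced)
cv2.waitKey(0)
cv2.destroyAllWindows()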

For a color image, either convert it into greyscale or specify the minimum and maximum ranges as lists of BGR values.

Applications: Mostly used for enhancing features in satellite and X-ray images.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Power Law (Gamma) Transformations

"Gamma Correction": most of you might have heard of this strange-sounding thing. In this blog, we will see what it means and why it matters to you.

The general form of Power law (Gamma) transformation function is

s = c * r^γ

where 's' and 'r' are the output and input pixel values, respectively, and 'c' and γ are positive constants. Like the log transformation, power-law curves with γ < 1 map a narrow range of dark input values into a wider range of output values, with the opposite being true for higher input values. Similarly, for γ > 1, we get the opposite result, as shown in the figure below.

This is also known as gamma correction, gamma encoding or gamma compression. Don’t get confused.

The below curves are generated for r values normalized to [0, 1] and then multiplied by the scaling constant c corresponding to the bit size used.

All the curves are scaled. Don’t get confused (See below)

But the main question is why we need this transformation, what’s the benefit of doing so?

To understand this, we first need to know how our eyes perceive light. Human brightness perception follows an approximate power function (as shown below), according to Stevens' power law for brightness perception.

See from the above figure: if we change the input from 0 to 10, the output changes from 0 to about 50, but changing the input from 240 to 255 barely changes the output. This means that we are more sensitive to changes in the dark tones than in the bright ones. You may have noticed this yourself!

But our camera does not work like this. Unlike human perception, a camera responds linearly: if the light falling on the camera increases 2 times, the output also increases 2-fold. The camera curve looks like this

So, where and what is the actual problem?

The actual problem arises when we display the image.

You might be amazed to know that all display devices, like your computer screen, have an intensity-to-voltage response curve which is a power function with exponents (gamma) varying from 1.8 to 2.5.

This means that for any input signal (say, from a camera), the output will be transformed by this gamma (also known as Display Gamma) because of the non-linear intensity-to-voltage relationship of the display screen. This results in images that are darker than intended.

To correct this, we apply gamma correction to the input signal (since we know the display's response, we simply apply the inverse transformation); this is known as Image Gamma. This gamma is applied automatically by encoding algorithms such as JPEG, so the image looks normal to us.

This encoding cancels out the effect produced by the display, and we see the image as it is. The whole procedure can be summed up by the following figure

If images are not gamma-encoded, they allocate too many bits for the bright tones that humans cannot differentiate and too few bits for the dark tones. So, by gamma encoding, we remove this artifact.

Images which are not properly corrected can look either bleached out or too dark.

Let's verify in code that γ < 1 produces images that are brighter, while γ > 1 results in images that are darker than intended.
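A minimal sketch ('image.jpg' is a placeholder filename; the gamma values 0.4 and 2.2 are just examples):

import cv2
import numpy as np

img = cv2.imread('image.jpg')

def adjust_gamma(image, gamma):
    # Normalize to [0, 1], apply s = r**gamma, scale back to [0, 255]
    return np.uint8(np.power(image / 255.0, gamma) * 255)

cv2.imshow('gamma = 0.4 (brighter)', adjust_gamma(img, 0.4))
cv2.imshow('gamma = 2.2 (darker)', adjust_gamma(img, 2.2))
cv2.waitKey(0)
cv2.destroyAllWindows()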

The output looks like this

Original Image
Gamma Encoded Images

I hope you understand gamma encoding. In the next blog, we will discuss contrast stretching, a piecewise-linear transformation function, in detail. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Creating Subplots in OpenCV-Python

In this blog, we will learn how to create subplots using OpenCV-Python. We know that cv2.imshow() shows only 1 image at a time, and displaying images side by side helps greatly in analyzing the results. Unlike MATLAB, OpenCV has no direct function for creating subplots. But since OpenCV reads images as arrays, we can concatenate them using the built-in cv2.hconcat() and cv2.vconcat() functions and then display the concatenated image using cv2.imshow().

cv2.hconcat([img1, img2]) returns the horizontally concatenated image. The same holds for cv2.vconcat(), which concatenates vertically.

Below is sample code where I display 2 gamma-corrected images using this method.
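A sketch along these lines ('image.jpg' is a placeholder filename; the gamma values are just examples):

import cv2
import numpy as np

img = cv2.imread('image.jpg')

# Two gamma-corrected versions: brighter (0.4) and darker (2.2)
g1 = np.uint8(np.power(img / 255.0, 0.4) * 255)
g2 = np.uint8(np.power(img / 255.0, 2.2) * 255)

cv2.imshow('Gamma corrected', cv2.hconcat([g1, g2]))
cv2.waitKey(0)
cv2.destroyAllWindows()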

The output looks like this

To put text on the images, use cv2.putText(), and if you want to leave spacing between the displayed images, use cv2.copyMakeBorder(). You can play around with many other OpenCV functions.

Note: array dimensions must match when using cv2.hconcat(). This means you cannot display a color and a greyscale image side by side using this method.

I hope this information will help you. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Bit-plane Slicing

You probably know that everything on a computer is stored as strings of bits. In bit-plane slicing, we take advantage of this fact to perform various image operations. Let's see how.

I hope you have a basic understanding of the relationship between binary and decimal.

For an 8-bit image, a pixel value of 0 is represented as 00000000 in binary form and 255 is encoded as 11111111. Here, the leftmost bit is known as the most significant bit (MSB), as it contributes the most: e.g. if the MSB of 11111111 is changed to 0 (i.e. 01111111), the value changes from 255 to 127. Similarly, the rightmost bit is known as the least significant bit (LSB).

In bit-plane slicing, we divide the image into bit planes. This is done by first converting the pixel values into binary form and then separating the bits into planes. Let's see an example.

For simplicity, let's take a 3×3, 3-bit image as shown below. We know that pixel values in a 3-bit image range from 0 to 7.

Bit Plane Slicing

I hope you understand what bit plane slicing is and how it is performed. The next question that comes to mind is: what's the benefit of doing this?

Pros:

  • Image Compression (we will see later how we can reconstruct nearly the original image using fewer bits).
  • Converting a grey level image to a binary image. In general, an image reconstructed from bit planes is similar to applying some intensity transformation function to the original image; e.g. the image reconstructed from the MSB alone is the same as applying a thresholding function to the original image. We will validate this in the code below.
  • Through this, we can analyze the relative importance of each bit in the image, which helps in determining the number of bits used to quantize the image.

Let’s see how we can do this using OpenCV-Python

Code
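A minimal sketch ('image.jpg' is a placeholder filename):

import cv2
import numpy as np

img = cv2.imread('image.jpg', 0)

# Extract the 8 bit planes: plane i holds bit i of every pixel,
# scaled to 255 so each plane is viewable as a binary image
planes = [np.uint8((img >> i) & 1) * 255 for i in range(8)]

# planes[7] is the MSB (bit plane 8), planes[0] the LSB (bit plane 1)
for i, plane in enumerate(planes):
    cv2.imshow('Bit plane {}'.format(i + 1), plane)
cv2.waitKey(0)
cv2.destroyAllWindows()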

The output looks like this

Original Image
8 bit planes (Top row – 8,7,6,5 ; bottom – 4,3,2,1 bit planes)

Clearly from the above figure, the last 4 bit planes do not seem to have much information in them.

Now, if we combine the 8th, 7th, 6th and 5th bit planes, we will get approximately the original image, as shown below.

Image using 4 bit planes (8,7,6,5)

This can be done by the following code
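Continuing the sketch above:

# Combine bit planes 8,7,6,5 (bits 7 to 4); equivalent to img & 0xF0
recombined = np.zeros_like(img)
for i in range(4, 8):
    recombined |= img & (1 << i)  # keep bit i of every pixel

cv2.imshow('Top 4 bit planes', recombined)
cv2.waitKey(0)
cv2.destroyAllWindows()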

Clearly, storing these 4 planes instead of the original image requires less space. Thus, bit-plane slicing is used in image compression.

I hope you understand Bit plane slicing. If you find any other application of this, please let me know. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.