Write Text on images at mouse click position using OpenCV-Python

In the previous blog, we discussed how to write text on images in real-time. There, we manually specified the position for text placement, which becomes quite tedious if we want to write text at multiple positions.

So, what if we automate this process? That is, we automatically get the coordinates of the image where we click and then put text at that position using the cv2.putText() function, as we did in the previous blog.

This is what we will do in this blog, i.e. write text on images at the mouse click position. To do this, we will create a mouse callback function and then bind it to the image window.

A mouse callback function is executed whenever a mouse event takes place. A mouse event refers to anything we do with the mouse, like a double click, a left click, etc. All available events can be found using the following code:
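For example:

import cv2

# Print every mouse-event constant that OpenCV defines
events = [i for i in dir(cv2) if 'EVENT' in i]
print(events)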

Below is an example of a simple mouse callback function that draws a circle where we double click.
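Something like this (a minimal sketch; a blank 512×512 canvas stands in for your image):

import cv2
import numpy as np

img = np.zeros((512, 512, 3), np.uint8)

# Draw a filled blue circle wherever the left button is double-clicked
def draw_circle(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDBLCLK:
        cv2.circle(img, (x, y), 20, (255, 0, 0), -1)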

We then need to bind this callback function to the image window. This is done using
cv2.setMouseCallback(window_name, mouse_callback_function) as shown below
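Continuing the sketch above:

cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

# Keep refreshing the window until 'q' is pressed
while True:
    cv2.imshow('image', img)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()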

I hope you understood the mouse callback function. Now, let's get started.

Steps:

  • Create a mouse callback function that puts text on the image at every left double click position.
  • Create or read an image.
  • Create an image window using cv2.namedWindow()
  • Bind the mouse callback function to the image window using cv2.setMouseCallback()
  • Display the new image using an infinite while loop

Code:
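Here is a minimal sketch of such a program (imagined details: a blank 512×512 canvas serves as the image, typed characters are drawn one by one at the clicked position, and Esc exits):

import cv2
import numpy as np

img = np.zeros((512, 512, 3), np.uint8)
pos = None        # position where the next character will be drawn
writing = False   # True after a double click, False after pressing 'q'

def start_writing(event, x, y, flags, param):
    global pos, writing
    if event == cv2.EVENT_LBUTTONDBLCLK:
        pos = [x, y]
        writing = True

cv2.namedWindow('image')
cv2.setMouseCallback('image', start_writing)

while True:
    cv2.imshow('image', img)
    k = cv2.waitKey(20) & 0xFF
    if k == 27:                 # Esc exits
        break
    if k == ord('q'):           # 'q' stops writing until the next double click
        writing = False
    elif writing and k != 255:  # 255 means no key was pressed
        cv2.putText(img, chr(k), tuple(pos), cv2.FONT_HERSHEY_SIMPLEX,
                    1, (255, 255, 255), 2)
        pos[0] += 20            # advance so characters don't overlap
cv2.destroyAllWindows()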

In the above code, press 'q' to stop writing and double click the left button anywhere to start writing again.

You can play with the mouse callback function using other mouse events. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Creating a Snake Game using OpenCV-Python

Isn't it interesting to create a snake game using OpenCV-Python? And what if I tell you that you are only going to need

  • cv2.imshow()
  • cv2.waitKey()
  • cv2.putText()
  • cv2.rectangle()

So, let’s get started.

Import Libraries

For this, we only need four libraries.
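These are likely the four (cv2 for display, numpy for the game window, random for apple placement, and time for the keypress timer):

import cv2
import numpy as np
import random
import time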

Displaying Game Objects

  • Game Window: Here, I have used a 500×500 image as my game window.
  • Snake and Apple: I have used green squares for displaying a snake and a red square for an apple. Each square has a size of 10 units.
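A sketch of drawing these objects (snake_position, an assumed list of [x, y] squares, and apple_position, an assumed [x, y] pair, hold the game state):

# 500x500 black image as the game window
img = np.zeros((500, 500, 3), np.uint8)

# Draw each 10x10 segment of the snake in green
for x, y in snake_position:
    cv2.rectangle(img, (x, y), (x + 10, y + 10), (0, 255, 0), -1)

# Draw the 10x10 apple in red
ax, ay = apple_position
cv2.rectangle(img, (ax, ay), (ax + 10, ay + 10), (0, 0, 255), -1)

cv2.imshow('Snake Game', img)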

Game Rules

Now, let’s define some game rules

  • Collision with boundaries: If the snake collides with the boundaries, it dies.
  • Collision with self: If the snake collides with itself, it should die. For this, we only need to check whether the snake's head is in the snake's body or not.
  • Collision with apple: If the snake collides with the apple, the score is increased and the apple is moved to a new location.

Also, on eating an apple, the snake's length should increase; otherwise, the snake keeps moving as it is. These rules might be sketched as shown below.
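A sketch of the collision checks (snake_position, snake_head and apple_position are assumed names; the window is 500×500 and each square is 10 units):

def collision_with_boundaries(snake_head):
    return (snake_head[0] < 0 or snake_head[0] >= 500 or
            snake_head[1] < 0 or snake_head[1] >= 500)

def collision_with_self(snake_position):
    # The head is the first square; check if it reappears in the body
    return snake_position[0] in snake_position[1:]

def eat_apple(score):
    # Move the apple to a random square and increase the score
    apple_position = [random.randrange(1, 50) * 10,
                      random.randrange(1, 50) * 10]
    return apple_position, score + 1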

  • The snake game has a fixed time window for a keypress. If you press a key within that time, the snake should move in that direction; otherwise, it continues moving in the previous direction. Sadly, with the OpenCV cv2.waitKey() function, if you hold down a direction button, the snake starts moving faster in that direction. So, to make the snake movement uniform, I did something like this (see the sketch after the next paragraph).

Since cv2.waitKey() returns -1 when no key is pressed, 'k' stores the first key pressed within that window. And because the while loop runs for a fixed time, it doesn't matter how quickly you press a key; the loop always waits the same fixed time.
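A sketch of that timing loop (the 0.2-second window is an assumed value):

t_end = time.time() + 0.2
k = -1
while time.time() < t_end:
    if k == -1:
        k = cv2.waitKey(1)  # returns -1 if no key was pressed in 1 ms
    # once a key has been captured, just wait out the remaining time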

  • Snake cannot move backward: Here, I have used the w, a, s, d keys for moving the snake. If the snake was moving right and we press the left key, it will continue moving right; in short, the snake cannot directly move backwards.

After seeing which direction key was pressed, we change our head position, roughly as follows.
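Using the same assumed names (direction and snake_head; each step is 10 units):

# Update direction, disallowing a direct reversal
if k == ord('a') and direction != 'right':
    direction = 'left'
elif k == ord('d') and direction != 'left':
    direction = 'right'
elif k == ord('w') and direction != 'down':
    direction = 'up'
elif k == ord('s') and direction != 'up':
    direction = 'down'

# Move the head one square in the current direction
if direction == 'left':
    snake_head[0] -= 10
elif direction == 'right':
    snake_head[0] += 10
elif direction == 'up':
    snake_head[1] -= 10
elif direction == 'down':
    snake_head[1] += 10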

Displaying the final Score

For displaying the final score, I have used the cv2.putText() function.
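Something along these lines (the placement and font are assumed):

img = np.zeros((500, 500, 3), np.uint8)
cv2.putText(img, 'Your Score is {}'.format(score), (100, 250),
            cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
cv2.imshow('Snake Game', img)
cv2.waitKey(0)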

Finally, our snake game is ready and looks like this

The full code can be found here.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

2D Histogram

In the last blog, we discussed 1-D histograms, in which we analyze each channel separately. Suppose we want to find the correlation between image channels; say we are interested in how many times a (red, green) pair of (100, 56) appears in an image. In such a case, a 1-D histogram fails, as it does not show the relationship between the intensities of the two channels at the same position.

To solve this problem, we need multi-dimensional histograms, like 2-D or 3-D. With the help of 2-D histograms, we can analyze the channels together in groups of 2 (RG, GB, BR), or all together with 3-D histograms. Let's see what a 2-D histogram is and how to construct one using OpenCV-Python.

A 2-D histogram counts the occurrences of combinations of intensities. The below figure shows a 2-D histogram.

Here, the Y and X-axis correspond to the Red and Green channel ranges (for 8-bit, [0,255]), and each point within the histogram shows the frequency of each (R, G) pair. Frequency is color-coded here; otherwise, another dimension would be needed.

Let’s understand how to construct a 2-D histogram by taking a simple example.

Suppose we have 4×4, 2-bit images of the Red and Green channels (as shown below) and we want to plot their 2-D histogram.

  • First, we plot the R and G channel ranges (here, [0,3]) on the X and Y-axis respectively. This will be our 2-D histogram.
  • Then, we loop over each position within the channels, find the frequency of each corresponding intensity pair and plot it in the 2-D histogram. These frequencies are then color-coded for ease of visualization.

Now, let’s see how to construct a 2-D histogram using OpenCV-Python

We use the same function, cv2.calcHist(), that we used for a 1-D histogram. Just change the following parameters, and the rest is the same.

  • channels: [0,1] for (Blue, Green), [1,2] for (G, R) and [0,2] for (B, R).
  • bins: specify for each channel according to your need, e.g. [256,256].
  • range: [0,256,0,256] for an 8-bit image.

Below is the sample code for this using OpenCV-Python
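A minimal sketch ('image.jpg' is a placeholder filename):

import cv2
from matplotlib import pyplot as plt

img = cv2.imread('image.jpg')

# 2-D histogram over the Blue and Green channels of an 8-bit image
hist = cv2.calcHist([img], [0, 1], None, [256, 256], [0, 256, 0, 256])

plt.imshow(hist, interpolation='nearest')
plt.xlabel('Green bins')
plt.ylabel('Blue bins')
plt.show()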

Always use Nearest Neighbour Interpolation when plotting a 2-D histogram.

Plotting a 2-D histogram using RGB channels is not a good choice, as we cannot extract color information using only 2 channels. Still, it can be used for finding the correlation between channels, finding clipping, intensity proportions, etc.

To extract color information, we need a color model in which two components/channels can solely represent the chromaticity (color) of the image. One such color model is HSV where H and S tell us about the color of the light. So, first convert the image from BGR to HSV and then apply the above code.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Image Histograms

In this blog, we will discuss the image histogram, a must-have tool in your pocket. It helps in contrast enhancement, image segmentation, image compression, thresholding, etc. Let's see what an image histogram is and how to plot one using OpenCV and matplotlib.

What is an Image Histogram?

An image histogram tells us how the intensity values are distributed in an image. In it, we plot the intensity values on the x-axis and the number of pixels with each intensity value on the y-axis. See the figure below.

This is called a 1-D histogram because we take only one feature into consideration, i.e. the greyscale intensity value of the pixel. In the next blog, we will discuss 2-D histograms.

Now, let’s understand some terminologies associated with histogram

Tonal range refers to the region where most of the intensity values are present (see the above figure). The left side represents the black and dark areas, known as shadows; the middle represents medium grey, or midtones; and the right side represents the light and pure white areas, known as highlights.

So, for a dark image, the histogram covers mostly the left side and center of the graph, while for a bright image, the histogram mostly rests on the right side and center, as shown in the figure below.

Now, let’s see how to plot the histogram for an image using OpenCV and matplotlib.

OpenCV: To calculate the image histogram, OpenCV provides the following function

cv2.calcHist(image, channel, mask, bins, range) 

  • image : the input image, passed in a list, e.g. [image]
  • channel : index of the channel; for greyscale pass [0], and for a color image pass the desired channel as [0], [1] or [2].
  • mask : provide a mask if you want to calculate the histogram for a specific region; otherwise pass None.
  • bins : the number of bins to use for each channel, passed as e.g. [256]
  • range : the range of intensity values; for an 8-bit image pass [0,256]

This returns a numpy.ndarray of shape (n_bins, 1), which can then be plotted using matplotlib. Below is the code for this.
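A minimal sketch ('image.jpg' is a placeholder filename):

import cv2
from matplotlib import pyplot as plt

# Read the image as greyscale
img = cv2.imread('image.jpg', 0)

hist = cv2.calcHist([img], [0], None, [256], [0, 256])

plt.plot(hist)
plt.xlabel('Intensity value')
plt.ylabel('No. of pixels')
plt.show()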

Matplotlib: Unlike OpenCV, matplotlib finds and plots the histogram directly using plt.hist().
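For the same greyscale image:

plt.hist(img.ravel(), 256, [0, 256])
plt.show()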

For a color image, we can show each channel individually, or we can first convert it into greyscale and then calculate the histogram. So, a color histogram can be expressed as three intensity (greyscale) histograms, each of which shows the brightness distribution of an individual Red/Green/Blue channel. The below figure summarizes this.

Original Color Image
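To plot each channel's histogram in its own color, something like this works (img here is the BGR image read with cv2.imread):

for i, col in enumerate(('b', 'g', 'r')):
    hist = cv2.calcHist([img], [i], None, [256], [0, 256])
    plt.plot(hist, color=col)
plt.show()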

So, always look at the histogram of an image before doing any other pre-processing operation. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Contrast Stretching

In the previous blog, we discussed the meaning of contrast in image processing and how to identify low and high contrast images, and at last we discussed the causes of low contrast in an image. In this blog, we will learn about methods of contrast enhancement.

Below figure summarizes the Contrast Enhancement process pretty well.

Source: OpenCV

Depending upon the transformation function used, Contrast Enhancement methods can be divided into Linear and Non-Linear.

The linear methods include the Contrast-Stretching transformation, which uses piecewise linear functions, while the non-linear methods include Histogram Equalization, Gaussian Stretch, etc., which use non-linear transformation functions obtained automatically from the histogram of the input image.

In this blog, we will discuss only the linear methods. The rest we will discuss in the next blogs.

Contrast stretching, as the name suggests, is an image enhancement technique that tries to improve the contrast by stretching the intensity values of an image to fill the entire dynamic range. The transformation function used is always linear and monotonically increasing.

Below figure shows a typical transformation function used for Contrast Stretching.

By changing the location of points (r1, s1) and (r2, s2), we can control the shape of the transformation function. For example,

  1. When r1 = s1 and r2 = s2, the transformation is a linear function that produces no change in intensity.
  2. When r1 = r2, s1 = 0 and s2 = L-1, the transformation becomes a thresholding function.
  3. When (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1), this is known as Min-Max Stretching.
  4. When (r1, s1) = (rmin + c, 0) and (r2, s2) = (rmax - c, L-1), this is known as Percentile Stretching.

Let’s understand Min-Max and Percentile Stretching in detail.

In Min-Max Stretching, the lower and upper values of the input image are made to span the full dynamic range. In other words, the lowest value of the input image is mapped to 0 and the highest value is mapped to 255. All other intermediate values are reassigned new intensity values according to the following formula.
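For an 8-bit image, this is

X_new = ((X - X_min) / (X_max - X_min)) * 255

where X is an input pixel value and X_min, X_max are the minimum and maximum intensity values present in the input image.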

Sometimes, when Min-Max stretching is performed, the tail ends of the histogram become long, resulting in little improvement in image quality. So, it is better to clip a certain percentage, like 1% or 2%, of the data from the tail ends of the input image histogram. This is known as Percentile Stretching. The formula is the same as Min-Max, but now X_max and X_min are the clipped values.

Let’s understand Min-Max and Percentile Stretching with an example. Suppose we have an image whose histogram looks like this

Clearly, this histogram has a left tail with few values (around 70 to 120). So, when we apply Min-Max stretching, the result looks like this

Clearly, Min-Max stretching doesn’t improve the results much. Now, let’s apply Percentile Stretching

Since we clipped the long tail of the input histogram, Percentile Stretching produces much better results than Min-Max stretching.

Let’s see how to perform Min-Max Stretching using OpenCV-Python
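A minimal sketch ('image.jpg' is a placeholder filename; the image is assumed not to be constant, i.e. max > min):

import cv2
import numpy as np

# Read as greyscale and work in float to avoid integer overflow
img = cv2.imread('image.jpg', 0).astype(float)

# Map [Xmin, Xmax] to the full [0, 255] range
minmax = (img - img.min()) / (img.max() - img.min()) * 255
minmax = minmax.astype(np.uint8)

cv2.imshow('Min-Max Stretching', minmax)
cv2.waitKey(0)
cv2.destroyAllWindows()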

For a color image, either change it into greyscale and then apply contrast stretching, or change it into another color model like HSV and then apply contrast stretching on the V channel. For Percentile Stretching, just replace the min and max values with the clipped values; the rest of the code is the same.

So, always plot the histogram first and then decide which method to follow. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

What is Contrast in Image Processing?

According to Wikipedia, Contrast is the difference in luminance or color that makes an object distinguishable from other objects within the same field of view.

Take a look at the images shown below

Source: OpenCV

Clearly, the left image has a low contrast because it is difficult to identify the details present in the image as compared to the right image.

A real-life example is a sunny versus a foggy day. On a sunny day everything looks clear to us, and thus has high contrast, compared to a foggy day, where everything looks nearly the same intensity (a dull, washed-out grey look).

A more reliable way to check whether an image has low or high contrast is to plot the image histogram. Let's plot the histograms for the above images.

Clearly, from the left image's histogram, we can see that the intensity values are located in a narrow range. Because nearly equal intensity values are hard to distinguish (see the figure below: 150 and 148 are harder to tell apart than 50 and 200), the left image has low contrast.

The right histogram widens the gap between the intensity values, and whoo! the details in the image are now much more perceivable to us, yielding a high contrast image.

So, for high contrast, the image histogram should span the entire dynamic range, as shown above by the right histogram. In the next blogs, we will learn different methods to do this.

There is another, naive approach where we subtract the min intensity value from the max and judge the image contrast based on this difference. I do not recommend this, as it can be affected by outliers (we will discuss this in the next blogs). So, always plot the histogram to check.

Till now, we discussed contrast but we didn’t discuss the cause of low contrast images.

Low contrast images can result from poor illumination, a lack of dynamic range in the imaging sensor, or even a wrong lens aperture setting during image acquisition.

When performing contrast enhancement, you must first decide whether you want global or local contrast enhancement. Global means increasing the contrast of the whole image, while in local we divide the image into small regions and perform contrast enhancement on each region independently. Don't worry, we will discuss these in detail in the next blogs.

This concept is beautifully illustrated by the figure shown below (taken from the OpenCV documentation).

Original Image

Clearly, with global enhancement, the details present on the face of the statue are lost, while they are preserved with local enhancement. So, you need to be careful when selecting between these methods.

In the next blog, we will discuss the methods used to transform a low contrast image into a high contrast image. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Intensity-level Slicing

Intensity level slicing means highlighting a specific range of intensities in an image. In other words, we segment certain gray level regions from the rest of the image.

Suppose that in an image your region of interest always takes values between, say, 80 and 150. Intensity level slicing highlights this range, so instead of looking at the whole image, one can focus on the highlighted region of interest.

Since one can think of it as a piecewise linear transformation function, it can be implemented in several ways. Here, we will discuss the two basic types of slicing that are used most often.

  • In the first type, we display the desired range of intensities in white and suppress all other intensities to black, or vice versa. This results in a binary image. The transformation function for both cases is shown below.
  • In the second type, we brighten or darken the desired range of intensities (a to b, as shown below) and leave the other intensities unchanged, or vice versa. The transformation function for both cases, first where the desired range is changed and second where it is unchanged, is shown below.

Let's see how to do intensity level slicing using OpenCV-Python. The code below is for type 1, as discussed above.
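A minimal sketch ('image.jpg' is a placeholder filename; [80, 150] is the example range used above):

import cv2
import numpy as np

img = cv2.imread('image.jpg', 0)

# Type 1: desired range -> white (255), everything else -> black (0)
sliced = np.where((img >= 80) & (img <= 150), 255, 0).astype(np.uint8)

cv2.imshow('Sliced', sliced)
cv2.waitKey(0)
cv2.destroyAllWindows()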

For a color image, either convert it into greyscale or specify the minimum and maximum ranges as lists of BGR values.

Applications: Mostly used for enhancing features in satellite and X-ray images.

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Power Law (Gamma) Transformations

"Gamma Correction": most of you might have heard of this strange-sounding thing. In this blog, we will see what it means and why it matters to you.

The general form of Power law (Gamma) transformation function is

s = c * r^γ

where 's' and 'r' are the output and input pixel values, respectively, and 'c' and γ are positive constants. Like the log transformation, power-law curves with γ < 1 map a narrow range of dark input values into a wider range of output values, with the opposite being true for higher input values. Similarly, for γ > 1, we get the opposite result, as shown in the figure below.

This is also known as gamma correction, gamma encoding or gamma compression. Don’t get confused.

The below curves are generated for r values normalized to [0, 1] and then multiplied by the scaling constant c corresponding to the bit size used.

All the curves are scaled. Don’t get confused (See below)

But the main question is why we need this transformation, what’s the benefit of doing so?

To understand this, we first need to know how our eyes perceive light. Human brightness perception follows an approximate power function (as shown below), according to Stevens' power law for brightness perception.

See from the above figure: if we change the input from 0 to 10, the output changes from 0 to about 50, but changing the input from 240 to 255 barely changes the output. This means that we are more sensitive to changes in the dark tones than in the bright ones. You may have noticed this yourself!

But our camera does not work like this. Unlike human perception, a camera responds linearly: if the light falling on the camera increases 2 times, the output also increases 2-fold. The camera curve looks like this

So, where and what is the actual problem?

The actual problem arises when we display the image.

You might be amazed to know that all display devices, like your computer screen, have an intensity-to-voltage response curve which is a power function with exponents (gamma) varying from 1.8 to 2.5.

This means that for any input signal (say, from a camera), the output will be transformed by this gamma (also known as Display Gamma) because of the non-linear intensity-to-voltage relationship of the display screen. This results in images that are darker than intended.

To correct this, we apply gamma correction to the input signal (since we know the display's response, we simply apply the inverse transformation); this is known as Image Gamma. This gamma is applied automatically by encoding algorithms such as JPEG, so the image looks normal to us.

This encoding cancels out the effect produced by the display, and we see the image as it is. The whole procedure can be summed up by the following figure

If images are not gamma-encoded, they allocate too many bits for the bright tones that humans cannot differentiate and too few bits for the dark tones. So, by gamma encoding, we remove this artifact.

Images which are not properly corrected can look either bleached out or too dark.

Let's verify in code that γ < 1 produces images that are brighter, while γ > 1 results in images that are darker than intended.
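A minimal sketch ('image.jpg' is a placeholder filename; the gamma values 0.4 and 2.2 are just examples):

import cv2
import numpy as np

img = cv2.imread('image.jpg')

def adjust_gamma(image, gamma):
    # Normalize to [0, 1], apply s = r**gamma, scale back to [0, 255]
    return np.uint8(np.power(image / 255.0, gamma) * 255)

cv2.imshow('gamma = 0.4 (brighter)', adjust_gamma(img, 0.4))
cv2.imshow('gamma = 2.2 (darker)', adjust_gamma(img, 2.2))
cv2.waitKey(0)
cv2.destroyAllWindows()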

The output looks like this

Original Image
Gamma Encoded Images

I hope you understand gamma encoding. In the next blog, we will discuss contrast stretching, a piecewise-linear transformation function, in detail. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Creating Subplots in OpenCV-Python

In this blog, we will learn how to create subplots using OpenCV-Python. We know that cv2.imshow() shows only 1 image at a time, and displaying images side by side helps greatly in analyzing the results. Unlike MATLAB, OpenCV has no direct function for creating subplots. But since OpenCV reads images as arrays, we can concatenate them using the built-in cv2.hconcat() and cv2.vconcat() functions and then display the concatenated image using cv2.imshow().

cv2.hconcat([img1, img2]) returns the horizontally concatenated image. The same holds for cv2.vconcat(), which concatenates vertically.

Below is sample code where I display 2 gamma-corrected images using this method.
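A sketch along these lines ('image.jpg' is a placeholder filename; the gamma values are just examples):

import cv2
import numpy as np

img = cv2.imread('image.jpg')

# Two gamma-corrected versions: brighter (0.4) and darker (2.2)
g1 = np.uint8(np.power(img / 255.0, 0.4) * 255)
g2 = np.uint8(np.power(img / 255.0, 2.2) * 255)

cv2.imshow('Gamma corrected', cv2.hconcat([g1, g2]))
cv2.waitKey(0)
cv2.destroyAllWindows()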

The output looks like this

To put text on the images, use cv2.putText(), and if you want to leave spacing between the displayed images, use cv2.copyMakeBorder(). You can play around with many other OpenCV functions.

Note: array dimensions must match when using cv2.hconcat(). This means you cannot display a color and a greyscale image side by side using this method.

I hope this information will help you. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Bit-plane Slicing

You probably know that everything on a computer is stored as strings of bits. In bit-plane slicing, we take advantage of this fact to perform various image operations. Let's see how.

I hope you have a basic understanding of the relationship between binary and decimal.

For an 8-bit image, a pixel value of 0 is represented as 00000000 in binary form and 255 is encoded as 11111111. Here, the leftmost bit is known as the most significant bit (MSB), as it contributes the most: e.g. if the MSB of 11111111 is changed to 0 (i.e. 01111111), the value changes from 255 to 127. Similarly, the rightmost bit is known as the least significant bit (LSB).

In bit-plane slicing, we divide the image into bit planes. This is done by first converting the pixel values into binary form and then separating the bits into planes. Let's see an example.

For simplicity, let's take a 3×3, 3-bit image as shown below. We know that pixel values in a 3-bit image range from 0 to 7.

Bit Plane Slicing

I hope you understand what bit plane slicing is and how it is performed. The next question that comes to mind is: what's the benefit of doing this?

Pros:

  • Image Compression (we will see later how we can reconstruct nearly the original image using fewer bits).
  • Converting a grey level image to a binary image. In general, an image reconstructed from bit planes is similar to applying some intensity transformation function to the original image; e.g. the image reconstructed from the MSB alone is the same as applying a thresholding function to the original image. We will validate this in the code below.
  • Through this, we can analyze the relative importance of each bit in the image, which helps in determining the number of bits used to quantize the image.

Let’s see how we can do this using OpenCV-Python

Code
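A minimal sketch ('image.jpg' is a placeholder filename):

import cv2
import numpy as np

img = cv2.imread('image.jpg', 0)

# Extract the 8 bit planes: plane i holds bit i of every pixel,
# scaled to 255 so each plane is viewable as a binary image
planes = [np.uint8((img >> i) & 1) * 255 for i in range(8)]

# planes[7] is the MSB (bit plane 8), planes[0] the LSB (bit plane 1)
for i, plane in enumerate(planes):
    cv2.imshow('Bit plane {}'.format(i + 1), plane)
cv2.waitKey(0)
cv2.destroyAllWindows()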

The output looks like this

Original Image
8 bit planes (Top row – 8,7,6,5 ; bottom – 4,3,2,1 bit planes)

Clearly from the above figure, the last 4 bit planes do not seem to have much information in them.

Now, if we combine the 8th, 7th, 6th and 5th bit planes, we will get approximately the original image, as shown below.

Image using 4 bit planes (8,7,6,5)

This can be done by the following code
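Continuing the sketch above:

# Combine bit planes 8,7,6,5 (bits 7 to 4); equivalent to img & 0xF0
recombined = np.zeros_like(img)
for i in range(4, 8):
    recombined |= img & (1 << i)  # keep bit i of every pixel

cv2.imshow('Top 4 bit planes', recombined)
cv2.waitKey(0)
cv2.destroyAllWindows()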

Clearly, storing these 4 planes instead of the original image requires less space. Thus, bit-plane slicing is used in image compression.

I hope you understand Bit plane slicing. If you find any other application of this, please let me know. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.