Image Processing – Bicubic Interpolation

In the last blog, we discussed what Bi-linear interpolation is and how it is performed on images. In this blog, we will learn Bi-cubic interpolation in detail.

Note: We will be using some concepts from the Nearest Neighbour and Bilinear interpolation blogs. Check them first before moving forward.

Difference between Bi-linear and Bi-cubic:

  1. Bi-linear uses the 4 nearest neighbours to determine the output, while Bi-cubic uses 16 (a 4×4 neighbourhood).
  2. Weight distribution is done differently.

So, the only thing we need to know is how the weights are distributed; the rest is the same as Bi-linear.

In OpenCV, weights are distributed according to the following logic (the whole C++ code can be found here; what follows is a rough Python transcription of it, using OpenCV's A = -0.75):
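    def cubic_coeffs(x, A=-0.75):
        # weights for the 4 neighbours at distances 1+x, x, 1-x and 2-x,
        # where x is the fractional offset computed below
        c0 = ((A*(x + 1) - 5*A)*(x + 1) + 8*A)*(x + 1) - 4*A
        c1 = ((A + 2)*x - (A + 3))*x*x + 1
        c2 = ((A + 2)*(1 - x) - (A + 3))*(1 - x)*(1 - x) + 1
        return [c0, c1, c2, 1.0 - c0 - c1 - c2]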

The x used in the above code is the fractional offset fx, calculated as sketched below (dx and scale_x are as defined in the previous blog):
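    import math

    # dx = output column index, scale_x = input width / output width (assumed given)
    fx = (dx + 0.5) * scale_x - 0.5   # project the output pixel centre onto the input image
    sx = math.floor(fx)               # index of the nearest input pixel to the left
    fx -= sx                          # fractional part: the x fed into the weight code above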

Similarly, for y, replace x with fy; fy is obtained by replacing dx and scale_x in the above code with dy and scale_y respectively.

Note: For Matlab, use A= -0.50

Let’s see an example. We take the same 2×2 image from the previous blog and want to upscale it by a factor of 2 as shown below

Steps:

  • In the last blog, we calculated P1. This time, let's take P2. First, we find the position of P2 in the input image as we did before: P2's coordinates come out to (0.75, 0.25), with dx = 1 and dy = 0.
  • Because cubic interpolation needs 4 pixels (2 on the left and 2 on the right), we pad the input image.
  • OpenCV has different methods to add borders, which you can check here. Here, I used the cv2.BORDER_REPLICATE method, but you can use any of them. After padding, the input image looks like this
After padding, Blue square is the input image
  • To find the value of P2, let’s first visualize where P2 is in the image. Yellow is the input image before padding. We take the blue 4×4 neighborhood as shown below
  • For P2, using dx and dy, we calculate fx and fy from the code above. We get fx = 0.25 and fy = 0.75.
  • Now, we substitute fx and fy into the above code to calculate the four coefficients. For fy = 0.75 we get coefficients = [-0.0351, 0.2617, 0.8789, -0.1055], and for fx = 0.25 we get coefficients = [-0.1055, 0.8789, 0.2617, -0.0351].
  • First, we will perform cubic interpolation along the rows (as shown inside the blue box in the above figure) with the weights calculated for fx:
    -0.1055 *10 + 0.8789*10 + 0.2617*20 -0.0351*20 = 12.265625
    -0.1055 *10 + 0.8789*10 + 0.2617*20 -0.0351*20 = 12.265625
    -0.1055 *10 + 0.8789*10 + 0.2617*20 -0.0351*20 = 12.265625
    -0.1055 *30 + 0.8789*30 + 0.2617*40 -0.0351*40 = 32.265625
  • Now, using the above 4 calculated values, we will interpolate along the column using the weights calculated for fy:
    -0.0351*12.2656 + 0.2617*12.2656 + 0.8789*12.2656 - 0.1055*32.2656 ≈ 10.156
  • Similarly, repeat for the other pixels; the sketch below automates the whole procedure.
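To make the whole procedure concrete, here is a small NumPy sketch (not OpenCV's actual implementation, but the same recipe) that reproduces the numbers above:

    import numpy as np

    def cubic_coeffs(x, A=-0.75):
        # same weight function as above, repeated so the sketch runs on its own
        c0 = ((A*(x + 1) - 5*A)*(x + 1) + 8*A)*(x + 1) - 4*A
        c1 = ((A + 2)*x - (A + 3))*x*x + 1
        c2 = ((A + 2)*(1 - x) - (A + 3))*(1 - x)*(1 - x) + 1
        return np.array([c0, c1, c2, 1.0 - c0 - c1 - c2])

    src = np.array([[10, 20],
                    [30, 40]], dtype=np.float64)
    scale = 0.5                           # input size / output size for 2x upscaling
    padded = np.pad(src, 2, mode='edge')  # replicate borders, like cv2.BORDER_REPLICATE

    dst = np.empty((4, 4))
    for dy in range(4):
        fy = (dy + 0.5)*scale - 0.5
        sy = int(np.floor(fy)); fy -= sy
        wy = cubic_coeffs(fy)
        for dx in range(4):
            fx = (dx + 0.5)*scale - 0.5
            sx = int(np.floor(fx)); fx -= sx
            wx = cubic_coeffs(fx)
            # rows sy-1..sy+2 and cols sx-1..sx+2 of the original (+2 is the padding offset)
            patch = padded[sy + 1:sy + 5, sx + 1:sx + 5]
            dst[dy, dx] = wy @ patch @ wx   # interpolate along rows, then along the column

    print(dst.round(3))   # dst[0, 1] is P2 = 10.156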

The final result we get is shown below:

This produces noticeably sharper images than the previous two methods and balances processing time and output quality well. That's why it is widely used (e.g. in Adobe Photoshop).

In the next blog, we will see these interpolation methods using OpenCV functions on real images. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Processing – Bilinear Interpolation

In the previous blog, we learned how to find the pixel coordinates in the input image, and then we discussed the nearest neighbour algorithm. In this blog, we will discuss the Bi-linear interpolation method in detail.

Bi-linear interpolation means applying linear interpolation in two directions. Thus, it uses the 4 nearest neighbours and takes their weighted average to produce the output.

So, let's first discuss what linear interpolation is and how it is performed.

Linear interpolation means we estimate the value using linear polynomials. Suppose we have 2 points with values 10 and 20 and we want to guess the values in between them. Simple linear interpolation looks like this

More weight is given to the nearer value (see the 1/3 and 2/3 weights in the above figure). For 2D data (e.g. images), we have to perform this operation twice, once along the rows and then along the columns; that is why it is known as Bi-linear interpolation.
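For example, for a point one-third of the way from the first value toward the second, the nearer value gets the larger weight: v = (2/3)*10 + (1/3)*20 ≈ 13.33.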

Algorithm for Bi-linear Interpolation:

Suppose we have 4 pixels located at (0,0), (1,0), (0,1) and (1,1) and we want to find value at (0.3,0.4).

  1. First, find the values along the rows, i.e. at positions A:(0,0.4) and B:(1,0.4), by linear interpolation.
  2. After getting the values at A and B, apply linear interpolation at point (0.3,0.4) between A and B; this is the final result.

Let’s see how to do this for images. We take the same 2×2 image from the previous blog and want to upscale it by a factor of 2 as shown below

We make the same assumptions as in the last blog: each pixel is of size 1 and is represented by its center.

  • Let's take 'P1'. First, we find the position of P1 in the input image. By projecting the 4×4 image onto the input 2×2 image, we get the coordinates of P1 as (0.25,0.25). (For more details, see here.)
  • Since P1 is a border pixel and has no values to its left, OpenCV replicates the border pixel. This means the row or column at the very edge of the original image is replicated into the extra border (padding). OpenCV has different methods to add borders, which you can check here.
  • So, now our input image (after border replication) looks like this. Note that the values in red show the original input image.
  • To find the value of P1, let's first visualize where P1 is in the input image (the previous step's image). The figure below shows the upper-left 2×2 region of the input image and the location of P1 within it.
Image-1
  • Before applying Bi-linear interpolation, let's see how the weights are distributed.

Matlab and OpenCV yield different results for interpolation because they distribute the weights differently. Here, I will only explain OpenCV's approach.

In OpenCV, weights are distributed according to the following calculation (written here as a Python sketch of the C++ logic):
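    import math

    # dx = column index of the unknown pixel, scale_x = input width / output width (assumed given)
    fx = (dx + 0.5) * scale_x - 0.5   # project the output pixel centre onto the input image
    sx = math.floor(fx)               # column of the nearest input pixel to the left
    fx -= sx                          # weight for the right pixel; the left pixel gets 1 - fx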

Here, dx is the column index of the unknown pixel and fx is the weight assigned to the right pixel; 1 - fx goes to the left pixel. scale_x is the ratio of the input width to the output width. Similarly, for y, dy is the row index and scale_y is the ratio of the heights.

After knowing how weights are calculated let’s get back to the problem again.

  • For P1, both the row and column index are 0, i.e. dx = 0 and dy = 0, so fx = 0.75 and fy = 0.75.
  • We apply linear interpolation with weight fx for both A and B (see Image-1): 0.75*10 (right) + 0.25*10 (left) = 10 (as explained in the algorithm above).
  • Now, for P1, apply linear interpolation between A and B with weight fy: 0.75*10 (B) + 0.25*10 (A) = 10.
  • So, we get P1 = 10. Similarly, repeat for the other pixels; see the sketch below.
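Putting it all together, a small NumPy sketch (again, not OpenCV's actual implementation, but the same recipe) reproduces this result for every pixel:

    import numpy as np

    src = np.array([[10, 20],
                    [30, 40]], dtype=np.float64)
    scale = 0.5                           # input size / output size for 2x upscaling
    padded = np.pad(src, 1, mode='edge')  # replicate borders, like cv2.BORDER_REPLICATE

    dst = np.empty((4, 4))
    for dy in range(4):
        fy = (dy + 0.5)*scale - 0.5
        sy = int(np.floor(fy)); fy -= sy
        for dx in range(4):
            fx = (dx + 0.5)*scale - 0.5
            sx = int(np.floor(fx)); fx -= sx
            tl, tr = padded[sy + 1, sx + 1], padded[sy + 1, sx + 2]  # +1 is the padding offset
            bl, br = padded[sy + 2, sx + 1], padded[sy + 2, sx + 2]
            a = (1 - fx)*tl + fx*tr          # interpolate along the top row
            b = (1 - fx)*bl + fx*br          # interpolate along the bottom row
            dst[dy, dx] = (1 - fy)*a + fy*b  # then between A and B along the column

    print(dst)   # dst[0, 0] is P1 = 10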

The final result we get is shown below:

This produces smoother results than the nearest neighbour method, but the results for sharp transitions like edges are not ideal.

In the next blog, we will discuss Bi-cubic interpolation. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Processing – Nearest Neighbour Interpolation

In the previous blog, we discussed image interpolation, its types and why we need interpolation. In this blog, we will discuss the Nearest Neighbour, a non-adaptive interpolation method in detail.

Algorithm: We assign the unknown pixel to the nearest known pixel.

Let’s see how this works. Suppose, we have a 2×2 image and let’s say we want to upscale this by a factor of 2 as shown below.

Let’s pick up the first pixel (denoted by ‘P1’) in the unknown image. To assign it a value, we must find its nearest pixel in the input 2×2 image. Let’s first see some facts and assumptions used in this.

Assumption: a pixel is always represented by its center value. Each pixel in our input 2×2 image is of unit length and width.

Indexing in OpenCV starts from 0, while in Matlab it starts from 1. But for the sake of simplicity, we will start indexing from 0.5, which means that our first pixel is at 0.5, the next at 1.5, and so on, as shown below.

So for the above example, the location of each pixel in input image is {’10’:(0.5,0.5), ’20’:(1.5,0.5), ’30’:(0.5,1.5), ’40’:(1.5,1.5)}.

After finding the location of each pixel in the input image, follow these 2 steps

  1. First, find the position of each pixel (of the unknown image) in the input image. This is done by projecting the 4×4 image onto the 2×2 image. So, we can easily find the coordinates of each unknown pixel, e.g. the location of 'P1' in the input image is (0.25,0.25), for 'P2' it is (0.75,0.25), and so on.
  2. Now, compare the above-calculated coordinates of each unknown pixel with the input image pixels to find the nearest pixel, e.g. 'P1' (0.25,0.25) is nearest to 10 (0.5,0.5), so we assign 'P1' the value 10. Similarly, we can find the nearest pixel for every other pixel; see the sketch after this list.
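A minimal NumPy sketch of these two steps (not OpenCV's implementation) looks like this:

    import numpy as np

    src = np.array([[10, 20],
                    [30, 40]])
    scale = 0.5   # input size / output size for 2x upscaling

    dst = np.empty((4, 4), dtype=src.dtype)
    for dy in range(4):
        for dx in range(4):
            # centre of the output pixel projected onto the input grid
            x = (dx + 0.5)*scale
            y = (dy + 0.5)*scale
            # the point falls inside the input pixel whose index is the floor of the coordinate
            dst[dy, dx] = src[int(y), int(x)]

    print(dst)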

The final result we get is shown in figure below:

This is the fastest interpolation method as it involves little calculation, but it results in a pixelated or blocky image; the effect is that of simply making each pixel bigger.

Application: To resize bar-codes.

Shortcut: Simply duplicate the rows and columns to get the interpolated or zoomed image, e.g. for 2x, duplicate each row and column 2 times, as shown below.
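In NumPy, this shortcut is a one-liner (src being the 2×2 input array from the sketch above):

    import numpy as np

    src = np.array([[10, 20],
                    [30, 40]])
    zoomed = np.repeat(np.repeat(src, 2, axis=0), 2, axis=1)  # duplicate rows, then columns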

In the next blog, we will discuss Bi-linear interpolation method. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Arithmetic Operations for Image Enhancement

In this blog, we will learn how simple arithmetic operations like addition, subtraction, etc. can be used for image enhancement. First, let's start with image addition, also known as image averaging.

Image Averaging

This is based on the assumption that the noise present in the image is purely random (uncorrelated) and thus has zero average value. So, if we average n noisy images of the same source, the noise will cancel out and what we get is approximately the original image.

Applicability Conditions: Images should be taken under identical conditions with the same camera settings, as in the field of astronomy.

Advantages: Reduces noise without compromising image details, unlike most other operations such as filtering.

Disadvantages: Increases time and storage, as one now needs to take multiple photos of the same object. It is only applicable to random noise and must follow the above applicability condition.

Below is the code, where I first generate 20 images by adding random noise to the original image and then average these images to recover an approximation of the original.
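A sketch of that code (the filename and the noise level are placeholders):

    import cv2
    import numpy as np

    img = cv2.imread('original.jpg').astype(np.float64)  # placeholder filename

    noisy = []
    for _ in range(20):
        noise = np.empty_like(img)
        cv2.randn(noise, (0, 0, 0), (25, 25, 25))  # zero-mean Gaussian noise, sigma = 25 (assumed)
        noisy.append(img + noise)

    average = np.clip(np.mean(noisy, axis=0), 0, 255).astype(np.uint8)
    cv2.imshow('Averaged', average)
    cv2.waitKey(0)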

cv2.randn(image, mean, standard deviation) fills the image with normally distributed random numbers with specified mean and standard deviation.

Noisy
Averaged

Image Subtraction

This is mainly used to enhance the difference between images, e.g. background subtraction for detecting moving objects, or, in medical science, for detecting blockages in veins, a field known as mask mode radiography. In mask mode radiography, we take 2 images, one before injecting a contrast medium and the other after injecting it. We then subtract these 2 images to see how the medium propagated and whether there is any blockage.
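In OpenCV, the difference image takes one call, e.g. (a sketch; the filenames are placeholders for two aligned frames):

    import cv2

    before = cv2.imread('before.jpg')
    after = cv2.imread('after.jpg')
    diff = cv2.absdiff(after, before)  # absolute difference highlights what changed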

Image Multiplication

This can be used to extract a Region Of Interest (ROI) from an image. We simply create a mask and multiply the image with the mask to keep only the area of interest. Another application is shading correction, which we will discuss in detail in later blogs.
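For example, a sketch of ROI extraction by mask multiplication (the filename, circle centre, and radius are arbitrary):

    import cv2
    import numpy as np

    img = cv2.imread('scene.jpg')                  # placeholder filename
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    cv2.circle(mask, (120, 120), 60, 1, -1)        # filled circle of ones
    roi = img * mask[:, :, None]                   # multiply: everything outside the mask becomes 0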

In the next blog, we will discuss intensity transformation, a spatial domain image enhancement technique. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Enhancement

Till now, we have learned the basics of an image. From now onwards, we will learn what is actually known as image processing. In this blog, we will learn what image enhancement is and the different methods to perform it, and then we will see how to apply it to real images.

According to MathWorks, Image enhancement is the process of adjusting digital images so that the results are more suitable for display or further image analysis. It is basically a preprocessing step.

Image enhancement can be done either in the spatial domain or in the transform domain. Spatial domain means we perform all operations directly on pixels, while in the transform domain we first transform an image into another domain (like frequency), do the processing there, and convert it back to the spatial domain by some inverse operation. We will discuss these in detail in the next blogs.

Both the spatial and transform domains have their own importance, which we will discuss later. Generally, operations in the spatial domain are more computationally efficient.

Processing in the spatial domain can be divided into two main categories: one that operates on single pixels, known as intensity transformation, and another, known as spatial filtering, that works on the neighbourhood of every pixel.

The following example should motivate you for what we are going to study in the next few blogs.

Before Contrast Enhancement
After Contrast Enhancement

In the next blog, we will discuss how basic arithmetic operations like addition, subtraction etc can be used for image enhancement. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Image Interpolation using OpenCV-Python

In the previous blogs, we discussed the algorithm behind the

  1. nearest neighbor 
  2. bilinear and
  3. bicubic interpolation methods using a 2×2 image.

Now, let's do the same using OpenCV on a real image. First, we need an image; you can either load one or make your own. Loading an image from the device looks like this (the filename below is a placeholder)
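    import cv2

    img = cv2.imread('apple.jpg')  # placeholder path to the 20x22 apple image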

This is a 20×22 apple image that looks like this.

Now, let’s zoom it 10 times using each interpolation method. The OpenCV command for doing this is
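    dst = cv2.resize(src, dsize, fx=fx, fy=fy, interpolation=interpolation)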

where fx and fy are the scale factors along x and y, dsize is the output image size, and the interpolation flag selects the method to use. If you specify either (fx, fy) or dsize, OpenCV calculates the other automatically. Let's see how to use this function.

Nearest Neighbor Interpolation

Here, we use cv2.INTER_NEAREST as the interpolation flag in the cv2.resize() function, as shown below
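    nearest = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_NEAREST)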

Output: 

Clearly, this produces a pixelated or blocky image. Also, it doesn’t introduce any new data.

Bilinear Interpolation

Here, we use the cv2.INTER_LINEAR flag (the default for cv2.resize()), as shown below
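    bilinear = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_LINEAR)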

Output: 

This produces a smoother image than the nearest neighbour method, but the results for sharp transitions like edges are not ideal because each output value is a weighted average of the 4 surrounding pixels.

Bicubic Interpolation

Here, we use the cv2.INTER_CUBIC flag, as shown below
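    bicubic = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_CUBIC)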

Output: 

Clearly, this produces a sharper image than the above 2 methods. See the white patch on the left side of the apple. This method balances processing time and output quality fairly well.

Next time you resize an image using any software, choose the interpolation method wisely, as it can affect your result to a great extent. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Images with OpenCV-Python

In the previous blogs, we learned about pixels, intensity value, color, and greyscale image. Now, with OpenCV and numpy, let’s try to visualize these concepts.

First, we need an image; you can either load one or make your own. Loading an image from the device looks like this (the filename below is a placeholder)
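    import cv2

    img = cv2.imread('image.jpg')  # placeholder filename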

To access a pixel location, we must first know the shape of the image. This can be done by
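    print(img.shape)  # e.g. (342, 548, 3) -- the values here are just illustrative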

It returns a tuple of the number of rows, columns, and channels (if the image is color).

The total number of pixels can be found either by multiplying the rows, columns, and channels obtained from img.shape, or by using the following command
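    print(img.size)  # rows x columns x channels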

After knowing the image shape, we can access any pixel by its row and column coordinates as
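    px = img[100, 50]  # pixel at row 100, column 50 (indices chosen for illustration)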

This returns the intensity value at that pixel location. For a greyscale image, the intensity or pixel value is a single integer, while for a color image it is an array of Blue, Green, and Red values.

Note: OpenCV reads color images in BGR order, not RGB. Be careful!

We know that the number of intensity levels depends on the number of bits, which can be found by
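    print(img.dtype)  # e.g. uint8 -> 8 bits -> 2^8 = 256 intensity levels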

To access the RGB channels separately, use numpy indexing as shown below
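    b = img[:, :, 0]  # blue channel (remember: OpenCV stores channels as B, G, R)
    g = img[:, :, 1]  # green channel
    r = img[:, :, 2]  # red channel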

You can change the pixel value just by normal assignment as shown below
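    img[100, 50] = [255, 255, 255]  # set that pixel to white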

You can change the color image to greyscale using the following command
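    grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)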

All the operations that you can perform on arrays, like addition, subtraction, etc., apply to images as well.

Play with all these commands to understand better. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Greyscale and Color Image

In the previous blog, we learned how the digital image is formed but we didn’t discuss whether the image formed was greyscale or colored. In this blog, let’s go through this.

A greyscale image is one where each pixel contains only intensity information, not wavelength information. For example, a pixel value of 165 represents the amount of light captured by the sensor (pixel); it doesn't tell how much of this light belongs to the red wavelength or any other.

For an 8-bit image, there will be 2^8 = 256 intensity levels, where 0 means no light (black), 255 means white light, and the levels in between represent shades of grey, as shown below

The above image is created using numpy and OpenCV (See here for code).

A color image, on the other hand, contains intensity information corresponding to three wavelengths, red, green, and blue (RGB), collectively called the primary colors or channels. The reason for choosing these 3 colors lies in the fact that the cone cells in the human eye, which are responsible for color vision, are sensitive to red, green, and blue light.

Note: By combining the primary colors (their exact wavelengths may vary) in various intensity proportions, we can produce all visible colors.

So, in a color image, every pixel has 3 intensity values, Red, Green, and Blue (for example, 160, 30, 45), which together produce the different colors in the image.

How is a Color image formed?

To form a color image, we want something that absorbs light in red, green, and blue wavelengths. The advancements in color image formation can be summarized by the figure below

Source: Foveon

The drastic advancement in color image formation came with the introduction of digital sensors like CCD and CMOS. Most digital cameras today contain only one imaging sensor, so they cannot collect red, green, and blue information simultaneously at each pixel location. One possible solution is to use 3 imaging sensors with RGB filters, but that is expensive in terms of both money and time.

So, in 1976, Bryce Bayer of Eastman Kodak invented the Bayer filter, which revolutionized color image formation. The Bayer filter is a color filter array (CFA) in which RGB color filters are arranged in a pattern on the sensor. The figure below shows a Bayer filter.

Source: Wikipedia

In the Bayer filter, there are twice as many green elements as red or blue, to mimic the human eye's greater resolving power for green light. To construct an RGB picture, we calculate the Green and Blue values for each Red pixel, the Blue and Red values for each Green pixel, and the Red and Green values for each Blue pixel through interpolation, a process known as color demosaicing (for more details, see here).

The Foveon X3 direct image sensor is among the most advanced color image sensors. It combines the power of a digital sensor with the essence of film. Like film, it has three layers of pixels embedded in silicon. The layers are positioned to take advantage of the fact that silicon absorbs red, green, and blue light to different depths: the bottom layer records red, the middle layer records green, and the top layer records blue. Its pros are superior quality and less processing power required.

Now you should have some idea of what a color image is and how it is formed. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Create own image using Numpy and OpenCV

In this tutorial, we will learn how to create our own image using numpy and OpenCV. This will help you understand image concepts better. Let's see what the 256 intensity levels of an 8-bit image look like.

Steps:

  1.  Create an array of any desired size using numpy. Always specify the ‘datatype’
  2.  Fill the values of the array using some logic
  3.  Show the image using cv2.imshow() or matplotlib.

Code:
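A sketch that follows the steps above:

    import cv2
    import numpy as np

    # step 1: a 100x256 array, one column per intensity level; always specify the datatype
    img = np.zeros((100, 256), dtype=np.uint8)
    # step 2: fill each column with its own intensity value (0 to 255)
    for col in range(256):
        img[:, col] = col
    # step 3: show the image
    cv2.imshow('intensity levels', img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()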

The resulting image looks like this

To create a color image, just specify a third dimension for the array (for example (100,256,3)); the rest is the same.

Now, you are ready to create your own images. Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Recording a specific Window using OpenCV-Python

In this blog, we will learn how to record any window using OpenCV-Python.

Installing Libraries:

  1. To install PIL and pywin32, write
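        pip install Pillow pywin32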

    in the cmd (see here). The Pillow package provides PIL; pywin32 will install win32gui, win32api, and win32con.
  2. For installing winGuiAuto, download winGuiAuto.py from here and save it as winGuiAuto.py in Python's lib -> site-packages folder.

Steps:

  1.  Create the window that you want to record (I used cv2.imshow() for that)
  2.  Give the window name that you want to record in winGuiAuto.findTopWindow()
  3.  Keep the window on top and set its position using win32gui.SetWindowPos()
  4.  Get the coordinates of the window using win32gui.GetWindowPlacement()
  5.  Grab an image of the area using ImageGrab.grab()
  6.  Append all these images into a list.
  7.  Create a VideoWriter object using cv2.VideoWriter()
  8.  Convert each image color and save it.

Code:
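A sketch reconstructed from the steps above (the window name, frame count, position, and output settings are all placeholders):

    import cv2
    import numpy as np
    import win32con
    import win32gui
    from PIL import ImageGrab
    from winGuiAuto import findTopWindow

    img = np.zeros((300, 300, 3), dtype=np.uint8)
    cv2.imshow('Recording', img)                   # step 1: the window we want to record
    cv2.waitKey(1)

    hwnd = findTopWindow('Recording')              # step 2: handle of that window
    win32gui.SetWindowPos(hwnd, win32con.HWND_TOPMOST, 100, 100, 360, 390, 0)  # step 3

    frames = []
    for i in range(100):                           # number of frames is arbitrary
        img[:] = (i * 2) % 256                     # update the window contents
        cv2.imshow('Recording', img)
        cv2.waitKey(30)
        bbox = win32gui.GetWindowPlacement(hwnd)[-1]  # step 4: window coordinates
        frames.append(ImageGrab.grab(bbox))           # steps 5-6: grab the area, append

    w, h = frames[0].size
    out = cv2.VideoWriter('record.avi', cv2.VideoWriter_fourcc(*'XVID'), 30, (w, h))  # step 7
    for f in frames:
        out.write(cv2.cvtColor(np.array(f), cv2.COLOR_RGB2BGR))  # step 8: convert RGB to BGR and save
    out.release()
    cv2.destroyAllWindows()

    # to make a .gif file instead, uncomment:
    # frames[0].save('record.gif', save_all=True, append_images=frames[1:], duration=30)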

If you want to make a .gif file, uncomment the last part.

Note: This works well on Windows 8.1, but you might find some difficulty capturing the full window on Windows 10.

Hope you enjoy reading. If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.