Tag Archives: geometric transformation

Affine Transformation

In this blog, we will discuss what an affine transformation is and how to perform it using OpenCV-Python. So, let’s get started.

What is an Affine Transformation?

An affine transformation is any transformation that preserves collinearity (points on a line stay on a line), parallelism, and the ratios of distances between points on a line (e.g. the midpoint of a segment remains the midpoint after the transformation). It does not necessarily preserve distances or angles.
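These properties can be checked numerically. The sketch below uses a made-up linear part `A` and translation `b` (both chosen only for illustration) and confirms that the midpoint is preserved while the distance is not.

```python
import numpy as np

# Hypothetical 2x2 linear part A and translation b, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [0.5, 3.0]])
b = np.array([10.0, -4.0])

def affine(p):
    """Apply p -> A @ p + b to a 2D point."""
    return A @ p + b

p = np.array([1.0, 2.0])
q = np.array([5.0, 8.0])
mid = (p + q) / 2

# The image of the midpoint equals the midpoint of the images.
midpoint_preserved = np.allclose(affine(mid), (affine(p) + affine(q)) / 2)

# Distances are generally not preserved.
dist_before = np.linalg.norm(q - p)
dist_after = np.linalg.norm(affine(q) - affine(p))

print(midpoint_preserved)
print(dist_before, dist_after)
```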

Thus, all the geometric transformations we have discussed so far, such as translation, rotation, and scaling, are affine transformations, because they preserve all of the above properties. In simple terms, one can think of an affine transformation as a composition of rotation, translation, scaling, and shear.

In general, an affine transformation can be expressed as a linear transformation followed by a vector addition (a translation).
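Written out for a point (x, y), this takes the standard form below. The entry names a_ij and b_i are notation chosen here for illustration; M is the 2×3 transformation matrix.

```latex
\begin{bmatrix} x' \\ y' \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
=
\underbrace{\begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \end{bmatrix}}_{M}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
```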

Since the transformation matrix (M) is a 2×3 matrix, it is defined by 6 constants. To find it, we select 3 points in the input image and map them to the desired locations in the output image according to the use-case. Each point pair gives 2 equations, so we get 6 equations in 6 unknowns, which can be solved easily as long as the 3 points are not collinear.

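For instance, to take the mirror image (a horizontal flip), you can map the top-left, top-right, and bottom-left corners of the image to the top-right, top-left, and bottom-right corners, respectively (you may choose any 3 non-collinear points).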
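To make the "6 equations, 6 unknowns" idea concrete, here is a minimal numpy sketch that solves for M directly from the mirror-image corner pairs. The image size is a made-up value, and this is only an illustration of the linear system, not a claim about how OpenCV computes it internally.

```python
import numpy as np

# Hypothetical image size, used only to make the example concrete.
rows, cols = 200, 300

# Three corner points and where the mirror image should send them.
src = np.array([[0, 0], [cols - 1, 0], [0, rows - 1]], dtype=float)
dst = np.array([[cols - 1, 0], [0, 0], [cols - 1, rows - 1]], dtype=float)

# Each pair (x, y) -> (x', y') gives x' = a*x + b*y + c and y' = d*x + e*y + f.
# Stacking [x, y, 1] for the 3 points gives a 3x3 system per output coordinate.
P = np.hstack([src, np.ones((3, 1))])   # shape (3, 3)

# Solve P @ M.T = dst for the 2x3 matrix M (6 unknowns in total).
M = np.linalg.solve(P, dst).T

print(M)
```

For the mirror image, the result is x' = (cols - 1) - x and y' = y, i.e. the first row of M is [-1, 0, cols - 1] and the second row is [0, 1, 0].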

Once the transformation matrix is calculated, then we apply the affine transformation to the entire input image to get the final transformed image. Let’s see how to do this using OpenCV-Python.

OpenCV

OpenCV provides a function cv2.getAffineTransform() that takes as input the three pairs of corresponding points and outputs the transformation matrix. The basic syntax is shown below.

transform_mat = cv2.getAffineTransform(src, dst)

# src: coordinates in the source image
# dst: coordinates in the output image

Once the transformation matrix (M) is calculated, pass it to the cv2.warpAffine() function that applies an affine transformation to an image. The syntax of this function is given below.

dst = cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])

# src: input image
# M: 2x3 transformation matrix
# dsize: size of the output image as (width, height)
# flags: interpolation method to be used

Now, let’s take the above example of a mirror image and see how to apply affine transformation using OpenCV-Python. Below are the steps.

Read the image
Define the 3 pairs of corresponding points
Calculate the transformation matrix using cv2.getAffineTransform()
Apply the affine transformation using cv2.warpAffine()

import cv2
import numpy as np

# Read the image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows, cols = img.shape[:2]

# Define the 3 pairs of corresponding points 
input_pts = np.float32([[0,0], [cols-1,0], [0,rows-1]])
output_pts = np.float32([[cols-1,0], [0,0], [cols-1,rows-1]])

# Calculate the transformation matrix using cv2.getAffineTransform()
M = cv2.getAffineTransform(input_pts, output_pts)

# Apply the affine transformation using cv2.warpAffine()
dst = cv2.warpAffine(img, M, (cols,rows))

# Display the image
out = cv2.hconcat([img, dst])
cv2.imshow('Output', out)
cv2.waitKey(0)

Below is the output image. Here, the left image is the original, while the right one is the transformed mirror image.

Now define 3 different point pairs and see what the transformation looks like. That’s all for this blog. In the next blog, we will discuss perspective transformation. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Geometric Transformation: Rotation using OpenCV-Python

In the previous blog, we discussed image translation. In this blog, we will discuss another type of transformation known as rotation. So, let’s get started. (Here, we use the left-handed coordinate system common in image processing, with the origin at the top-left corner and the y-axis pointing down.)

Suppose we have a point P(x,y) at an angle alpha and distance r from the origin as shown below. Now we rotate the point P about the origin by an angle theta in the clockwise direction. The rotated coordinates can be obtained as shown below.
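As a sketch of the derivation, assuming angles are measured from the x-axis towards the downward-pointing y-axis (so a larger angle is a clockwise turn on screen), the coordinates can be written as:

```latex
\begin{aligned}
x &= r\cos\alpha, \qquad y = r\sin\alpha \\
x' &= r\cos(\alpha + \theta) = x\cos\theta - y\sin\theta \\
y' &= r\sin(\alpha + \theta) = x\sin\theta + y\cos\theta \\[4pt]
\begin{bmatrix} x' \\ y' \end{bmatrix}
&=
\underbrace{\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \end{bmatrix}}_{M}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\end{aligned}
```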

So, we just need to create the transformation matrix (M) and then we can rotate any point as shown above. That’s the basic idea behind rotation. Now, let’s take the case with an adjustable center of rotation O(x₀, y₀).
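For a center of rotation O(x₀, y₀), one way to obtain the matrix is to shift the center to the origin, rotate, and shift back. Under the same clockwise convention, this gives the expression below, which matches the matrix used in the numpy code later in this post.

```latex
\begin{aligned}
x' &= (x - x_0)\cos\theta - (y - y_0)\sin\theta + x_0 \\
y' &= (x - x_0)\sin\theta + (y - y_0)\cos\theta + y_0 \\[4pt]
M &=
\begin{bmatrix}
\cos\theta & -\sin\theta & x_0(1 - \cos\theta) + y_0\sin\theta \\
\sin\theta & \cos\theta & y_0(1 - \cos\theta) - x_0\sin\theta
\end{bmatrix}
\end{aligned}
```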

Note: The above expression is for clockwise rotation. For anti-clockwise rotation, replace theta with -theta; this flips the sign of every sin(theta) term and leaves the cos(theta) terms unchanged.
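The sign convention is easy to check on a single point. The sketch below uses two hypothetical helpers, `rotation_matrix` and `apply` (names chosen here for illustration), and rotates a point lying to the right of the center in both directions.

```python
import math

def rotation_matrix(theta_deg, x0=0.0, y0=0.0, clockwise=True):
    """2x3 matrix rotating about (x0, y0) in image coordinates (y points down)."""
    theta = math.radians(theta_deg)
    if not clockwise:
        theta = -theta          # anti-clockwise: every sin term changes sign
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x0 * (1 - c) + y0 * s],
            [s,  c, y0 * (1 - c) - x0 * s]]

def apply(M, x, y):
    """Apply a 2x3 affine matrix to the point (x, y)."""
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])

# A point one pixel to the right of the center (5, 5).
cw = apply(rotation_matrix(90, 5, 5), 6, 5)
ccw = apply(rotation_matrix(90, 5, 5, clockwise=False), 6, 5)
print([round(v, 6) for v in cw])    # ends up below the center on screen
print([round(v, 6) for v in ccw])   # ends up above the center on screen
```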

Numpy

The numpy implementation is the same as in the previous blog on translation; only the transformation matrix changes. Below is the code. For a step-by-step explanation, you can refer to the previous blog.

import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows,cols,_ = img.shape

# Create the transformation matrix
angle = np.radians(90)
x0, y0 = ((cols-1)/2.0,(rows-1)/2.0)
M = np.array([[np.cos(angle), -np.sin(angle), x0*(1-np.cos(angle))+ y0*np.sin(angle)],
              [np.sin(angle), np.cos(angle), y0*(1-np.cos(angle))- x0*np.sin(angle)]])
# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2,-1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))
transform_coord = np.dot(M, orig_coord_f)
# Change into int type
transform_coord = transform_coord.astype(int)
# Keep only the coordinates that fall within the image boundary.
indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)
# Create a zeros image and project the points
img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]
# Display the image
out = cv2.hconcat([img,img1])
cv2.imshow('a2',out)
cv2.waitKey(0)

Below is the output image for the 90-degree clockwise rotation. Here, the left image represents the original image while the right one is the rotated image.

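While rotating an image, you may encounter aliasing or holes in the output image, for example with a 45-degree rotation. The holes appear because each source pixel is mapped forward to a rounded output location, so some output pixels are never written. This can be tackled by mapping each output pixel back to the source image and interpolating.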
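One common way to avoid the holes is inverse (backward) mapping. The sketch below is a minimal illustration with nearest-neighbour interpolation; the function name `warp_inverse_nearest` and the 21×21 test image are assumptions made for this example, not part of the original code.

```python
import numpy as np

def warp_inverse_nearest(img, M):
    """Warp img with a 2x3 affine matrix M using inverse mapping.

    Every output pixel is mapped back into the source image and filled with
    the nearest source pixel, so no output pixel inside the image is skipped.
    """
    rows, cols = img.shape[:2]
    # Invert the affine map by extending M to 3x3.
    M_inv = np.linalg.inv(np.vstack([M, [0, 0, 1]]))[:2]

    # Output coordinates as [x, y, 1] columns, shape (3, rows*cols).
    ys, xs = np.indices((rows, cols)).reshape(2, -1)
    out_coord = np.vstack([xs, ys, np.ones(rows * cols)])

    # Source location of each output pixel, rounded to the nearest pixel.
    src = np.floor(M_inv @ out_coord + 0.5).astype(int)
    valid = (src[0] >= 0) & (src[0] < cols) & (src[1] >= 0) & (src[1] < rows)

    out = np.zeros_like(img)
    out[ys[valid], xs[valid]] = img[src[1][valid], src[0][valid]]
    return out

# 45-degree clockwise rotation about the center of a small all-white image.
theta = np.radians(45)
x0 = y0 = 10.0
M = np.array([[np.cos(theta), -np.sin(theta), x0 * (1 - np.cos(theta)) + y0 * np.sin(theta)],
              [np.sin(theta),  np.cos(theta), y0 * (1 - np.cos(theta)) - x0 * np.sin(theta)]])
img = np.full((21, 21), 255, dtype=np.uint8)
rotated = warp_inverse_nearest(img, M)
```

Because the loop runs over output pixels rather than input pixels, every output pixel whose source lies inside the image receives a value. Replacing the nearest-neighbour lookup with bilinear interpolation would additionally smooth the jagged edges.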

OpenCV

Now, let’s discuss how to rotate images using OpenCV-Python. To obtain the transformation matrix (M), OpenCV provides a function cv2.getRotationMatrix2D() which takes the center, angle and scale as arguments and outputs the transformation matrix. The syntax of this function is given below.

transform_matrix = cv2.getRotationMatrix2D(center, angle, scale)

# center: Center of the rotation in the source image.
# angle: Rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner).
# scale: Isotropic scale factor.
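For intuition, the matrix given in the OpenCV documentation for this function can be rebuilt in a few lines of numpy. The helper name `rotation_matrix_2d` below is hypothetical, and the sketch is meant only to show the sign convention (positive angle = counter-clockwise), not to replace the OpenCV call.

```python
import numpy as np

def rotation_matrix_2d(center, angle, scale):
    """Build the 2x3 matrix described in the OpenCV documentation of
    cv2.getRotationMatrix2D (angle in degrees, positive = counter-clockwise)."""
    cx, cy = center
    alpha = scale * np.cos(np.radians(angle))
    beta = scale * np.sin(np.radians(angle))
    return np.array([[alpha,  beta, (1 - alpha) * cx - beta * cy],
                     [-beta, alpha, beta * cx + (1 - alpha) * cy]])

M = rotation_matrix_2d((50.0, 50.0), 90, 1)
# The center of rotation stays where it is.
center_after = M @ np.array([50.0, 50.0, 1.0])
# A point to the right of the center moves above it (counter-clockwise on screen).
right_after = M @ np.array([60.0, 50.0, 1.0])
print(center_after, right_after)
```

Note that this is the opposite direction to the clockwise matrix used in the numpy implementation above, which is why the two outputs are rotated in opposite senses for the same angle.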

Once the transformation matrix (M) is calculated, pass it to the cv2.warpAffine() function that applies an affine transformation to an image. The syntax of this function is given below.

dst = cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])

# src: input image
# M: 2x3 transformation matrix
# dsize: size of the output image as (width, height)
# flags: interpolation method to be used

Below is an example where the image is rotated by 90 degrees counterclockwise with respect to the center without any scaling.

import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows,cols,_ = img.shape

# Create the transformation matrix
M = cv2.getRotationMatrix2D(((cols-1)/2.0,(rows-1)/2.0),90,1)

# Pass it to warpAffine function
dst = cv2.warpAffine(img,M,(cols,rows))

# Display the concatenated image
out = cv2.hconcat([img, dst])
cv2.imshow('img',out)
cv2.waitKey(0)

Below is the output. Here, the left image is the original, while the right one is the rotated image.

Compare the outputs of both implementations. That’s all for image rotation. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Understanding Geometric Transformation: Translation using OpenCV-Python

In this blog, we will discuss image translation, one of the most basic geometric transformations performed on images. So, let’s get started.

Translation is simply the shifting of object location. Suppose we have a point P(x,y) which is translated by (t_x, t_y), then the coordinates after translation denoted by P'(x’,y’) are given by
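In equation form, using the symbols defined above, the translated coordinates and the corresponding 2×3 transformation matrix M are:

```latex
\begin{aligned}
x' &= x + t_x \\
y' &= y + t_y \\[4pt]
\begin{bmatrix} x' \\ y' \end{bmatrix}
&=
\underbrace{\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}}_{M}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\end{aligned}
```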

So, we just need to create the transformation matrix (M) and then we can translate any point as shown above. That’s the basic idea behind translation. So, let’s first discuss how to do image translation using numpy for better understanding, and then we will see a more sophisticated implementation using OpenCV.

Numpy

First, let’s create the transformation matrix (M). This can be easily done using numpy as shown below. Here, the image is translated by (100, 50), i.e. 100 pixels along x and 50 pixels along y.

M = np.float32([[1,0,100],[0,1,50]])

Next, let’s convert the image coordinates to the form [x,y,1]. This can be done as

# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2,-1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))
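To see what these two arrays contain, it helps to run the same lines on a tiny, made-up image size (2 rows by 3 columns is an assumption chosen so the output is easy to read):

```python
import numpy as np

# Hypothetical tiny image size so the arrays are easy to read.
rows, cols = 2, 3

orig_coord = np.indices((cols, rows)).reshape(2, -1)
orig_coord_f = np.vstack((orig_coord, np.ones(rows * cols)))

print(orig_coord)
# [[0 0 1 1 2 2]    <- x coordinates (0 .. cols-1)
#  [0 1 0 1 0 1]]   <- y coordinates (0 .. rows-1)
print(orig_coord_f.shape)
# (3, 6)
```

Each column is one pixel position (x, y), and the extra row of ones lets the 2×3 matrix M add the translation in a single matrix product.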

Now apply the transformation by multiplying the transformation matrix with coordinates.

transform_coord = np.dot(M, orig_coord_f)
# Change into int type
transform_coord = transform_coord.astype(int)

Keep only the coordinates that fall within the image boundary.

indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)

Now, create a zeros image similar to the original image and project all the points onto the new image.

img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]

Display the final image.

out = cv2.hconcat([img,img1])
cv2.imshow('a1',out)
cv2.waitKey(0)

The full code is given below.

import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows,cols,_ = img.shape

# Create the transformation matrix
M = np.float32([[1,0,100],[0,1,50]])
# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2,-1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))
transform_coord = np.dot(M, orig_coord_f)
# Change into int type
transform_coord = transform_coord.astype(int)
# Keep only the coordinates that fall within the image boundary.
indices = np.all((transform_coord[1]<rows, transform_coord[0]<cols, transform_coord[1]>=0, transform_coord[0]>=0), axis=0)
# Create a zeros image and project the points
img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = img[orig_coord[1][indices], orig_coord[0][indices]]
# Display the image
out = cv2.hconcat([img,img1])
cv2.imshow('a2',out)
cv2.waitKey(0)

Below is the output. Here, the left image is the original, while the right one is the translated image.

OpenCV-Python

Now, let’s discuss how to translate images using OpenCV-Python.

OpenCV provides a function cv2.warpAffine() that applies an affine transformation to an image. You just need to provide the transformation matrix (M). The basic syntax for the function is given below.


dst = cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])

# src: input image
# M: 2x3 transformation matrix
# dsize: size of the output image as (width, height)
# flags: interpolation method to be used

Below is a sample code where the image is translated by (100, 50).

import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows,cols,_ = img.shape

# Create the transformation matrix
M = np.float32([[1,0,100],[0,1,50]])

# Pass it to warpAffine function
dst = cv2.warpAffine(img,M,(cols,rows))

# Display the concatenated image
out = cv2.hconcat([img, dst])
cv2.imshow('img',out)
cv2.waitKey(0)

Below is the output. Here, the left image is the original, while the right one is the translated image.

Compare the outputs of both implementations. That’s all for image translation. In the next blog, we will discuss another geometric transformation known as rotation in detail. Hope you enjoy reading.

If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

Geometric Transformation of images using OpenCV-Python

Before reading, please refer to this blog for better understanding.

In this blog, we will discuss how to perform a geometric transformation using OpenCV-Python. In a geometric transformation, we move the pixels of an image according to some mathematical formula. This includes translation, rotation, scaling, and distortion (or undistortion!) of images. It is frequently used as a pre-processing step in applications where the input is distorted during capture, such as document scanning and matching temporal images in remote sensing.

There are two basic steps in a geometric transformation:

Spatial Transformation: Calculating the spatial position of pixels in the transformed image.
Intensity Interpolation: Finding the intensity values at the newly calculated positions.
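As an illustration of the second step, here is a minimal pure-Python sketch of bilinear interpolation, one common choice of interpolation method. The function name `bilinear` and the 2×2 test image are made up for this example, and the sketch assumes non-negative coordinates inside the image.

```python
def bilinear(img, x, y):
    """Intensity of a 2D grayscale image (list of rows) at a non-integer (x, y)."""
    x0, y0 = int(x), int(y)                  # top-left neighbour (x, y >= 0 assumed)
    x1 = min(x0 + 1, len(img[0]) - 1)        # clamp at the right edge
    y1 = min(y0 + 1, len(img) - 1)           # clamp at the bottom edge
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - dx) * img[y0][x0] + dx * img[y0][x1]
    bottom = (1 - dx) * img[y1][x0] + dx * img[y1][x1]
    return (1 - dy) * top + dy * bottom

img = [[10, 20],
       [30, 40]]
center_value = bilinear(img, 0.5, 0.5)   # average of the four pixels: 25.0
corner_value = bilinear(img, 1, 1)       # exactly on a pixel: 40
print(center_value, corner_value)
```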

OpenCV has built-in functions to apply the different geometric transformations to images like translation, rotation, affine transformation, etc. You can find all the functions here: Geometric Transformations of Images

In this blog, we will learn how to change the apparent perspective of an image, which makes the image clearer and easier to read. The image below summarizes what we want to do. See how easily we can read the words in the corrected image.

For a perspective transformation, we need 4 points on the input image and the corresponding 4 points on the output image. Select the points in a consistent order (counterclockwise here) so that each input point is paired with its intended output point. From these points, we calculate the transformation matrix, which, when applied to the input image, yields the corrected image. Let’s see the steps using OpenCV-Python.

Steps:

Load the image
Convert the image to RGB so as to display via matplotlib
Select 4 points in the input image (counterclockwise, starting from the top left) using the matplotlib interactive window.
Specify the corresponding output coordinates.
Compute the perspective transform M using cv2.getPerspectiveTransform()
Apply the perspective transformation to the input image using cv2.warpPerspective() to obtain the corrected image.

Code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# To open matplotlib in interactive mode (IPython/Jupyter magic command)
%matplotlib qt

# Load the image
img = cv2.imread('D:/downloads/geometric.jpg') 

# Create a copy of the image
img_copy = np.copy(img)

# Convert to RGB so as to display via matplotlib.
# Using matplotlib, we can easily find the coordinates
# of the 4 points that are essential for finding the
# transformation matrix.
img_copy = cv2.cvtColor(img_copy,cv2.COLOR_BGR2RGB)

plt.imshow(img_copy)

By running the above code, you will get an interactive matplotlib window. Now select any four points (corner points work best) as the inputs, then specify the corresponding output points.

# Specify the input and output coordinates that are used
# to calculate the transformation matrix
input_pts = np.float32([[80,1286],[3890,1253],[3890,122],[450,115]])
output_pts = np.float32([[100,100],[100,3900],[2200,3900],[2200,100]])

# Compute the perspective transform M
M = cv2.getPerspectiveTransform(input_pts,output_pts)

# Apply the perspective transformation to the image
out = cv2.warpPerspective(img,M,(img.shape[1], img.shape[0]),flags=cv2.INTER_LINEAR)

# Display the transformed image (converted to RGB for matplotlib)
plt.imshow(cv2.cvtColor(out, cv2.COLOR_BGR2RGB))
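For intuition about what the perspective matrix encodes, the 4 point pairs give 8 linear equations for the 8 unknown entries of a 3×3 matrix whose last entry is fixed to 1. The numpy sketch below solves that system for the point pairs used above. The helper names `perspective_matrix` and `apply_perspective` are hypothetical, and this is an illustration of the underlying system rather than a description of OpenCV's internal implementation.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] fixed to 1) that maps each
    src point (x, y) to its dst point (u, v)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1), rearranged the same way
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_perspective(H, x, y):
    """Map (x, y) through H and divide by the third homogeneous coordinate."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w

# The point pairs used above.
input_pts = [[80, 1286], [3890, 1253], [3890, 122], [450, 115]]
output_pts = [[100, 100], [100, 3900], [2200, 3900], [2200, 100]]
H = perspective_matrix(input_pts, output_pts)
mapped = [apply_perspective(H, x, y) for x, y in input_pts]
print(np.round(mapped, 2))
```

Unlike the affine case, the division by the third coordinate means parallel lines need not stay parallel, which is what allows a slanted document to be mapped back to a rectangle.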

Hope you enjoy reading.

If you have any doubt/suggestion please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.

TheAILearner

Mastering Artificial Intelligence
