Perspective Transformation

In this blog, we will discuss what perspective transformation is and how to perform it using OpenCV-Python. So, let’s get started.

What is Perspective Transformation?

As the name suggests, perspective transformation is associated with a change in viewpoint. This type of transformation does not preserve parallelism, lengths, or angles, but it does preserve collinearity and incidence. This means that straight lines remain straight after the transformation.

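In general, the perspective transformation can be expressed as

$$
\begin{bmatrix} t\,x' \\ t\,y' \\ t \end{bmatrix}
= M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
M = \begin{bmatrix} a_1 & a_2 & b_1 \\ a_3 & a_4 & b_2 \\ c_1 & c_2 & 1 \end{bmatrix}
$$

Here, (x’, y’) are the transformed points while (x, y) are the input points, and t is the homogeneous scale factor. The transformation matrix (M) can be seen as a combination of a 2×2 block (a_1 to a_4) that handles rotation, scaling, and shear, a translation vector (b_1, b_2), and a projection vector (c_1, c_2).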

For affine transformation, the projection vector is equal to 0. Thus, affine transformation can be considered as a particular case of perspective transformation.
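To make the difference concrete, here is a small sketch that applies a 3×3 matrix to a single point in homogeneous coordinates. The helper name `apply_homography` and the matrix values are made up for illustration; the only thing that changes between the two matrices is the projection vector in the last row.

```python
import numpy as np

def apply_homography(M, point):
    """Apply a 3x3 matrix to an (x, y) point (hypothetical helper)."""
    x, y = point
    # Multiply in homogeneous coordinates, then divide by the scale factor t
    xh, yh, t = M @ np.array([x, y, 1.0])
    return xh / t, yh / t

# Affine case: the projection vector (first two entries of the last row) is 0
affine = np.array([[2.0, 0.0, 10.0],
                   [0.0, 2.0, 5.0],
                   [0.0, 0.0, 1.0]])

# Same matrix with a non-zero projection vector (illustrative values)
perspective = affine.copy()
perspective[2, :2] = [0.001, 0.002]

affine_result = apply_homography(affine, (100, 50))
perspective_result = apply_homography(perspective, (100, 50))
```

With a zero projection vector the scale factor is always 1, so the division does nothing. With a non-zero one, the scale factor depends on the position of the point, which is what lets parallel lines converge.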

Since the transformation matrix (M) is defined by 8 constants (degrees of freedom), we need 4 point correspondences to find it. We first select 4 points in the input image (no three of them collinear) and map them to the desired locations in the output image according to the use case. Each correspondence gives 2 equations, so we end up with 8 equations in 8 unknowns, which can be solved easily.
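The 8 equations can be written down and solved directly with NumPy. The following is only a sketch of the idea, not how OpenCV implements it; the helper names `solve_perspective_matrix` and `apply_matrix` and the sample points are assumptions made for this example.

```python
import numpy as np

def solve_perspective_matrix(src_pts, dst_pts):
    """Solve for the 3x3 matrix from 4 point pairs (hypothetical helper)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # u = (m11*x + m12*y + m13) / (m31*x + m32*y + 1), rearranged to be linear
        A.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        b.append(u)
        # v = (m21*x + m22*y + m23) / (m31*x + m32*y + 1), rearranged likewise
        A.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        b.append(v)
    params = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    # The ninth entry is fixed to 1, which leaves the 8 unknowns solved above
    return np.append(params, 1.0).reshape(3, 3)

def apply_matrix(M, point):
    """Map one (x, y) point through M using homogeneous coordinates."""
    xh, yh, t = M @ np.array([point[0], point[1], 1.0])
    return np.array([xh / t, yh / t])

# Illustrative correspondences: a unit square mapped to an irregular quadrilateral
src = [(0, 0), (0, 1), (1, 1), (1, 0)]
dst = [(0, 0), (0, 2), (3, 3), (2, 0)]
M = solve_perspective_matrix(src, dst)
```

Each point pair contributes two rows to the system, which is why exactly 4 pairs are needed.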

Once the transformation matrix is calculated, we apply the perspective transformation to the entire input image to get the final transformed image. Let’s see how to do this using OpenCV-Python.

OpenCV

OpenCV provides a function cv2.getPerspectiveTransform() that takes as input the 4 pairs of corresponding points and outputs the transformation matrix. The basic syntax is shown below.

transform_mat = cv2.getPerspectiveTransform(src, dst)

# src: coordinates in the source image
# dst: coordinates in the output image

Once the transformation matrix (M) is calculated, pass it to the cv2.warpPerspective() function that applies the perspective transformation to an image. The syntax of this function is given below.

dst = cv2.warpPerspective(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])

# src: input image
# M: 3x3 transformation matrix
# dsize: size of the output image as (width, height)
# flags: interpolation method to be used
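To get a feel for what the warp does, here is a rough NumPy sketch using nearest-neighbour sampling. It follows the usual inverse-mapping idea: for every output pixel, look up where it comes from in the input. The function `warp_perspective_nearest` is a hypothetical helper written for this post; it is not OpenCV's implementation and ignores edge cases such as a zero scale factor.

```python
import numpy as np

def warp_perspective_nearest(img, M, dsize):
    """Nearest-neighbour sketch of a perspective warp (not OpenCV's code)."""
    width, height = dsize
    M_inv = np.linalg.inv(M)
    # Homogeneous coordinates of every output pixel, shape (3, height * width)
    xs, ys = np.meshgrid(np.arange(width), np.arange(height))
    out_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    # Map each output pixel back into the input image
    src = M_inv @ out_pts
    src_x = np.rint(src[0] / src[2]).astype(int)
    src_y = np.rint(src[1] / src[2]).astype(int)
    # Keep only the pixels whose source location lies inside the input image
    inside = ((src_x >= 0) & (src_x < img.shape[1]) &
              (src_y >= 0) & (src_y < img.shape[0]))
    out_flat = np.zeros((height * width,) + img.shape[2:], dtype=img.dtype)
    out_flat[inside] = img[src_y[inside], src_x[inside]]
    return out_flat.reshape((height, width) + img.shape[2:])

# Tiny test image and a pure translation by (+2, +1) as an illustrative matrix
img = np.arange(12).reshape(3, 4)
shift = np.array([[1.0, 0.0, 2.0],
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])
shifted = warp_perspective_nearest(img, shift, (4, 3))
unchanged = warp_perspective_nearest(img, np.eye(3), (4, 3))
```

Note that dsize is given as (width, height), while the returned array has shape (height, width), just like with cv2.warpPerspective().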

Now, let’s take a classic use of the perspective transform: obtaining a top-down, “bird’s-eye view” of an image. Let’s go step by step through the process using the image below.

The image below shows the basic idea of what we will do.

First, we will select the representative points (usually the corner points) for our region of interest (ROI). This can be done manually using matplotlib as shown below.

import cv2
import numpy as np
import matplotlib.pyplot as plt

# IPython/Jupyter magic: open matplotlib figures in an interactive Qt window
%matplotlib qt5

# Load the image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('D:/downloads/deco.jpg')

# Create a copy of the image
img_copy = np.copy(img)

# Convert from BGR to RGB so the colors display correctly in matplotlib.
# In the matplotlib window we can easily read off the coordinates
# of the 4 points that are needed to find the transformation matrix.
img_copy = cv2.cvtColor(img_copy, cv2.COLOR_BGR2RGB)

plt.imshow(img_copy)

The above code opens up an interactive window. Use the mouse pointer to obtain the 4 corner coordinates, which are shown below. The points are ordered anti-clockwise, as shown in the above figure.

# All points are in format [cols, rows]
pt_A = [41, 2001]
pt_B = [2438, 2986]
pt_C = [3266, 371]
pt_D = [1772, 136]
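Since the order of the points matters for the mapping, it can be worth double-checking it. One possible check is the shoelace formula, sketched below in plain Python. The helper names are made up for this example. Keep in mind that image rows grow downwards, so the sign convention is the opposite of the usual one from maths class.

```python
def signed_area(points):
    """Shoelace formula: signed area of a polygon given as [col, row] vertices."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

def is_anticlockwise_on_screen(points):
    # Image rows grow downwards, so the usual sign convention is flipped:
    # a negative shoelace area means anti-clockwise as seen on screen.
    return signed_area(points) < 0

# The four corner points picked above, in the order A, B, C, D
pts = [[41, 2001], [2438, 2986], [3266, 371], [1772, 136]]
area = signed_area(pts)
```

For these four points the area comes out negative, which agrees with the anti-clockwise order A, B, C, D.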

Then we will map these 4 points to the desired locations in the output image according to the use case. Since we used the corner points, we will derive the width and height of the output image from the distances between them, as shown below (you can also specify the mapping manually). Here I have used the L2 norm, but you can use the L1 norm as well.

# Here, I have used L2 norm. You can use L1 also.
width_AD = np.sqrt(((pt_A[0] - pt_D[0]) ** 2) + ((pt_A[1] - pt_D[1]) ** 2))
width_BC = np.sqrt(((pt_B[0] - pt_C[0]) ** 2) + ((pt_B[1] - pt_C[1]) ** 2))
maxWidth = max(int(width_AD), int(width_BC))


height_AB = np.sqrt(((pt_A[0] - pt_B[0]) ** 2) + ((pt_A[1] - pt_B[1]) ** 2))
height_CD = np.sqrt(((pt_C[0] - pt_D[0]) ** 2) + ((pt_C[1] - pt_D[1]) ** 2))
maxHeight = max(int(height_AB), int(height_CD))
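If you want to switch between the L2 and L1 norm, one option is np.linalg.norm with its ord argument. This is just a compact sketch; the helper `side_length` is a name made up for this example, and only the side AD is shown.

```python
import numpy as np

def side_length(p, q, order=2):
    """Distance between two [col, row] points; order=2 is L2, order=1 is L1."""
    return np.linalg.norm(np.subtract(p, q), ord=order)

pt_A, pt_D = [41, 2001], [1772, 136]

width_l2 = int(side_length(pt_A, pt_D))            # Euclidean length of AD
width_l1 = int(side_length(pt_A, pt_D, order=1))   # |dx| + |dy| for AD
```

The L1 value is never smaller than the L2 value, so using L1 on slanted edges gives a larger output image.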

Now specify the mapping as shown in the image below (same image as above).

input_pts = np.float32([pt_A, pt_B, pt_C, pt_D])
output_pts = np.float32([[0, 0],
                        [0, maxHeight - 1],
                        [maxWidth - 1, maxHeight - 1],
                        [maxWidth - 1, 0]])

Now, compute the transformation matrix (M) using the cv2.getPerspectiveTransform() function, as shown below.

# Compute the perspective transform M
M = cv2.getPerspectiveTransform(input_pts,output_pts)

After calculating the transformation matrix, apply the perspective transformation to the entire input image to get the final transformed image.

out = cv2.warpPerspective(img,M,(maxWidth, maxHeight),flags=cv2.INTER_LINEAR)

Below is the transformed image.

That’s all for this blog. In the next blog, we will discuss how to automatically choose the 4 points using contours. Hope you enjoyed reading.

If you have any doubts or suggestions, please feel free to ask and I will do my best to help or improve myself. Goodbye until next time.

TheAILearner
