In this blog, we will discuss image translation, one of the most basic geometric transformations performed on images. So, let’s get started.
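Translation is simply the shifting of an object’s location. Suppose we have a point P(x, y) which is translated by (tx, ty). Then the coordinates after translation, denoted by P'(x', y'), are given by

$$x' = x + t_x, \qquad y' = y + t_y$$

Writing the point in the form [x, y, 1], the same relation can be expressed with a 2x3 transformation matrix M:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$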
So, we just need to create the transformation matrix (M) and then we can translate any point as shown above. That’s the basic idea behind translation. So, let’s first discuss how to do image translation using numpy for better understanding, and then we will see a more sophisticated implementation using OpenCV.
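As a quick check of the idea, here is a minimal sketch that translates a single point with the matrix M. The sample point (20, 30) is an arbitrary value chosen for illustration, not something from the original post.

```python
import numpy as np

# 2x3 translation matrix for a shift of (tx, ty) = (100, 50)
tx, ty = 100, 50
M = np.float32([[1, 0, tx],
                [0, 1, ty]])

# An arbitrary sample point P(x, y), written in the form [x, y, 1]
P = np.array([20, 30, 1], dtype=np.float32)

# The matrix-vector product gives the translated point P'(x', y'):
# x' = 20 + 100 = 120 and y' = 30 + 50 = 80
P_new = M @ P
print(P_new)
```

The trailing 1 in [x, y, 1] is what lets the third column of M add the offsets tx and ty.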
Numpy
First, let’s create the transformation matrix (M). This can be easily done using numpy as shown below. Here, the image is translated by (100, 50), that is, 100 pixels along x and 50 pixels along y.

```python
import numpy as np
M = np.float32([[1, 0, 100], [0, 1, 50]])
```
Next, let’s convert the image coordinates to the form [x,y,1]. This can be done as follows.

```python
# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2, -1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))
```
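To see what these two arrays look like, the same calls can be run on a tiny grid. The 2x3 size below is a made-up value chosen only to keep the output readable.

```python
import numpy as np

rows, cols = 2, 3  # hypothetical tiny image: 2 rows (height), 3 columns (width)

# Row 0 of the result holds the x values (0..cols-1),
# row 1 holds the y values (0..rows-1)
orig_coord = np.indices((cols, rows)).reshape(2, -1)
# orig_coord is:
# [[0 0 1 1 2 2]
#  [0 1 0 1 0 1]]

# Append a row of ones so each column becomes [x, y, 1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows * cols)))
print(orig_coord_f.shape)  # (3, 6)
```

Each column is one pixel coordinate, so the whole image can be transformed with a single matrix product.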
Now apply the transformation by multiplying the transformation matrix with coordinates.
```python
transform_coord = np.dot(M, orig_coord_f)
# Change into int type (np.int was removed in NumPy 1.24, so use the built-in int)
transform_coord = transform_coord.astype(int)
```
Keep only the coordinates that fall within the image boundary.
```python
indices = np.all((transform_coord[1] < rows, transform_coord[0] < cols,
                  transform_coord[1] >= 0, transform_coord[0] >= 0), axis=0)
```
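The boundary test can be tried on a handful of hand-picked points. The image size and the three points below are assumptions chosen so that one point falls inside and two fall outside.

```python
import numpy as np

rows, cols = 4, 5  # hypothetical image size (height, width)

# Three transformed points stored as columns: row 0 is x, row 1 is y
transform_coord = np.array([[2, 5, -1],
                            [3, 1,  0]])

# A point is kept only if all four conditions hold at once
inside = np.all((transform_coord[1] < rows,
                 transform_coord[0] < cols,
                 transform_coord[1] >= 0,
                 transform_coord[0] >= 0), axis=0)
print(inside)
```

Only the first point (2, 3) passes: the second has x equal to cols, and the third has a negative x.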
Now, create a zeros image similar to the original image and project all the points onto the new image.
```python
img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = \
    img[orig_coord[1][indices], orig_coord[0][indices]]
```
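The projection step relies on NumPy integer-array indexing, which can be sketched on a toy array. The 2x3 array and the shift of (1, 0) are illustrative values only.

```python
import numpy as np

img = np.arange(1, 7).reshape(2, 3)  # toy 2x3 "image": [[1 2 3], [4 5 6]]

# Two source pixels (x, y) and where they land after a shift of (1, 0)
src_x, src_y = np.array([0, 1]), np.array([0, 0])
dst_x, dst_y = src_x + 1, src_y

out = np.zeros_like(img)
# Arrays are indexed as [row, col], which means [y, x]
out[dst_y, dst_x] = img[src_y, src_x]
print(out)
```

This is why the y row of the coordinates is passed first in the indexing expression, even though the coordinates themselves are stored as (x, y).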
Display the final image.
```python
out = cv2.hconcat([img, img1])
cv2.imshow('a1', out)
cv2.waitKey(0)
```
The full code can be found below.

```python
import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows, cols, _ = img.shape

# Create the transformation matrix
M = np.float32([[1, 0, 100], [0, 1, 50]])

# get the coordinates in the form of (0,0),(0,1)...
# the shape is (2, rows*cols)
orig_coord = np.indices((cols, rows)).reshape(2, -1)
# stack the rows of 1 to form [x,y,1]
orig_coord_f = np.vstack((orig_coord, np.ones(rows*cols)))
transform_coord = np.dot(M, orig_coord_f)
# Change into int type (np.int was removed in NumPy 1.24, so use the built-in int)
transform_coord = transform_coord.astype(int)
# Keep only the coordinates that fall within the image boundary.
indices = np.all((transform_coord[1] < rows, transform_coord[0] < cols,
                  transform_coord[1] >= 0, transform_coord[0] >= 0), axis=0)
# Create a zeros image and project the points
img1 = np.zeros_like(img)
img1[transform_coord[1][indices], transform_coord[0][indices]] = \
    img[orig_coord[1][indices], orig_coord[0][indices]]
# Display the image
out = cv2.hconcat([img, img1])
cv2.imshow('a2', out)
cv2.waitKey(0)
```
Below is the output. Here, the left image is the original and the right one is the translated image.
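For whole-pixel shifts, the same result can also be obtained with plain array slicing. This is a different technique from the coordinate-mapping code above, not the method used in this post. The helper name `translate_by_slicing` is hypothetical, and the sketch assumes tx and ty are integers.

```python
import numpy as np

def translate_by_slicing(img, tx, ty):
    """Shift img by integer (tx, ty); uncovered pixels are filled with zeros."""
    rows, cols = img.shape[:2]
    out = np.zeros_like(img)
    if abs(tx) >= cols or abs(ty) >= rows:
        return out  # the image is shifted completely out of frame
    # Matching destination and source windows along x and y
    dst_x = slice(max(tx, 0), cols + min(tx, 0))
    src_x = slice(max(-tx, 0), cols - max(tx, 0))
    dst_y = slice(max(ty, 0), rows + min(ty, 0))
    src_y = slice(max(-ty, 0), rows - max(ty, 0))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

# Toy 3x4 array standing in for an image
img = np.arange(12).reshape(3, 4)
shifted = translate_by_slicing(img, 1, 1)   # one pixel right, one pixel down
shifted_left = translate_by_slicing(img, -1, 0)  # one pixel left
print(shifted)
```

Slicing avoids building a coordinate array for every pixel, but it only covers pure translation, whereas the matrix form extends to other affine transformations.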
OpenCV-Python
Now, let’s discuss how to translate images using OpenCV-Python.
OpenCV provides a function cv2.warpAffine() that applies an affine transformation to an image. You just need to provide the transformation matrix (M). The basic syntax for the function is given below.
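```python
dst = cv2.warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]])

# src: input image
# M: 2x3 transformation matrix
# dsize: size of the output image, given as (width, height)
# flags: interpolation method to be used
# borderMode: pixel extrapolation method (default: cv2.BORDER_CONSTANT)
# borderValue: value used for a constant border (default: 0)
```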
Below is sample code in which the image is translated by (100, 50).

```python
import numpy as np
import cv2

# Read an image
img = cv2.imread('D:/downloads/opencv_logo.PNG')
rows, cols, _ = img.shape

# Create the transformation matrix
M = np.float32([[1, 0, 100], [0, 1, 50]])

# Pass it to warpAffine function; dsize is (width, height)
dst = cv2.warpAffine(img, M, (cols, rows))

# Display the concatenated image
out = cv2.hconcat([img, dst])
cv2.imshow('img', out)
cv2.waitKey(0)
```
Below is the output. Here, the left image is the original and the right one is the translated image.
Compare the outputs of both implementations. That’s all for image translation. In the next blog, we will discuss another geometric transformation, rotation, in detail. Hope you enjoyed reading.
If you have any doubts/suggestions please feel free to ask and I will do my best to help or improve myself. Good-bye until next time.