Optical Character Recognition: Introduction and its Applications

Hello, and welcome to this series on Optical Character Recognition, or OCR for short. Many of you may already be familiar with the term; if not, no worries, we will discuss everything in detail in this series. In this blog, let’s start with an introduction to OCR, followed by some motivation for why you should invest your time in learning it. So, let’s get started.

What is OCR?

Optical character recognition is a method of converting the text present in images or scanned documents to a machine-readable format that can later be edited, searched, and used for further processing.

The term machine-readable format means text in electronic form, or simply text that you can select, edit, and process. Let’s take an example to understand what this actually means.

Suppose we are given the image below. As we can clearly see, there is some text in the image. But for a computer, this is nothing but an array of pixel values.

The computer doesn’t know whether the image contains text, a car, a bus, or anything else. We can’t select, edit, or do any further processing on the text. Thus, this is not machine-readable text.
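To make the "array of pixel values" idea concrete, here is a minimal sketch in plain Python. The 5x5 grid and the names `image` and `render` are made up for illustration; a real photo would be far larger and would normally be loaded with an image library.

```python
# A tiny 5x5 grayscale "image" of the letter T.
# Illustrative values only: 0 = black ink, 255 = white paper.
image = [
    [0,   0,   0,   0,   0],
    [255, 255, 0,   255, 255],
    [255, 255, 0,   255, 255],
    [255, 255, 0,   255, 255],
    [255, 255, 0,   255, 255],
]

height = len(image)
width = len(image[0])

# Count the dark pixels. The computer can do arithmetic like this,
# but nothing here tells it that the shape is the character "T".
ink_pixels = sum(1 for row in image for pixel in row if pixel == 0)


def render(img):
    """Draw the grid as text so a human can see the shape."""
    return "\n".join(
        "".join("#" if pixel < 128 else "." for pixel in row) for row in img
    )


print(f"{width}x{height} image with {ink_pixels} ink pixels")
print(render(image))
```

We humans read the printed shape as a "T" straight away, while the program only ever sees numbers. Bridging that gap is exactly the job of OCR.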

So, what an OCR system does is digitize the printed text: it takes this image as input and outputs a text file containing all the text present in the image. Now you can do anything you want with the text.

So, now you have a rough idea of what OCR is. In the next section, let’s try to understand why you should invest your time in learning it.

Note: For more details on Optical Character Recognition, please refer to the Mastering OCR using Deep Learning and OpenCV-Python course.

Applications

Let’s take a few real-life examples where OCR has made our lives easier. You might have already encountered these in your day-to-day life without thinking about how they work.

Automatic Data Entry

This is one of the most prominent applications of OCR. Earlier, people used to manually enter the details from business documents, invoices, passports, receipts, etc. With the help of OCR, most of these tasks are now automated. Also, instead of managing a colossal pile of paper documents, everything can now be archived digitally.

For instance, in banks, instead of manually entering the cheque details, the cheque is first scanned and then OCR extracts the useful information such as the account number and amount, leading to faster processing. Similarly, at airports, your passport information is extracted from the machine-readable zone (MRZ), which speeds up processing.
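One reason the MRZ suits automatic data entry is that its fields carry check digits, so a misread character can often be caught. The sketch below is not part of any OCR library. It assumes the ICAO Doc 9303 rule (repeating weights 7, 3, 1, letters valued 10 to 35, the filler `<` valued 0, sum taken modulo 10), and the function names are chosen here for illustration.

```python
DIGITS = "0123456789"


def mrz_char_value(ch):
    """Numeric value of one MRZ character (assumed ICAO 9303 mapping)."""
    if ch in DIGITS:
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10, B=11, ..., Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"unexpected MRZ character: {ch!r}")


def mrz_check_digit(field):
    """Weighted sum with repeating weights 7, 3, 1, reduced modulo 10."""
    weights = (7, 3, 1)
    total = sum(
        mrz_char_value(ch) * weights[i % 3] for i, ch in enumerate(field)
    )
    return total % 10


def field_is_consistent(field, printed_check_digit):
    """True if the text read by OCR agrees with the printed check digit."""
    return mrz_check_digit(field) == int(printed_check_digit)


# Document number from the commonly cited ICAO specimen passport.
print(mrz_check_digit("L898902C3"))           # 6
# A single misread character (8 read as 3) no longer matches.
print(field_is_consistent("L398902C3", "6"))  # False
```

A failed check tells the system that the field should be re-scanned or sent for manual review instead of being entered silently.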

Vehicle Number Plate Recognition

Almost everyone has seen or heard about this application. OCR is used to recognise the vehicle registration plate, which can then be used for vehicle tracking, toll collection, etc. This technology was invented in Britain in 1976 but became popular only after the 1990s.

Self-driving cars

Most of you might be wondering how and where OCR is used in self-driving cars. The answer is recognizing traffic signs. The autonomous car uses OCR to read traffic signs and act accordingly. Without this, the self-driving car would pose a risk to both pedestrians and other vehicles on the road.

Book Scanning

OCR is widely used for digitizing scanned documents. For instance, you might have heard of Project Gutenberg, which digitizes and archives cultural works. Most of these items are available free of cost. Similarly, Google Books scans books, converts them to text using OCR, and stores them in its digital database.

For Visually Impaired People

Here, OCR is used to extract the text from a printed page, and text-to-speech is then used to read the extracted text aloud. This approach was first used around 1976.

Your Personal Translator

Suppose you are travelling in a country whose language you don’t speak. You come across a signboard that you are not able to understand. Obviously, you can ask someone. But OCR can also help you out in this situation. Just take a photo of the signboard, run OCR on it (constraint: the language must be known), extract the text, and then use Google Translate or any other translation API to translate it into your native language. Isn’t this cool!
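The steps above can be sketched as a small pipeline. This is only an outline under stated assumptions: `translate_signboard`, `fake_ocr`, and `fake_translate` are hypothetical names, and the two stand-in functions return canned values so the example runs without an OCR engine or a translation service. In practice you would plug in a real OCR engine and a real translation API in their place.

```python
def translate_signboard(image_path, ocr, translate, source_lang, target_lang):
    """Photo -> OCR -> translation.

    `ocr` and `translate` are passed in as callables, so any engine or
    API can be used. The source language must be given, which reflects
    the constraint that the OCR step needs to know the language.
    """
    if not source_lang:
        raise ValueError("the language of the signboard must be known")
    text = ocr(image_path, source_lang).strip()
    if not text:
        return ""  # nothing was recognised, so nothing to translate
    return translate(text, source_lang, target_lang)


# Hypothetical stand-ins, for illustration only.
def fake_ocr(image_path, lang):
    # Pretend the photo of a French signboard was read as "Sortie".
    return " Sortie \n"


def fake_translate(text, source_lang, target_lang):
    # A one-entry "dictionary" in place of a real translation service.
    return {"Sortie": "Exit"}.get(text, text)


result = translate_signboard("signboard.jpg", fake_ocr, fake_translate, "fr", "en")
print(result)  # Exit
```

Keeping the OCR and translation steps as separate, swappable functions mirrors the description above: OCR only extracts the text, and the translation is a separate step done on that text.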

These are just a few of the many applications of OCR. From these applications, we can see that OCR has automated a lot of work that used to be done by hand, which saves time, money, and manpower. I hope these applications have motivated you enough to learn OCR. In the next blog, we will start discussing how OCR works. Till then, have a great time. Hope you enjoyed reading.

If you have any doubts or suggestions, please feel free to ask and I will do my best to help or improve. Goodbye until next time.
