What is OCR & why it makes your life easier
Optical character recognition, or OCR, is the process of mechanically or electronically converting scanned images of handwritten, typed or printed text into machine-encoded text.
In this blog article, you’ll learn about:
- What the heck is OCR
- How optical character recognition works, explained for non-techies
- Why OCR is the new marketing gadget
Just keep on reading and you will get the answers you’re looking for and not end up confused.
Explaining a complex technology can result in a text that is horrible to read: full of technical vocabulary, confusing explanations and badly chosen examples. We cannot describe OCR without using any terminology, but we will try to keep it to a minimum. The good news is that you do not need to be a hardcore techie to learn what OCR is and how it works.
What is OCR?
As already mentioned, OCR stands for optical character recognition. The technology deals with the problem of recognizing all different kinds of characters. Both handwritten and printed characters can be recognized and converted into machine-readable text.
Think of any kind of serial number or code consisting of numbers and letters that you need digitized. By using OCR you can transform those codes into digital output. The technology makes use of several techniques. Put very simply, the captured image is preprocessed, and the characters are then extracted and recognized. These techniques are covered in more detail in the section "Different techniques of OCR".
What OCR does not take into account is the actual nature of the object that you want to scan. It simply “takes a look” at the text that you aim to transform. If you want the device to recognize both the nature of the object and the text on it, you need to combine different technologies. Take a look at what you can do by combining OCR and augmented reality, for example.
Different techniques of OCR
Let’s have a look at three steps of optical character recognition: image preprocessing, character recognition itself and the post-processing of the output.
OCR software often preprocesses images to improve the chances of successful recognition. The aim of image preprocessing is to improve the image data: unwanted distortions are suppressed and specific image features are enhanced, both of which matter for further processing.
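One common preprocessing step is binarization, which separates dark "ink" pixels from the lighter background. The following is a minimal sketch in plain Python, not the method of any particular OCR engine. It assumes the image is already available as rows of grayscale values from 0 to 255; the function name `binarize` and the mean-based default threshold are illustrative choices.

```python
def binarize(gray, threshold=None):
    """Turn a grayscale image (rows of 0-255 ints) into 0/1 pixels.

    1 marks dark "ink" pixels, 0 marks light background.
    If no threshold is given, the mean brightness is used. That is an
    assumption made for simplicity; real systems often use adaptive methods.
    """
    pixels = [p for row in gray for p in row]
    if threshold is None:
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


# A tiny 3x4 "scan": one dark vertical stroke on a light, slightly noisy background.
scan = [
    [250, 30, 245, 240],
    [235, 25, 238, 251],
    [244, 40, 249, 236],
]
binary = binarize(scan)
```

After this step the small brightness variations in the background are gone and only the stroke remains, which is the kind of "suppressed distortion" the later recognition steps benefit from.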
For the actual character recognition part, it is important to understand what feature extraction is. When the input data to an algorithm is too large to be processed directly, only a reduced set of features is selected. The selected features are expected to carry the important information, while those suspected to be redundant are discarded. Working with the reduced set of data instead of the full input makes the processing faster.
For the process of OCR this is important because the algorithm has to detect specific portions or shapes of a digitized image or video stream.
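To make the idea of a "reduced set of features" concrete, here is a small sketch of one simple technique, often called zoning: the character bitmap is split into a grid and each cell is summarized by its share of ink pixels. This is only an illustration under stated assumptions. The function name `zoning_features` is made up for this example, and the bitmap is assumed to be square with a side length divisible by the number of zones.

```python
def zoning_features(bitmap, zones=2):
    """Split a square 0/1 bitmap into zones x zones cells and return the
    fraction of ink pixels in each cell, row by row.

    Assumes the side length of the bitmap is divisible by `zones`.
    """
    size = len(bitmap)
    step = size // zones
    features = []
    for zr in range(zones):
        for zc in range(zones):
            cell = [
                bitmap[r][c]
                for r in range(zr * step, (zr + 1) * step)
                for c in range(zc * step, (zc + 1) * step)
            ]
            features.append(sum(cell) / len(cell))
    return features


# A 4x4 bitmap of a crude "L": 16 pixels are reduced to 4 numbers.
letter_l = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
features = zoning_features(letter_l)
```

A classifier would then compare these few numbers against the values it has learned for each character instead of looking at every single pixel.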
Post-processing is an error correction step that helps to keep the accuracy of OCR high. The accuracy can be further improved if the output is restricted by a lexicon. That way the algorithm can fall back on a list of words that are allowed to occur in the scanned document, for example.
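A lexicon fallback can be sketched with nothing more than Python's standard `difflib` module. This is a hedged illustration rather than how a production OCR engine does it: the function name `correct_with_lexicon`, the sample lexicon and the similarity cutoff of 0.6 are all assumptions chosen for the example.

```python
import difflib


def correct_with_lexicon(word, lexicon, cutoff=0.6):
    """Return the closest lexicon entry for an OCR'd word.

    If no entry is similar enough (below `cutoff`), the word is
    returned unchanged instead of being forced onto a wrong match.
    """
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Hypothetical lexicon for an invoice scanner.
lexicon = ["invoice", "total", "amount"]

fixed = correct_with_lexicon("inv0ice", lexicon)   # the "o" was misread as a zero
untouched = correct_with_lexicon("xyz", lexicon)   # nothing similar in the lexicon
```

Returning the original word when nothing matches is a deliberate choice here: a lexicon should correct near misses, not overwrite text it knows nothing about.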
Depending on the application, OCR is used not only for proper words but also for numbers and codes. To deal better with different types of input, OCR providers started to develop specialized OCR systems that can handle these particular kinds of images. To further improve the recognition accuracy, they combined various optimization techniques, for example business rules, standard expressions or the rich information contained in color images. The strategy of merging various optimization techniques is called “application oriented OCR” or “customized OCR”. It is used in fields like business card OCR, invoice OCR or ID card OCR.
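As a rough idea of what a business rule can look like in code, the sketch below assumes a made-up code format of three uppercase letters followed by four digits. Because the expected format is known, characters that are easily confused (such as "O" and "0") can be corrected by position, and anything that still does not fit is rejected. The format, the confusion table and the name `apply_code_rule` are hypothetical and only serve the illustration.

```python
import re

# Characters that are easily confused in OCR output (illustrative, not exhaustive).
TO_DIGIT = {"O": "0", "I": "1", "S": "5", "B": "8"}
TO_LETTER = {digit: letter for letter, digit in TO_DIGIT.items()}

# Hypothetical business rule: three uppercase letters followed by four digits.
CODE_PATTERN = re.compile(r"[A-Z]{3}[0-9]{4}")


def apply_code_rule(raw):
    """Repair position-dependent confusions, then validate the format.

    Returns the cleaned code, or None if it still breaks the rule.
    """
    text = raw.strip().upper()
    if len(text) != 7:
        return None
    # The first three positions must be letters, the last four must be digits.
    letters = "".join(TO_LETTER.get(ch, ch) for ch in text[:3])
    digits = "".join(TO_DIGIT.get(ch, ch) for ch in text[3:])
    candidate = letters + digits
    return candidate if CODE_PATTERN.fullmatch(candidate) else None


repaired = apply_code_rule("A8C12O4")   # "8" read instead of "B", "O" instead of "0"
rejected = apply_code_rule("ABC12X4")   # "X" cannot be a digit, so the scan is rejected
```

Rejecting a scan that breaks the rule gives the app a chance to ask for another attempt instead of passing on a wrong code.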
Possibilities using OCR
Optical character recognition software can be used in a wide range of situations. As already mentioned, OCR can be combined with technologies like augmented reality, but the technology is already very powerful on its own.
Here are a few examples of possible use cases for OCR software:
Passports and IDs have a machine readable zone (MRZ) that can be scanned. OCR can speed up the process of identifying and registering people at borders or other checkpoints. It is therefore useful for immigration officers and other security personnel.
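One reason the MRZ suits OCR so well is that several of its fields carry a check digit, so a misread character can often be detected. The sketch below follows the check digit scheme described in ICAO Doc 9303 (repeating weights 7, 3, 1, letters valued 10 to 35, the filler character "<" valued 0). The function name `mrz_check_digit` is an assumption for this example, and the sample values come from the widely published ICAO specimen passport.

```python
def mrz_check_digit(field):
    """Compute the check digit for one MRZ field.

    Digits count as their value, A-Z as 10-35 and the filler '<' as 0.
    Each value is multiplied by the repeating weights 7, 3, 1 and the
    sum is taken modulo 10.
    """
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10
        elif ch == "<":
            value = 0
        else:
            raise ValueError(f"character not allowed in an MRZ: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


# Document number and date of birth from the ICAO specimen passport.
document_check = mrz_check_digit("L898902C3")
birth_date_check = mrz_check_digit("740812")
```

An app can compare the computed digit with the one printed in the MRZ and ask the user to scan again if they differ.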
There are a lot of innovative mobile marketing campaigns out there. Many companies use codes to engage their customers in a little competition. Think of all the voucher codes that customers can redeem by typing them in, or the numbers printed on the inside of a bottle cap that you need to collect. All those campaigns can make use of OCR by integrating the software into an app that often already exists. That way they lower the hurdle of online registration and spare customers from typing in a series of numbers and letters.
Karlsberg, for example, used OCR in one of its marketing campaigns.
The International Bank Account Number (IBAN) serves to identify bank accounts across borders. IBANs come in different lengths and can consist of numbers as well as letters. To ease cross-border transactions, banking apps can integrate OCR software. That way their customers can scan an IBAN instead of tediously typing it in.
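A scanned IBAN can be sanity-checked before it is used, because the standard includes two check digits validated with a modulo 97 calculation. The following sketch only checks the overall length range and the checksum, not the country-specific formats. The function name `is_valid_iban` is an assumption, and the sample number is the commonly cited example IBAN for the United Kingdom.

```python
def is_valid_iban(iban):
    """Check an IBAN with the mod-97 checksum.

    Only the general length range (15 to 34 characters) and the
    checksum are verified; country-specific rules are not.
    """
    compact = iban.replace(" ", "").upper()
    if not (15 <= len(compact) <= 34):
        return False
    if not (compact.isascii() and compact.isalnum()):
        return False
    # Move country code and check digits to the end, then map
    # letters to numbers (A=10 ... Z=35) and digits to themselves.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


scanned_ok = is_valid_iban("GB82 WEST 1234 5698 7654 32")
scanned_bad = is_valid_iban("GB82 WEST 1234 5698 7654 33")  # last digit misread
```

If the check fails, the app can prompt the user to scan the IBAN again rather than submitting a transfer to a mistyped account.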
There are a lot of optical character recognition products that specialize in one specific use case, for example credit card scanning or document scanning. But OCR can be useful in many different parts of our lives, so it is inconvenient to need a different piece of software for every use case.
Tesseract is an open source OCR engine that has gained popularity among OCR developers. Even though it can sometimes be painful to implement and modify, for a long time there were not many free and powerful OCR alternatives on the market.
Anyline offers an OCR SDK that you can also download for free and which, in contrast to Tesseract, works perfectly on mobile.