How we simplified OCR Parameters
Optical Character Recognition, or OCR, is a complex topic. It involves many different parts, such as image preprocessing, font training, and above all, testing and development. If you want to refresh your knowledge of OCR, check out our article What is OCR and why it makes your life easier. Our developers work on these complex parts all the time, testing and developing in repeated cycles to make text recognition better every day.
In particular, they spend a lot of their time making our Anyline OCR SDK Module better.
This module allows you to build an OCR use case of your own, such as scanning voucher codes, IBANs, serial numbers and many other things. For it, we designed a system that automatically adjusts all preprocessing, OCR and validation parameters based on just a few simple settings that you set in the SDK to raise the scan quality.
In the following, I will guide you through these parameters so you know how you can adapt everything to your needs in no time!
scanMode: How are the characters arranged?
Line: The LINE mode is optimal for scanning one or more lines of variable length and/or font (like IBANs or addresses).
Grid: The GRID mode is optimal for characters with equal size laid out in a grid, with a constant font, background and character count.
charHeight: How tall are the characters?
minCharHeight: Defines the minimum height, in pixels, of a character to scan.
maxCharHeight: Defines the maximum height, in pixels, of a character to scan.
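To make the effect of the two height settings concrete, here is a minimal sketch in Python. It is not the SDK's implementation: the function name, the sample heights and the assumption that both bounds are inclusive are ours, chosen only for illustration.

```python
def within_char_height(height_px: int, min_char_height: int, max_char_height: int) -> bool:
    """Return True if a character's pixel height lies inside the configured range.

    Assumption: both bounds are inclusive. The SDK may treat them differently.
    """
    return min_char_height <= height_px <= max_char_height


# Hypothetical heights (in pixels) of character candidates found in an image.
candidate_heights = [12, 25, 40, 80]

# Keep only candidates between minCharHeight = 20 and maxCharHeight = 60.
kept = [h for h in candidate_heights if within_char_height(h, 20, 60)]
print(kept)  # [25, 40]
```

Roughly speaking, a tighter range gives the scanner fewer candidates to consider, so it is worth matching the range to the size the text actually has in your camera image.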
validationRegex: Which format do the characters have?
The validationRegex defines a Regular Expression which the detected result is validated against.
The value is a regex string used to validate the result. Results that do not match it are not returned.
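The idea of validating a result against a regex can be sketched in a few lines of Python. This is an illustration only: the helper name and the voucher pattern are hypothetical, and we assume the whole result has to match the expression, which may not be exactly how the SDK applies it.

```python
import re
from typing import Optional


def validate_result(text: str, validation_regex: str) -> Optional[str]:
    """Return the text if it matches the regex completely, otherwise None.

    Assumption: the entire result must match (full match, not a partial search).
    """
    return text if re.fullmatch(validation_regex, text) else None


# Hypothetical format: a voucher code of exactly 8 uppercase letters or digits.
VOUCHER_REGEX = r"[A-Z0-9]{8}"

accepted = validate_result("AB12CD34", VOUCHER_REGEX)  # matches the format
rejected = validate_result("AB12-CD3", VOUCHER_REGEX)  # contains a "-"
print(accepted, rejected)
```

The stricter the expression describes your format, the fewer misread results can slip through.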
tesseractLanguages: Which languages are allowed?
The OCR part of the SDK relies on so-called traineddata files, which are specific to a font and language. This parameter tells the module which traineddata file to use when performing the OCR.
Generic: The generic languages to use for the OCR.
Customized: To use a customized training file, set the value to “custom”.
charWhiteList: Which characters are allowed?
The charWhiteList defines a whitelist of characters that are allowed in a result. Setting this parameter carefully improves the accuracy of the result, and if you use it together with the validationRegex, it helps prevent incorrect results.
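Why a whitelist helps can be shown with a small Python sketch. It checks finished readings after the fact, which is a simplification: an OCR engine would normally use the whitelist while recognizing the characters. The function name and the sample readings are assumptions made for this example.

```python
def only_whitelisted(text: str, char_whitelist: str) -> bool:
    """Return True if every character of the text appears in the whitelist."""
    allowed = set(char_whitelist)
    return all(ch in allowed for ch in text)


# Hypothetical use case: a numeric code, so only digits are allowed.
DIGITS_ONLY = "0123456789"

# Two possible readings of the same text: the letter "O" versus the digit "0".
readings = ["1O45", "1045"]

plausible = [r for r in readings if only_whitelisted(r, DIGITS_ONLY)]
print(plausible)  # ['1045']
```

With the letter "O" excluded, the ambiguous reading cannot occur at all, which is the kind of mix-up between similar-looking characters that a carefully chosen whitelist avoids.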
Additional Settings in Line Mode
When you are working in Line mode, there are additional parameters that help improve your result.
removeSmallContours, if set to true, means that small contours in the text are not considered during the scanning process.
minSharpness defines the minimum sharpness an image must have to be processed further in the SDK.
removeWhitespaces, if set to true, removes all whitespace from the result.
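As an illustration of what removing all whitespace means for a scanned line, here is a short Python sketch. The function and the sample IBAN-like string are hypothetical. The sketch only mirrors the described behaviour and is not taken from the SDK.

```python
def remove_whitespaces(text: str) -> str:
    """Remove every whitespace character (spaces, tabs, line breaks) from the text."""
    return "".join(text.split())


# Hypothetical scan result: an IBAN-like string printed in groups of four.
scanned = "AT61 1904 3002 3457 3201"

compact = remove_whitespaces(scanned)
print(compact)  # AT611904300234573201
```

This matters when you combine the setting with a validationRegex: if whitespace is removed, the expression should describe the compact form without spaces.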
Additional Settings in Grid Mode
CharCountX / CharCountY define the number of symbols in the horizontal or vertical direction in the grid.
CharPaddingXFactor / CharPaddingYFactor define the average horizontal or vertical distance between two characters, measured as a percentage of the character's width.
isBrightTextOnDark, if set to true, means that the SDK looks for bright symbols on a dark background. If you set it to false, the SDK looks for dark symbols on a bright background.
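The grid settings are easiest to understand with a little arithmetic. The following Python sketch is an illustration under stated assumptions: the helper names are ours, the padding factor is taken as a value on a 0 to 100 percent scale, and the grid is represented as a list of text rows. None of this is confirmed SDK behaviour.

```python
from typing import List


def padding_in_pixels(char_width_px: float, padding_factor_percent: float) -> float:
    """Convert a padding factor (percent of the character width) into pixels.

    Assumption: the factor is given on a 0 to 100 scale.
    """
    return char_width_px * padding_factor_percent / 100.0


def matches_grid(rows: List[str], char_count_x: int, char_count_y: int) -> bool:
    """Return True if there are char_count_y rows with char_count_x characters each."""
    return len(rows) == char_count_y and all(len(row) == char_count_x for row in rows)


# Hypothetical grid: 5 characters per row, 2 rows, characters 40 px wide.
gap_px = padding_in_pixels(char_width_px=40, padding_factor_percent=50)
print(gap_px)  # 20.0

print(matches_grid(["A1B2C", "3D4E5"], char_count_x=5, char_count_y=2))  # True
print(matches_grid(["A1B2C", "3D4E"], char_count_x=5, char_count_y=2))   # False
```

Because the character count is constant in Grid mode, a result with a missing or extra symbol can be recognized as incomplete simply by its shape.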
I hope this was easy to follow. If anything is still unclear, you can check out our documentation or just reach out to us! If you need further help with the implementation of the Anyline SDK for your own use case, check out the videos on How to integrate the Anyline SDK for Android and How to Import the Anyline Example Project into Android Studio!
If you’ve read this and you’re curious about the Anyline SDK, go ahead: download it and try it out for free!