How Anyline Managed to Simplify OCR Parameters
Optical Character Recognition, or OCR, is a complex topic. It consists of many different parts such as preprocessing of the image, font training, and most of all testing and developing. If you want to refresh your knowledge on OCR, check out our article “What is OCR?”. Our developers work on these different and complex parts all the time. They test, develop, test, and develop to make text recognition better every day.
In particular, they spend a lot of their time making our Anyline mobile OCR SDK Module better.
This module allows you to build an OCR use case of your own. It allows you to scan voucher codes, IBANs, serial numbers, and many other things! For this module, we designed a system that automatically adjusts all preprocessing, OCR, and validation parameters based on just a few simple settings that you set in the SDK to raise the scan quality.
In the following, I will guide you through these parameters so you know how you can adapt everything to your needs in no time.
OCR Parameters
scanMode: How are the characters arranged?
Line: The LINE mode is optimal for scanning one or more lines of variable length and/or font (like IBANs or addresses).
Grid: The GRID mode is optimal for characters with equal size laid out in a grid, with a constant font, background, and character count.
charHeight: Which height do the characters have?
minCharHeight: The minCharHeight defines the minimum height, in pixels, of a character to scan.
maxCharHeight: The maxCharHeight defines the maximum height, in pixels, of a character to scan.
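To make the two height limits concrete, here is a minimal Python sketch of how a pixel-height window could filter candidate characters. The function name, the box tuples, and the example values are assumptions made for illustration; this is not the SDK's internal implementation.

```python
from typing import List, Tuple

# Hypothetical bounding box: (x, y, width, height), all in pixels.
Box = Tuple[int, int, int, int]


def filter_by_char_height(
    boxes: List[Box], min_char_height: int, max_char_height: int
) -> List[Box]:
    """Keep only boxes whose height lies within [min_char_height, max_char_height]."""
    return [b for b in boxes if min_char_height <= b[3] <= max_char_height]


# Three made-up candidates: too small, in range, too large.
candidates = [(0, 0, 10, 8), (12, 0, 14, 30), (30, 0, 40, 90)]
kept = filter_by_char_height(candidates, min_char_height=20, max_char_height=60)
print(kept)  # [(12, 0, 14, 30)]
```

A narrow window like this tends to discard noise and oversized shapes, which is one reason to set both values as close to the real character height as your use case allows.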
validationRegex: Which format do the characters have?
The validationRegex defines a regular expression against which the detected result is validated.
Valid Regex: A regex string to validate the result. Invalid results will not be returned.
Example: IBAN Scanning
Invalid Regex: A regex string to validate the result. Invalid results will not be returned.
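As a rough illustration of the IBAN case, the following Python sketch validates a scanned string against a regular expression and drops it if it does not match. The pattern and the helper name are assumptions, not values taken from the SDK: the pattern checks only the general IBAN shape (two letters, two digits, then 11 to 30 alphanumeric characters) and does not verify the checksum.

```python
import re
from typing import Optional

# Assumed pattern: country code, two check digits, 11 to 30 alphanumerics.
# It validates the format only, not the IBAN checksum.
IBAN_REGEX = r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}"


def validate_result(text: str, validation_regex: str) -> Optional[str]:
    """Return the text if it fully matches the regex, otherwise None."""
    return text if re.fullmatch(validation_regex, text) else None


print(validate_result("AT611904300234573201", IBAN_REGEX))  # AT611904300234573201
print(validate_result("AT61-1904", IBAN_REGEX))  # None
```

Returning None for a non-matching string mirrors the behavior described above, where invalid results are not returned at all.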
tesseractLanguages: Which languages are allowed?
The OCR part of the SDK relies on so-called traineddata files, which are specific to a font and language. This parameter tells the module which trained data file to use when performing the OCR.
Generic: The generic languages to use for the OCR.
Customized: To use a customized training file, set the value to “custom”.
charWhiteList: Which characters are allowed?
The charWhiteList defines a whitelist of characters that are allowed in a result. Setting this parameter carefully improves the accuracy of the result. Used together with the validationRegex, it also helps prevent incorrect results.
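The interplay between the whitelist and the regex can be sketched in a few lines of Python. Everything here is hypothetical: the helper names, the uppercase-plus-digits whitelist, and the eight-character voucher pattern are example values chosen for illustration, and the check is applied after recognition rather than inside the OCR engine.

```python
import re
import string


def passes_whitelist(text: str, char_whitelist: str) -> bool:
    """True if every character of the text appears in the whitelist."""
    allowed = set(char_whitelist)
    return all(ch in allowed for ch in text)


def accept(text: str, char_whitelist: str, validation_regex: str) -> bool:
    """Accept a result only if it passes both the whitelist and the regex."""
    return (
        passes_whitelist(text, char_whitelist)
        and re.fullmatch(validation_regex, text) is not None
    )


# Assumed example values for a voucher-code use case.
WHITELIST = string.ascii_uppercase + string.digits
VOUCHER_REGEX = r"[A-Z0-9]{8}"

print(accept("AB12CD34", WHITELIST, VOUCHER_REGEX))  # True
print(accept("AB12-D34", WHITELIST, VOUCHER_REGEX))  # False
print(accept("AB12CD3", WHITELIST, VOUCHER_REGEX))   # False
```

The whitelist narrows which characters may appear at all, while the regex constrains their order and count, so the two checks complement each other.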
Additional Settings in Line Mode
When you are working with the Line Mode, there are additional parameters that help improve your result. If removeSmallContours is set to true, small contours in the text are ignored during the scanning process.
minSharpness defines the minimum sharpness an image must have to be processed further in the SDK.
If removeWhitespaces is set to true, all whitespace is removed from the result.
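Two of the Line Mode settings above can be illustrated with a small Python sketch: a sharpness gate and whitespace removal. The function name, the numeric sharpness score, and its scale are assumptions; how the SDK actually measures sharpness is not described here, and removeSmallContours is left out because it operates on image data.

```python
from typing import Optional


def process_line_result(
    text: str,
    sharpness: float,
    min_sharpness: float,
    remove_whitespaces: bool,
) -> Optional[str]:
    """Drop results from frames that are too blurry, then optionally strip whitespace."""
    if sharpness < min_sharpness:
        # The frame does not meet the required sharpness, so it is skipped.
        return None
    if remove_whitespaces:
        # str.split() without arguments splits on any run of whitespace.
        return "".join(text.split())
    return text


print(process_line_result("AT61 1904 3002", 70.0, 50.0, True))   # AT6119043002
print(process_line_result("AT61 1904 3002", 70.0, 50.0, False))  # AT61 1904 3002
print(process_line_result("AT61 1904 3002", 30.0, 50.0, True))   # None
```

Removing whitespace before validation is handy for codes such as IBANs, which are often printed in groups of four characters.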
Additional Settings in Grid Mode
CharCountX / CharCountY define the number of symbols in the horizontal or vertical direction of the grid.
CharPaddingXFactor / CharPaddingYFactor define the average horizontal or vertical distance between two characters, measured as a percentage of the character’s width.
isBrightTextOnDark means that the SDK looks for bright symbols on a dark background. If you set it to false, the SDK looks for dark symbols on a bright background.
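To show how the grid parameters relate to each other, here is a short Python sketch that derives the expected number of characters and the extent of one row from the counts and the padding factor. The function names and the sample numbers are assumptions, and the formula reflects one plausible reading of the padding factor (gap = character size times factor divided by 100), not a documented SDK calculation.

```python
def expected_char_count(char_count_x: int, char_count_y: int) -> int:
    """Total number of characters in a grid of the given dimensions."""
    return char_count_x * char_count_y


def grid_extent(char_count: int, char_size: float, padding_factor: float) -> float:
    """Extent of one row or column in pixels.

    padding_factor is read as a percentage of the character size, so a value
    of 50 means the gap between two characters is half a character wide.
    """
    gap = char_size * padding_factor / 100.0
    return char_count * char_size + (char_count - 1) * gap


print(expected_char_count(4, 3))   # 12
print(grid_extent(4, 20.0, 50.0))  # 110.0
```

With four characters of 20 pixels each and a padding factor of 50, the row spans 80 pixels of characters plus three gaps of 10 pixels.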
I hope this was easy to follow. If anything is still unclear, you can check out the Anyline Documentation or just reach out to us!
If you’ve read this and you’re curious about the Anyline mobile scanning SDK, go ahead and get hands-on experience with our demo app.