Tesseract versus Anyline

More and more developers are looking for ways to add OCR features to their mobile applications. Ever since Tesseract was released as an open source OCR engine, with Google sponsoring its development, it has been the go-to OCR solution. Although it was painful to implement and modify, there weren’t many free and powerful OCR alternatives on the market. Despite the lack of proper documentation, the difficult implementation on mobile devices and the need to pre-process all images, Tesseract gained popularity among OCR developers.
It’s time, however, to make OCR implementation a bit less painful and a bit more fun. With the Anyline® SDK, you receive more support, automated image pre-processing and the flexibility to develop for a variety of use cases.

Below you can find more specific differences, but don’t take our word for it. Download the Anyline® SDK and the AnyOCR Font, start implementing your own OCR feature and see for yourself!

Overview

TESSERACT / ANYLINE
Multi-Platform Support
Easy Integration
Out of the Box Solution
Detailed Documentation
Automatic Image Pre-Processing
Explicitly for Mobile
Easy Customization

Tesseract

Multi-platform Support

Tesseract and Anyline can both be integrated on multiple platforms like iOS, Android or Cordova. Both can be customized for various use cases with similar results. However, Tesseract requires a deeper understanding of image processing and knowledge of Tesseract parameters in order to fine-tune the output of the OCR engine.

Anyline

Multi-platform Support

Anyline is available on several platforms, such as iOS, Android, Cordova and Xamarin. This lets developers download the SDK for their platform of choice. The SDKs for iOS and Android can also be installed through CocoaPods and Maven, respectively.

Preparation Needed

Getting Tesseract and taking the first steps with it looks easy at first glance. There’s a large community of developers who are constantly improving the tool. However, Tesseract takes a lot of time to implement and still lacks many convenience features, such as automatic pre-processing of images.

No Preparation

Just start – there’s nothing more to say!

Integration

Integrating Tesseract looks easy at first glance, and a large community of developers is constantly improving the tool. However, Tesseract still lacks many conveniences that make a developer’s life easier and that take a lot of time to implement yourself, such as the pre-processing of images.

Integration

The biggest advantage you can have in development is speed. With Anyline, you just download the SDK, integrate it into your app, and you’re ready to go. Regardless of platform (iOS, Android, Xamarin or Windows), from smartphones to smart glasses, Anyline offers preconfigured modules for various use cases that you can simply ‘plug and play’.

No Unified Documentation

The Tesseract documentation consists of an extensive API reference, but it has few direct contributions and is missing official examples and setup guides. It’s mainly an amalgam of links to third-party articles and snippets of code. Sifting through the documentation can be tedious and time-consuming.

Detailed Documentation

To help you understand every line of code we wrote and the reasoning behind it, we put together the Anyline Documentation, where you can find all the steps needed to integrate the Anyline® SDK. There are also comprehensive notes and comments within the SDK, including the Anyline Example App, which shows you what’s possible with Anyline and guides you through the integration process.

Your Own Image Pre-Processing Required

Tesseract can do some image processing internally using the Leptonica library, but the results can be inaccurate. For the best results, images should be at least 300 DPI, so lower-resolution images need to be rescaled first. Tesseract only gives its best results when it’s provided with, and we quote, ‘crystal clear black text on a pure white background’.
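
To make the ‘black text on a pure white background’ requirement a little more concrete, here is a minimal sketch of one common binarization technique, Otsu’s method, in plain Python. It is only an illustration of the kind of pre-processing you might do yourself, not what Tesseract or Anyline do internally. The function names `otsu_threshold` and `binarize` and the sample pixel values are made up for this example; a real app would read pixel data through an imaging library.

```python
def otsu_threshold(pixels):
    """Return the 0..255 threshold that best separates dark from light pixels.

    `pixels` is a flat list of 8-bit grayscale values (an assumption of this sketch).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels at or below the candidate threshold
    sum_dark = 0  # sum of their intensities
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break     # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance: larger means a cleaner split.
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [0 if p <= threshold else 255 for p in pixels]


# Hypothetical sample: three dark "text" pixels, three light "paper" pixels.
sample = [30, 40, 35, 200, 210, 220]
t = otsu_threshold(sample)
print(binarize(sample, t))  # [0, 0, 0, 255, 255, 255]
```

A single global threshold like this tends to work on evenly lit scans; photos taken with a phone camera often have shadows and glare, which is part of why pre-processing for mobile OCR takes so much effort.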

No Image Pre-Processing Necessary

One of the biggest pains of OCR is image pre-processing. Smartphones cannot see and interpret things as humans do, so all the images fed to them have to be pre-processed. This involves getting rid of unnecessary information so that images are as clean and simple as possible. With Tesseract you have to do this yourself; with Anyline, we already do it for you.
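
As a rough illustration of what ‘getting rid of unnecessary information’ can mean, the sketch below drops the colour information and crops the image to the region that contains the text. This is a simplified example under our own assumptions, not Anyline’s actual pipeline: the helper names `to_grayscale` and `crop` and the tiny sample image are hypothetical, and the luminance weights are the commonly used ITU-R BT.601 values.

```python
def to_grayscale(rgb_pixels):
    """Convert (r, g, b) tuples to 0..255 luminance values (BT.601 weights)."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in rgb_pixels]


def crop(rows, top, left, height, width):
    """Keep only a rectangular region of an image stored as a list of rows."""
    return [row[left:left + width] for row in rows[top:top + height]]


# Hypothetical 3x3 image; each row is a list of (r, g, b) tuples.
image = [
    [(255, 255, 255), (255, 255, 255), (255, 255, 255)],
    [(255, 255, 255), (0, 0, 0), (255, 0, 0)],
    [(255, 255, 255), (0, 0, 0), (0, 0, 0)],
]

# Crop to the lower-right 2x2 region, then drop the colour channels.
region = crop(image, top=1, left=1, height=2, width=2)
gray_region = [to_grayscale(row) for row in region]
print(gray_region)
```

Steps like these usually come before any thresholding, since both reduce the amount of data the OCR engine has to look at.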

Not Mobile Friendly

When Tesseract was developed by Hewlett Packard in the 80s, mobile phones were still rare and the size of a brick. For this reason, little development effort went into making it an OCR engine for smartphones. This doesn’t mean you can’t use Tesseract on mobile; it’s just much more complicated. Also, Tesseract itself is only available for Linux, Windows and Mac OS, and it’s less rigorously tested on the last of these.

Explicitly for Mobile

The more you focus on something, the better you get at it. Since technology is heading strongly towards an almost all-mobile world, we focused on creating the best mobile OCR technology that we can.

Difficult Customization

In version 3.02, Tesseract had 648 parameters that could be used to tune the output. In most cases, these options were poorly documented, and their names did not explain the effect they would have on the OCR result. Fine-tuning Tesseract is therefore a long process of trial and error, where configuring one parameter fixes one problem while causing another.
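
To give an idea of what this tuning looks like in practice, here is a small sketch that assembles a Tesseract command line with a page segmentation mode and one parameter override. `-l`, `--psm` and `-c name=value` are options of the Tesseract command-line tool in recent versions (the 3.x releases spelled the second one `-psm`), and `tessedit_char_whitelist` is one of the tuning parameters. The function name `build_tesseract_command` and the file name `meter.png` are hypothetical, and the sketch only builds the command; it does not run Tesseract.

```python
import shlex


def build_tesseract_command(image_path, lang="eng", psm=None, variables=None):
    """Build (but do not run) a Tesseract CLI invocation as an argument list."""
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if psm is not None:
        # Page segmentation mode, e.g. 7 = treat the image as a single text line.
        cmd += ["--psm", str(psm)]
    for name, value in (variables or {}).items():
        # Each tuning parameter is passed as its own "-c name=value" pair.
        cmd += ["-c", f"{name}={value}"]
    return cmd


# Hypothetical use case: read only digits from a photo of a meter.
cmd = build_tesseract_command(
    "meter.png",
    psm=7,
    variables={"tessedit_char_whitelist": "0123456789"},
)
print(shlex.join(cmd))
```

Every extra entry in `variables` is another knob to test against your images, which is where the trial and error comes from.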

Easily Customizable for Various Use Cases

Many of the use cases we cover in the pre-configured modules came from users who integrated the Anyline® SDK and configured it for themselves. Now we can offer these configurations to the next batch of great innovators.

Do you actually want to train Tesseract on a different font?

…We have the perfect solution for you!