Optical Character Recognition

Eliteun applies Optical Character Recognition (OCR) mainly in license plate recognition and in hybrid analog and digital meter reading systems. The team developed its own platform and multiple sub-algorithms to optimize the recognition accuracy and efficiency of Eliteun systems, enabling them to support different license plate formats and different meters with an accuracy of 99%.

OCR refers to both the technology and the process of reading text and characters and converting them into machine-encoded text, or into another form that a computer can process.

The entire recognition process includes a series of extensive algorithms, but it starts with a very important task: preprocessing. Preprocessing improves the image data by suppressing unwanted features and enhancing target features for later recognition. It consists of steps such as image rotation, grayscale conversion, noise reduction, binarization, character detection, segmentation and normalization. If the image was shot at a crooked angle, it is rotated first. The grayscale step then processes the background colors and the colors of the target information. Irrelevant parts such as speckles and lines are eliminated by filters and by image regularizers in deep learning; this step is called noise reduction and has a direct impact on feature extraction. Binarization then separates the target characters from the background, and segmentation splits them into individual characters. All single-character images are normalized in size and contrast, which makes it easier to apply unified algorithms for feature recognition.
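As a rough illustration of the later preprocessing steps, the sketch below runs grayscale conversion, binarization, segmentation and size normalization on a tiny synthetic image. It is a minimal pure-Python sketch, not Eliteun's implementation: every function name is hypothetical, the fixed threshold of 128 and the split-at-empty-columns segmentation are simplifying assumptions, and rotation and noise reduction are left out.

```python
# Illustrative preprocessing sketch. All names are hypothetical and the
# techniques are simplified stand-ins, not a production OCR pipeline.

def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]

def binarize(gray, threshold=128):
    """Mark dark pixels (assumed to be ink) as 1 and background as 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def segment_columns(binary):
    """Split at empty columns; return (start, end) column ranges, end exclusive."""
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                     # a character begins
        elif not ink and start is not None:
            spans.append((start, x))      # a character ends
            start = None
    if start is not None:
        spans.append((start, width))      # character touching the right edge
    return spans

def crop(binary, span):
    """Cut one character out of the binary image."""
    start, end = span
    return [row[start:end] for row in binary]

def normalize_size(glyph, size=8):
    """Nearest-neighbor resize of a glyph to size x size pixels."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

# Synthetic 4 x 7 image: black ink in columns 1-2 and column 5 on white.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
image = [[BLACK if x in (1, 2, 5) else WHITE for x in range(7)]
         for _ in range(4)]

binary = binarize(to_grayscale(image))
spans = segment_columns(binary)
glyphs = [normalize_size(crop(binary, s), size=4) for s in spans]
print(spans)  # [(1, 3), (5, 6)]
```

After normalization both glyphs have the same 4 x 4 shape even though one was two pixels wide and the other one pixel wide, which is what lets a single feature extractor handle every character.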

For a feature to be extracted, it must first be defined so that classifiers can learn it. Classifiers need to be trained over time to become accurate. Because both the features and the training data depend on the target, such as a license plate or a meter, OCR applications differ from one target to another.
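To make the relationship between a defined feature and a trained classifier concrete, here is a minimal sketch. It assumes a zoning feature (the fraction of ink in each cell of a 2 x 2 grid) and a nearest-centroid classifier. Both are chosen only for brevity, all names are hypothetical, and the toy glyphs stand in for real training data.

```python
# Illustrative feature definition and classifier training. The zoning
# feature and nearest-centroid rule are assumptions made for this sketch.

def zone_features(glyph, zones=2):
    """Fraction of ink pixels in each cell of a zones x zones grid."""
    step = len(glyph) // zones
    feats = []
    for zy in range(zones):
        for zx in range(zones):
            cell = [glyph[y][x]
                    for y in range(zy * step, (zy + 1) * step)
                    for x in range(zx * step, (zx + 1) * step)]
            feats.append(sum(cell) / len(cell))
    return feats

class NearestCentroid:
    """Stores one mean feature vector per label; predicts the closest one."""

    def __init__(self):
        self.centroids = {}

    def fit(self, samples):
        """samples: list of (glyph, label) pairs."""
        grouped = {}
        for glyph, label in samples:
            grouped.setdefault(label, []).append(zone_features(glyph))
        for label, vectors in grouped.items():
            self.centroids[label] = [sum(col) / len(col)
                                     for col in zip(*vectors)]
        return self

    def predict(self, glyph):
        feats = zone_features(glyph)

        def squared_distance(label):
            return sum((a - b) ** 2
                       for a, b in zip(feats, self.centroids[label]))

        return min(self.centroids, key=squared_distance)

# Toy 4 x 4 training glyphs: a vertical bar ("I") and a horizontal bar ("-").
vertical = [[1, 0, 0, 0] for _ in range(4)]
horizontal = [[1, 1, 1, 1]] + [[0, 0, 0, 0] for _ in range(3)]
clf = NearestCentroid().fit([(vertical, "I"), (horizontal, "-")])

# Slightly damaged versions of each character, unseen during training.
noisy_vertical = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
noisy_horizontal = [[1, 1, 1, 0]] + [[0, 0, 0, 0] for _ in range(3)]
print(clf.predict(noisy_vertical), clf.predict(noisy_horizontal))  # I -
```

A classifier trained on these two shapes knows nothing about digits on a meter dial, which is why each target needs its own feature definitions and training data.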