OCR for Indian Languages

Optical Character Recognition (OCR)

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used as a form of data entry from some sort of original paper data source, whether documents, sales receipts, mail, or any number of printed records. It is a common method of digitizing printed texts so that they can be electronically searched, stored more compactly, displayed on-line, and used in machine processes such as machine translation, text-to-speech and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.
Handwriting recognition (HWR) is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices.
The ultimate objective of any OCR system is to simulate human reading capabilities.
The methods used and the recognition rates achieved depend on how constrained the handwriting is.
These constraints are mainly characterized by:

types of handwriting
number of scripters
size of the vocabulary
spatial layout.

Categories of Recognition Techniques

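Online recognition approach: the handwriting is captured while it is being written, for example as pen movements on a touch-screen.

Offline recognition approach: the handwriting is recognized from a static image, such as a scanned paper document or a photograph.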

Offline Handwriting Recognition

Offline handwriting recognition is a difficult problem, and there are almost as many approaches as there are researchers.
Different methods for optical character recognition:

Pattern Recognition
Neural networks / machine learning
Mathematical modelling
Fuzzy logic method
Sub-graph matching / graph search
Fractal image compression
Correlation Method

The OCR system is based on three main stages.

Correlation Method For OCR
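To give a feel for the idea, here is a minimal sketch of correlation-based matching: an unknown character image is compared against stored templates, and the template with the highest correlation wins. The 5x5 templates, the labels and the noisy test glyph are all made up for illustration, and corr2 requires the Image Processing Toolbox. This is only one simple way to read the correlation method, not a complete recognizer.

```matlab
% Hypothetical 5x5 binary templates for two simple glyphs (illustration only).
T1 = [0 0 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 0 1 0 0; 0 0 1 0 0];  % vertical bar
T2 = [0 0 0 0 0; 0 0 0 0 0; 1 1 1 1 1; 0 0 0 0 0; 0 0 0 0 0];  % horizontal bar
templates = {T1, T2};
labels = {'vertical', 'horizontal'};

% Unknown glyph: the vertical bar with one extra noisy pixel.
G = T1;
G(1, 1) = 1;

% Correlate the glyph with every template (sizes must match for corr2).
scores = zeros(1, numel(templates));
for k = 1:numel(templates)
    scores(k) = corr2(G, templates{k});
end

% The best match is the template with the highest correlation coefficient.
[~, best] = max(scores);
disp(labels{best});
```

In a real system each segmented character would first be resized to the template size, which is one reason the preprocessing described next matters.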

Preprocessing.

This step is common to most methods. It is a very important step for any OCR system, because this is where noise and other unwanted marks are removed through image processing. With an appropriate preprocessing method we can get the image in its cleanest form.
Let's go through the steps of a very basic preprocessing technique.

First we convert the color image into a grayscale image. A color image stores three values per pixel, one for each RGB channel (red, green, blue), while a grayscale image stores only one. A grayscale image is therefore easier to process than a color image, so we convert to grayscale first.
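To make the conversion concrete: a grayscale value is usually computed as a weighted sum of the three channels. The sketch below does this by hand in MATLAB on a tiny made-up image. The weights are the ones MATLAB documents for its rgb2gray function; the two-pixel image is purely illustrative.

```matlab
% A tiny 1x2 RGB image: one pure-red pixel and one pure-white pixel.
rgb = zeros(1, 2, 3);
rgb(1, 1, :) = [1 0 0];
rgb(1, 2, :) = [1 1 1];

% Luminance weights for the red, green and blue channels.
w = [0.2989 0.5870 0.1140];

% Weighted sum of the three channels gives one gray value per pixel.
gray = w(1) * rgb(:, :, 1) + w(2) * rgb(:, :, 2) + w(3) * rgb(:, :, 3);
disp(gray);
```

The red pixel comes out dark (about 0.30) and the white pixel close to 1, which shows that green contributes most to perceived brightness and blue the least.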

There are many tools for image processing, but here we use one of the most common, MATLAB. MATLAB has a built-in function, rgb2gray, that converts a color image into a grayscale image. Let's look at an example.

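% Read the sample RGB image (board.tif ships with the Image Processing Toolbox).
I = imread('board.tif');
% Convert the RGB image to a grayscale image.
J = rgb2gray(I);
% Show the original and the grayscale image in separate figures.
figure, imshow(I), figure, imshow(J);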
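Once the image is in grayscale, the noise removal mentioned earlier can be sketched as follows. This is only one possible choice of steps, assumed here for illustration: a 3x3 median filter to suppress isolated noisy pixels, followed by Otsu thresholding to separate ink from background. All functions used are from the Image Processing Toolbox.

```matlab
% Read the sample image and convert it to grayscale, as above.
I = imread('board.tif');
J = rgb2gray(I);

% Suppress isolated noisy pixels with a 3x3 median filter (assumed filter size).
K = medfilt2(J, [3 3]);

% Pick a global threshold with Otsu's method (value in the range 0 to 1).
level = graythresh(K);

% Binarize the image; on releases before R2016a use im2bw(K, level) instead.
BW = imbinarize(K, level);

% Show the filtered and the binarized image.
figure, imshow(K), figure, imshow(BW);
```

The filter size and the use of a single global threshold are choices you would tune for your own scans; unevenly lit pages often need a different thresholding strategy.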