Implementation of a Devanagari character recognition system through pattern recognition techniques

Optical Character Recognition (OCR) is the process of taking images or photographs of letters or typewritten text and converting them into a form that a machine can readily process. For example, organizations and libraries use OCR to digitize physical copies of books, magazines, and other old printed material. OCR consists of preprocessing, segmentation, feature extraction, and classification and recognition. This paper presents a technique for segmenting words from Devanagari script. The image is scanned with a flatbed scanner and binarized, converting it to a representation of 0s and 1s. A median filter is then applied to the binarized image to remove some distortions. To detect text lines in the image, a horizontal histogram is computed. Segmentation is the major step of OCR and is essential for handwritten script recognition: because segmentation errors affect recognition, accurate segmentation is important for implementing OCR. Segmenting handwritten words is an intricate task, as the shapes of handwritten characters are uncertain due to variability in writing styles. Segmentation and feature extraction are the main components of script recognition.
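The preprocessing and line-detection steps described above (binarization, median filtering, horizontal histogram) can be illustrated with a minimal sketch. It is not the implementation used in this paper. The fixed threshold of 128, the 3x3 filter window, the clamped border handling, and all function names are assumptions made for illustration only, and the image is represented as a plain list of pixel rows rather than a scanned file.

```python
from statistics import median


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to 1 for ink, 0 for background.

    The fixed threshold is an assumption; the paper does not state how it binarizes.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def median_filter_3x3(img):
    """Replace each pixel by the median of its 3x3 neighbourhood.

    Border pixels reuse the nearest valid row/column (clamping), which is
    an assumed choice, as is the 3x3 window size.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = int(median(window))
    return out


def horizontal_histogram(img):
    """Count ink pixels in every row of a binary image."""
    return [sum(row) for row in img]


def find_text_lines(hist, min_ink=1):
    """Return (start_row, end_row) pairs, end exclusive, for runs of rows with ink."""
    lines, start = [], None
    for y, count in enumerate(hist):
        if count >= min_ink and start is None:
            start = y                      # a text line begins
        elif count < min_ink and start is not None:
            lines.append((start, y))       # a text line ends at an empty row
            start = None
    if start is not None:                  # line runs to the bottom of the image
        lines.append((start, len(hist)))
    return lines
```

Calling `find_text_lines(horizontal_histogram(median_filter_3x3(binarize(gray))))` on a grayscale page yields the row ranges of the detected text lines. Rows whose histogram value falls below `min_ink` are treated as the gaps between lines, so isolated noise pixels that survive binarization but not the median filter no longer merge or split lines.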