Devanagari text extraction from natural scene images

Text in natural scene images provides vital clues for many image-processing applications, including assisted navigation, content-based image retrieval, automatic geocoding, and scene understanding. Locating text against a multicolored, complex background is difficult, however, because of non-uniform illumination, the complexity of the backdrop, and variation in the size, font, and line orientation of the text. In this paper we propose a novel approach for extracting Devanagari text from natural scene images captured by a camera. The extracted text can then be recognized by an optical character reader or passed on to a text-to-speech engine. Our scheme is based on connected component (CC) analysis. The headline is a distinctive feature of the Devanagari script, and our scheme uses mathematical morphological operations to extract it. We also studied the binarization of scene images and observed that an adaptive thresholding approach was effective. The algorithm was tested on Devanagari text contained in a collection of 100 scene images.
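To make the two image-processing steps named in the abstract more concrete, the following is a minimal sketch of adaptive thresholding followed by a horizontal morphological opening that isolates headline-like strokes. It is not the authors' implementation: the abstract does not specify the thresholding rule, the structuring element, or any parameter values. The local-mean rule, the assumption of dark text on a lighter background, the function names, and the values of `window`, `offset`, and `length` are all illustrative assumptions.

```python
def adaptive_threshold(gray, window=7, offset=10):
    """Mark a pixel as foreground (1) if it is darker than its local mean.

    Assumption: dark text on a lighter background. `gray` is a list of rows
    of integer intensities. `window` and `offset` are illustrative values.
    """
    h, w = len(gray), len(gray[0])
    # Integral image: integral[y][x] holds the sum of gray[0:y][0:x].
    integral = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = window // 2
    binary = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            mean = total / ((y1 - y0) * (x1 - x0))
            if gray[y][x] < mean - offset:
                binary[y][x] = 1
    return binary


def horizontal_opening(binary, length):
    """Morphological opening with a 1 x `length` horizontal line element.

    For a line element this is equivalent to keeping only the horizontal
    runs of foreground pixels that are at least `length` pixels long.
    """
    opened = []
    for row in binary:
        out = [0] * len(row)
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                if x - start >= length:
                    for i in range(start, x):
                        out[i] = 1
            else:
                x += 1
        opened.append(out)
    return opened


def headline_rows(opened):
    """Return the indices of rows that still contain foreground pixels."""
    return [y for y, row in enumerate(opened) if any(row)]


def make_demo_image():
    """Hypothetical 12 x 20 test image: one dark bar with three strokes below."""
    img = [[200] * 20 for _ in range(12)]
    for x in range(2, 18):          # headline-like bar on row 3
        img[3][x] = 30
    for x in (4, 10, 15):           # vertical strokes hanging from the bar
        for y in range(4, 9):
            img[y][x] = 30
    return img
```

On the synthetic image from `make_demo_image()`, `headline_rows(horizontal_opening(adaptive_threshold(make_demo_image()), 8))` returns `[3]`: the long bar survives the opening while the one-pixel-wide vertical strokes are removed. A real scene image would additionally need the connected component analysis described above to group strokes into candidate text regions.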
