Image Text Extraction and Recognition using Hybrid Approach of Region Based and Connected Component Methods

Image mining concerns the extraction of implicit knowledge, image data relationships, and other patterns not explicitly stored in images. Text in images is a powerful source of high-level semantics. If text occurrences can be detected, segmented, and recognized automatically, they become a valuable source of high-level semantics for indexing and retrieval. The system proposed in this work, "Image Text Extraction and Recognition Using Hybrid Approach of Region Based and Connected Component Methods", detects, extracts, and recognizes text regions using efficient edge detectors, connected component methods, and optical character recognition. Text detection and extraction in images is important for content-based image analysis. The problem is challenging because of complex backgrounds, non-uniform illumination, and variations in text font, size, and line orientation. The proposed text extraction and recognition method uses morphological operations and is implemented in MATLAB. Existing text extraction methods (edge based and connected component based) can produce good results when applied separately, but each also produces many false positives. The proposed system therefore combines the two methods to take advantage of both. The results show that the proposed methodology performs better than either method applied alone.

Keywords: Text region detection, Clustering, Binarization, Segmentation, Recognition.
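The hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation. It is written in plain Python, and every function name, threshold, and filter rule below is an assumption chosen for illustration. An edge detector proposes candidate pixels, a morphological dilation merges them into regions, and connected component analysis discards regions whose shape or edge density does not look like a horizontal text line. The recognition (OCR) stage is omitted.

```python
from collections import deque

NEIGHBOURS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def edge_map(img, thresh=50):
    """Mark pixels whose intensity jump to the right or downward neighbour exceeds thresh."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = abs(img[y][min(x + 1, w - 1)] - img[y][x])
            gy = abs(img[min(y + 1, h - 1)][x] - img[y][x])
            if max(gx, gy) > thresh:
                edges[y][x] = 1
    return edges

def dilate(mask):
    """3x3 binary dilation: set every pixel in the neighbourhood of a set pixel."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = 1
    return out

def connected_components(mask):
    """Return a list of 8-connected components, each a list of (y, x) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, pixels = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    pixels.append((cy, cx))
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(pixels)
    return components

def text_candidates(img, thresh=50, min_area=6, min_aspect=1.5, min_density=0.2):
    """Edge-based proposals filtered by connected component geometry.

    The filter values are hypothetical; the aspect test assumes horizontal text.
    Returns bounding boxes as (x0, y0, x1, y1), inclusive.
    """
    edges = edge_map(img, thresh)
    boxes = []
    for pixels in connected_components(dilate(edges)):
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
        bw, bh = x1 - x0 + 1, y1 - y0 + 1
        area = bw * bh
        # Fraction of the bounding box covered by raw (undilated) edge pixels.
        density = sum(edges[y][x] for y, x in pixels) / area
        if area >= min_area and bw / bh >= min_aspect and density >= min_density:
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Calling `text_candidates(img)` on a grayscale image stored as a list of rows returns the bounding boxes that survive both stages. The geometric filter is where the connected component stage can reject false positives that the edge stage alone would keep, such as isolated specks with a roughly square bounding box.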
