A New Text Extraction Method Incorporating Local Information

Detecting and extracting text in images with complex backgrounds can provide useful information for video annotation and indexing. Text detection has received more attention because of its importance, but text extraction is a prerequisite for text recognition and can also be used to validate detection results. In this paper, we frame text extraction as two tasks, segmenting the image and removing noise, and propose a robust text extraction method that incorporates local information. First, we convert the original image to a gray image and preprocess it with edge enhancement. Then a binarization method incorporating local information segments the gray image, which removes text noise and yields a binary image. Finally, connected component analysis based on character density and geometric features is performed on the binary image to remove background noise. Preliminary experiments show promising results.
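The three stages described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the binarization rule, so a plain local-mean threshold stands in for the "binarization incorporating local information", edge enhancement is omitted, and all function names and parameter values (`radius`, `offset`, `min_area`, `min_density`, `max_aspect`) are hypothetical choices made for illustration.

```python
from collections import deque

def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to luminance (ITU-R BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]

def binarize_local(gray, radius=1, offset=0.0):
    """Stand-in binarization: a pixel is foreground (1) if it is darker than
    the mean of its (2*radius+1)^2 neighbourhood minus `offset`."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if gray[y][x] < sum(window) / len(window) - offset:
                out[y][x] = 1
    return out

def filter_components(binary, min_area=2, min_density=0.3, max_aspect=5.0):
    """Keep 4-connected components that pass assumed density and geometry tests."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    out = [[0] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search collects one connected component.
            seen[y0][x0] = True
            queue, pixels = deque([(y0, x0)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            box_h = max(ys) - min(ys) + 1
            box_w = max(xs) - min(xs) + 1
            density = len(pixels) / (box_h * box_w)      # fill ratio of the bounding box
            aspect = max(box_h, box_w) / min(box_h, box_w)
            if len(pixels) >= min_area and density >= min_density and aspect <= max_aspect:
                for y, x in pixels:
                    out[y][x] = 1
    return out
```

A caller would chain the stages as `filter_components(binarize_local(to_gray(image)))`. Note that a pure local-mean threshold leaves the interior of strokes wider than the window unmarked, which is one reason a practical method would need a richer use of local information than this sketch.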
