Paper information - A novel method for binarization of scene text images and its application in text identification

Abstract: The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images. For our present study, we use the publicly available ICDAR 2011 Born Digital Data set. We introduce a new concept, the variance map of a gray-level image, for detecting text boundaries in an image. Based on this boundary information, the image is binarized by means of adaptive thresholding. This binarization procedure produces a number of connected components. Next, these connected components are examined in order to identify possible text components. In this context, a number of shape-based features that distinguish between text and non-text components are proposed. We treat text component identification as a one-class classification problem, since ground truth information is available only for the text class in the ICDAR 2011 Born Digital Data set. The ground truth text components are then used to estimate a statistical distribution of the shape-based features. Here, we observe that the features may not all follow a single family of distributions. Therefore, we construct a joint distribution by using a multivariate Gaussian copula, which allows a coupling of different marginal distributions. As our experiments suggest, the copula-based method is superior to a multivariate Gaussian distribution in describing the feature distribution. Finally, a connected component of unknown class is subjected to the trained statistical model, and by performing a hypothesis test we identify whether it is a possible text component. For a comparative study, we consider a number of state-of-the-art methods. Our proposed approach significantly outperforms most of these methods in terms of recall, precision and F-measure in both the binarization and text identification tasks.
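The copula step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes only two shape features, replaces the fitted marginal distributions with a rank-based empirical CDF, and stands in for the hypothesis test with a simple cutoff on the copula log density. All function names and the `threshold` parameter are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probit transform


def normal_scores(values):
    """Map a sample to standard-normal scores via its rank-based empirical CDF."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) stays strictly inside (0, 1), so inv_cdf is defined
        scores[idx] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def fit_correlation(z1, z2):
    """Pearson correlation of the normal scores (the only copula parameter in 2D)."""
    n = len(z1)
    m1, m2 = sum(z1) / n, sum(z2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(z1, z2))
    var1 = sum((a - m1) ** 2 for a in z1)
    var2 = sum((b - m2) ** 2 for b in z2)
    return cov / math.sqrt(var1 * var2)


def copula_log_density(u1, u2, rho):
    """Log density of the bivariate Gaussian copula at (u1, u2) in (0, 1)^2."""
    z1 = STD_NORMAL.inv_cdf(u1)
    z2 = STD_NORMAL.inv_cdf(u2)
    one_minus = 1.0 - rho * rho
    quad = rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2
    return -0.5 * math.log(one_minus) - quad / (2.0 * one_minus)


def empirical_cdf(train, x):
    """Smoothed empirical CDF of the training sample, kept strictly inside (0, 1)."""
    below = sum(1 for v in train if v <= x)
    return (below + 0.5) / (len(train) + 1)


def is_text_like(train_f1, train_f2, x1, x2, rho, threshold):
    """Assumed decision rule: accept when the copula log density exceeds a cutoff."""
    u1 = empirical_cdf(train_f1, x1)
    u2 = empirical_cdf(train_f2, x2)
    return copula_log_density(u1, u2, rho) >= threshold
```

In use, one would compute `rho = fit_correlation(normal_scores(f1), normal_scores(f2))` from the two feature columns of the ground truth text components, then call `is_text_like` on each unknown component. A multivariate version would replace `rho` with a full correlation matrix, and a closer match to the abstract would use a fitted parametric marginal per feature instead of the empirical CDF.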
