Text detection on camera acquired document images using supervised classification of connected components in wavelet domain

In this paper we present an algorithm to detect text in video frames of lecture slides. We begin by performing a multi-channel wavelet transform and then merge the channel components of the high-frequency sub-bands to obtain a composite energy map. Thresholding the energy map yields an edge map of candidate text pixels; some of these correspond to actual text and others to graphics, logos, tables, etc. The connected components in the edge map are then filtered with a trained classifier to reject some of the false positives. Rectangular text blocks that compactly surround the text regions are then identified through selective dilation and recursive splitting. Any false positive text blocks that remain are rejected using heuristics. Experiments conducted on 890 images show that our scheme has a lower false positive rate and a lower misdetection rate than two existing scene text detection methods.
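The first stages of the pipeline (wavelet transform, composite energy map, thresholding, connected components) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a one-level Haar wavelet, an energy defined as the sum of squared high-frequency coefficients over all channels, a fixed user-supplied threshold, and 8-connectivity. The function names are hypothetical, and the classifier, dilation, splitting, and heuristic stages are omitted.

```python
from collections import deque


def haar_highband_energy(channel):
    """One-level 2D Haar transform of one channel (list of rows).

    Returns, for each 2x2 block, the summed squared coefficients of the
    three high-frequency sub-bands (assumed energy definition).
    """
    h, w = len(channel), len(channel[0])
    energy = []
    for r in range(0, h - 1, 2):
        row = []
        for c in range(0, w - 1, 2):
            a, b = channel[r][c], channel[r][c + 1]
            d, e = channel[r + 1][c], channel[r + 1][c + 1]
            row_diff = (a + b - d - e) / 2.0  # change between the two rows
            col_diff = (a - b + d - e) / 2.0  # change between the two columns
            diag = (a - b - d + e) / 2.0      # diagonal detail
            row.append(row_diff ** 2 + col_diff ** 2 + diag ** 2)
        energy.append(row)
    return energy


def composite_energy(channels):
    """Merge per-channel high-frequency energy by summation (assumption)."""
    maps = [haar_highband_energy(ch) for ch in channels]
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(cols)]
            for r in range(rows)]


def candidate_components(channels, threshold):
    """Threshold the composite energy map and label 8-connected components.

    Returns a list of components, each a list of (row, col) positions in
    the half-resolution energy map.
    """
    energy = composite_energy(channels)
    rows, cols = len(energy), len(energy[0])
    mask = [[v > threshold for v in row] for row in energy]
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components
```

In a full system, each returned component would be described by features and passed to the trained classifier; the fixed threshold used here is only a stand-in for whatever thresholding rule the method actually applies.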
