A Caption Text Detection Method from Images/Videos for Efficient Indexing and Retrieval of Multimedia Data

Textual information embedded in multimedia can provide a vital tool for indexing and retrieval. Text extraction process has many inherent problems due to the variation in font sizes, color, backgrounds and resolution. Text detection and localization are the most challenging phases of text extraction process whereas text extraction results are highly dependent upon these phases. This paper focuses on the text localization because of its very fundamental importance. Two effective feature vectors are introduced for the classification of the text and nontext objects. First feature vector is represented by the Radon transform of text candidate objects. Second feature vector is derived from the detailed geometrical analysis of text contents. Union of two feature vectors is used for the classification of text and nontext objects using support vector machine (SVM). Text detection and localization results are evaluated on two publicly available datasets namely ICDAR 2013 and IPC-Artificial text. Moreover, results are compared with state-of-the-art techniques and the Comparison demonstrates the superiority of the presented research.

[1]  Jin Hyung Kim,et al.  Texture-Based Approach for Text Detection in Images Using Support Vector Machines and Continuously Adaptive Mean Shift Algorithm , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  M. C. Ortiz,et al.  Selecting variables for k-means cluster analysis by using a genetic algorithm that optimises the silhouettes , 2004 .

[3]  Rainer Lienhart,et al.  VIDEO OCR: A SURVEY AND PRACTITIONER'S GUIDE , 2003 .

[4]  Jorge Stolfi,et al.  T-HOG: An effective gradient-based descriptor for single line text regions , 2013, Pattern Recognit..

[5]  Ming Zhao,et al.  Text detection in images using sparse representation with discriminative dictionaries , 2010, Image Vis. Comput..

[6]  Jean-Michel Jolion,et al.  Object count/area graphs for the evaluation of object detection and segmentation algorithms , 2006, International Journal of Document Analysis and Recognition (IJDAR).

[7]  Chew Lim Tan,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence, Manuscript Id a Laplacian Approach to Multi-oriented Text Detection in Video , 2022 .

[8]  Anil K. Jain,et al.  Text information extraction in images and video: a survey , 2004, Pattern Recognit..

[9]  Yunde Jia,et al.  Gaussian mixture modeling and learning of neighboring characters for multilingual text extraction in images , 2008, Pattern Recognit..

[10]  Wen Gao,et al.  Fast and robust text detection in images and video frames , 2005, Image Vis. Comput..

[11]  S. Dudoit,et al.  A prediction-based resampling method for estimating the number of clusters in a dataset , 2002, Genome Biology.

[12]  Xian-Sheng Hua,et al.  An automatic performance evaluation protocol for video text detection algorithms , 2004, IEEE Transactions on Circuits and Systems for Video Technology.

[13]  Ioannis Pratikakis,et al.  A two-stage scheme for text detection in video images , 2010, Image Vis. Comput..

[14]  Marti A. Hearst Trends & Controversies: Support Vector Machines , 1998, IEEE Intell. Syst..

[15]  Chang Hong Lin,et al.  A robust video text detection approach using SVM , 2012, Expert Syst. Appl..

[16]  Catherine A. Sugar,et al.  Finding the Number of Clusters in a Dataset , 2003 .

[17]  Jean-Marc Odobez,et al.  Text detection, recognition in images and video frames , 2004, Pattern Recognit..

[18]  Hua Li,et al.  Non-rigid Body Interpolation Based on Generalized Morphologic Morphing , 2003, Int. J. Image Graph..

[19]  David S. Doermann,et al.  Camera-based analysis of text and documents: a survey , 2005, International Journal of Document Analysis and Recognition (IJDAR).

[20]  Gregory Beylkin,et al.  Discrete radon transform , 1987, IEEE Trans. Acoust. Speech Signal Process..

[21]  T. Santhanam,et al.  A SURVEY ON VARIOUS APPROACHES OF TEXT EXTRACTION IN IMAGES , 2012 .

[22]  D.M. Mount,et al.  An Efficient k-Means Clustering Algorithm: Analysis and Implementation , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[23]  Rui Seara,et al.  Image segmentation by histogram thresholding using fuzzy sets , 2002, IEEE Trans. Image Process..

[24]  Chunheng Wang,et al.  Scene text detection using graph model built upon maximally stable extremal regions , 2013, Pattern Recognit. Lett..