Detection in Images and Videos

The goal of a multimedia text extraction and recognition system is to fill the gap between the mature technology of Optical Character Recognition and the new needs for textual information retrieval created by the spread of digital multimedia. Such a system usually consists of four stages: spatial text detection; temporal text detection, or tracking (for videos); image binarization, or segmentation; and character recognition. In this PhD thesis we dealt with all stages of a multimedia text extraction system, focusing on the design and development of techniques for the spatial detection of text in images and videos, as well as on methods for evaluating the detection result. Two methods for evaluating the text detection result were proposed that address the problems found in the related literature. They use different criteria, and both are based on intuitively correct observations. Finally, an efficient method was developed for the temporal detection of text, which contributes to better spatial detection while also enhancing the quality of the text image.
