A new method for multi-oriented graphics-scene-3D text classification in video

Text detection and recognition in video are challenging because video contains many types of text: graphics (video captions), scene (natural) text, 2D, 3D, static, and dynamic text. Developing a universal method that works well for all of these types is hard. In this paper, we propose a novel method for classifying graphics-scene and 2D-3D texts in video to improve text detection and recognition accuracy. We first propose an iterative method to classify static and dynamic clusters, based on the fact that static texts have zero velocity while dynamic texts have non-zero velocity. This yields text candidates for both static and dynamic texts, regardless of whether they are 2D or 3D. We then define a symmetry property for text candidates using stroke width distances and medial axis values, which yields potential text candidates. We group potential text candidates by their geometrical properties to form text regions. Next, for each text region, we study the distribution of the dominant medial axis values given by the ring radius transform in a new way to classify graphics and scene texts. Similarly, we study the proximity among the pixels that satisfy gradient direction symmetry to classify 2D and 3D texts. We evaluate each step of the proposed method in terms of classification and recognition rates, pairing the classification with existing methods, to show that video text classification is effective and necessary for enhancing the capability of current text detection and recognition systems.

We propose a novel method for classifying graphics-scene and 2D-3D texts in video.
An iterative procedure to identify text candidates is presented.
Stroke width and medial axis are explored for classifying graphics and scene texts.
Gradient directions and medial axis are combined for classifying 2D and 3D texts.
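The static/dynamic cue described above (zero versus non-zero velocity) can be illustrated with a minimal sketch. This is not the paper's iterative clustering procedure: it is a single-pass threshold on hypothetical centroid tracks, and the names `mean_speed`, `split_static_dynamic`, the tolerance `tol`, and the sample coordinates are all assumptions made for illustration.

```python
from math import hypot


def mean_speed(track):
    """Average displacement per frame (pixels/frame) along a centroid track."""
    if len(track) < 2:
        return 0.0
    steps = [
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
    return sum(steps) / len(steps)


def split_static_dynamic(tracks, tol=0.5):
    """Label a track 'static' if its mean speed is within tol of zero.

    tol is a hypothetical tolerance (pixels/frame) that absorbs tracking
    jitter; the source only states zero vs. non-zero velocity.
    """
    labels = {}
    for name, track in tracks.items():
        labels[name] = "static" if mean_speed(track) <= tol else "dynamic"
    return labels


# Hypothetical centroid positions (x, y) of two text candidates over 3 frames.
tracks = {
    "caption": [(100, 400), (100, 400), (100, 400)],  # does not move
    "scroll": [(300, 50), (296, 50), (292, 50)],      # moves 4 px per frame
}
labels = split_static_dynamic(tracks)
```

In practice the velocity would come from motion estimated between frames rather than from hand-written tracks, and a non-zero tolerance is a design choice to keep small tracking noise from turning static captions into dynamic ones.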
