Automatic script identification in the wild

With the rapid growth of transnational communication and cooperation, people frequently encounter multilingual text in everyday situations. In this paper, we address a relatively new problem: script identification at the word or line level in natural scenes. We construct and release a large-scale dataset containing a large number of natural images and covering 10 widely used languages. To address the challenges of script identification in real-world scenarios, we propose a deep-learning-based algorithm. Experiments on the proposed dataset demonstrate that our algorithm outperforms conventional image classification and script identification methods, including the original CNN architecture, LLC, and GLCM.
