Paper Information - Text Detection and Recognition in the Wild: A Review

Detection and recognition of text in natural images are two main problems in computer vision, with a wide variety of applications including the analysis of sports videos, autonomous driving, and industrial automation. Both face common challenges that stem from how text is represented and how it is affected by environmental conditions. Current state-of-the-art scene text detection and/or recognition methods have exploited advances in deep learning architectures and report superior accuracy on benchmark datasets when tackling multi-resolution and multi-oriented text. However, several challenges affecting text in in-the-wild images remain and cause existing methods to underperform, because their models do not generalize to unseen data and because labeled data are insufficient. Thus, unlike previous surveys in this field, this survey has the following objectives. First, it offers the reader not only a review of recent advances in scene text detection and recognition, but also the results of extensive experiments conducted with a unified evaluation framework that assesses pre-trained models of the selected methods on challenging cases and applies the same evaluation criteria to all of them. Second, it identifies several existing challenges for detecting or recognizing text in in-the-wild images, namely in-plane rotation, multi-oriented and multi-resolution text, perspective distortion, illumination reflection, partial occlusion, complex fonts, and special characters. Finally, the paper offers insight into potential research directions for addressing some of the challenges that scene text detection and recognition techniques still face.
