Rectification and Super-Resolution Enhancements for Forensic Text Recognition

Retrieving text embedded in images is challenging in real-world settings. Problems such as low resolution and text orientation can hinder the extraction of information. These problems are common in environments such as the Tor Darknet and Child Sexual Abuse images, where text extraction is crucial to preventing illegal activities. In this work, we evaluate eight text recognizers and, to improve transcription performance, combine them with rectification networks and super-resolution algorithms. We test our approach on four state-of-the-art datasets and two custom datasets: TOICO-1K, based on text retrieved from the Tor Darknet, and Child Sexual Abuse (CSA)-text, based on text retrieved from Child Sexual Exploitation Material. On TOICO-1K, we obtained a correctly recognized words score of 0.3170 when combining Deep Convolutional Neural Networks (CNN) and rectification-based recognizers. On CSA-text, applying resolution enhancements achieved a final score of 0.6960. The highest performance increase was achieved on the ICDAR 2015 dataset, with an improvement of 4.83% when combining the MORAN recognizer and the Residual Dense resolution approach. We conclude that rectification outperforms super-resolution when the two are applied separately, while their combination achieves the best average improvements across the chosen datasets.
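The scores reported above are fractions of correctly recognized words. As an illustration only, the following minimal Python sketch shows one way such a score could be computed. It assumes that a word counts as correct when the predicted transcription exactly matches the ground truth, and that matching is case-insensitive by default; neither assumption is stated in the abstract, and the function name `correctly_recognized_words` and its `case_sensitive` parameter are hypothetical.

```python
from typing import Sequence


def correctly_recognized_words(
    predictions: Sequence[str],
    ground_truth: Sequence[str],
    case_sensitive: bool = False,  # assumption: the abstract does not specify case handling
) -> float:
    """Return the fraction of words whose transcription matches the ground truth exactly."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        return 0.0  # convention chosen for this sketch: an empty dataset scores 0

    def normalize(word: str) -> str:
        return word if case_sensitive else word.lower()

    # Count positions where prediction and ground truth agree after normalization.
    hits = sum(
        normalize(pred) == normalize(truth)
        for pred, truth in zip(predictions, ground_truth)
    )
    return hits / len(ground_truth)


# Toy data, not taken from any of the datasets in the paper.
truth = ["hello", "world", "Tor"]
baseline_output = ["hello", "w0rld", "tor"]   # recognizer alone (illustrative)
enhanced_output = ["hello", "world", "tor"]   # recognizer after enhancement (illustrative)

baseline_score = correctly_recognized_words(baseline_output, truth)
enhanced_score = correctly_recognized_words(enhanced_output, truth)
```

Comparing `baseline_score` with `enhanced_score` mirrors, in miniature, how a recognizer used alone can be compared against the same recognizer preceded by rectification or super-resolution.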
