An Automatic Reader of Identity Documents

Automatic reading and verification of identity documents is an appealing technology for today's service industry, since this task is still mostly performed manually, which wastes both time and money. This paper presents the prototype of a novel system for automatically reading identity documents. The system is designed to extract data from the main Italian identity documents, using photographs of acceptable quality such as those usually required of online subscribers to various services. The document is first localized within the photo and then classified; finally, text recognition is performed. A synthetic dataset was used both to train the neural networks and to evaluate the system's performance. Using synthetic data avoided the privacy issues associated with real photos of real documents, which will instead be used in future developments of the system.
