Document recognition and extraction: an industrial application to the detection of semi-structured documents

ABSTRACT. This article addresses the recognition of images of semi-structured documents. The goal is to detect whether a given document is present in an image and to extract the region of interest that contains it. First, an example of the document to be found is supplied to the system as a query image, and a set of interest points is extracted from it. Then, for each image to be compared, the interest points are extracted and matched against those of the query image. This matching step makes it possible to compute the geometric transformation (translation, rotation, zoom) that precisely locates the query image within the images under analysis. Two main contributions make this technique usable for document image retrieval: a selection of interest points and an adaptation of RANSAC.
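
The localization step described above can be illustrated with a minimal sketch. It is not the adapted RANSAC proposed in the article, whose details are not given in this abstract, and it does not cover the interest point selection. It assumes that putative matches are already available as two arrays of 2D points encoded as complex numbers, and it runs a plain RANSAC loop to estimate a similarity transform (translation, rotation, zoom) of the form q = a*p + b. The function names, the 3-pixel tolerance, the iteration count, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def fit_similarity(p, q):
    """Least-squares similarity q ~ a*p + b, with 2D points stored as complex numbers."""
    pc, qc = p - p.mean(), q - q.mean()
    a = np.vdot(pc, qc) / np.vdot(pc, pc)  # vdot conjugates its first argument
    b = q.mean() - a * p.mean()
    return a, b


def ransac_similarity(p, q, tol=3.0, iters=500, seed=0):
    """Plain RANSAC (hypothetical helper): sample 2 matches, score by inlier count."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(p), dtype=bool)
    for _ in range(iters):
        i, j = rng.choice(len(p), size=2, replace=False)
        if p[i] == p[j]:
            continue  # degenerate sample: the two query points coincide
        a = (q[j] - q[i]) / (p[j] - p[i])  # rotation and zoom from one point pair
        b = q[i] - a * p[i]                # translation
        inliers = np.abs(a * p + b - q) < tol
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 2:
        raise ValueError("no consistent similarity transform found")
    a, b = fit_similarity(p[best], q[best])  # refit on all inliers of the best model
    return a, b, best


# Synthetic matches (assumed data): 24 correct matches and 16 wrong ones.
rng = np.random.default_rng(1)
n, n_in = 40, 24
p = rng.uniform(0, 1000, n) + 1j * rng.uniform(0, 1000, n)
a_true = 1.5 * np.exp(1j * np.deg2rad(30.0))  # zoom 1.5, rotation 30 degrees
b_true = 100 + 50j                            # translation (100, 50)
q = a_true * p + b_true
q[n_in:] = rng.uniform(5000, 6000, n - n_in) + 1j * rng.uniform(5000, 6000, n - n_in)

a, b, mask = ransac_similarity(p, q)
print("zoom:", abs(a))
print("rotation (deg):", np.degrees(np.angle(a)))
print("translation:", (b.real, b.imag))
print("inliers:", int(mask.sum()), "of", n)
```

In this sketch the zoom is the modulus of `a`, the rotation is its argument, and the translation is `b`. Mapping the four corners of the query image through `a*p + b` would give the region of interest in the analysed image.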
