Deepdocclassifier: Document classification with deep Convolutional Neural Network

This paper presents a deep Convolutional Neural Network (CNN) based approach for document image classification. One of the main requirements of deep CNN architectures is that they need a huge number of samples for training. To overcome this problem, we adopt a deep CNN pretrained on ImageNet, a large image dataset containing millions of samples. The proposed approach outperforms both traditional structural-similarity methods and earlier CNN-based approaches. With merely 20 images per class, it already outperforms the state of the art, achieving a classification accuracy of 68.25%. The best results on the Tobacco-3482 dataset show that the proposed method outperforms the state-of-the-art method by a significant margin, achieving a median accuracy of 77.6% with 100 samples per class used for training and validation.
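The evaluation protocol implied by the abstract (a fixed number of images per class for training and validation, with a median accuracy reported) can be sketched as follows. This is only a minimal illustration under assumptions not stated in the abstract: the remaining images are treated as the test set, the number of random partitions defaults to five, and the names `split_per_class`, `median_accuracy`, and `evaluate` are hypothetical. The `evaluate` callback stands in for fine-tuning and testing the pretrained CNN, which is not shown here.

```python
import random
from collections import defaultdict
from statistics import median


def split_per_class(labels, per_class, seed):
    """Pick `per_class` items of every class for training/validation.

    `labels[i]` is the class of item i. All remaining items form the
    test set (an assumption; the abstract does not describe the test set).
    Returns (train_val_indices, test_indices), both sorted.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_val, test = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        if len(indices) < per_class:
            raise ValueError(f"class {label!r} has fewer than {per_class} samples")
        rng.shuffle(indices)
        train_val.extend(indices[:per_class])
        test.extend(indices[per_class:])
    return sorted(train_val), sorted(test)


def median_accuracy(labels, per_class, evaluate, num_partitions=5):
    """Call `evaluate(train_val, test)` on several random partitions.

    `evaluate` is a hypothetical callback that trains a classifier on the
    first index list and returns its accuracy on the second.
    Returns the median of the collected accuracies.
    """
    scores = []
    for seed in range(num_partitions):
        train_val, test = split_per_class(labels, per_class, seed)
        scores.append(evaluate(train_val, test))
    return median(scores)
```

Calling `median_accuracy(labels, 20, evaluate)` would correspond to the 20-images-per-class setting, and `per_class=100` to the setting behind the best reported result. Reporting the median rather than a single run makes the figure less sensitive to one lucky or unlucky partition.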
