Image Classification via fusing the latent deep CNN feature

In recent years, convolutional neural networks (CNNs) have achieved strong results in image classification. A CNN learns to extract image features and to classify images automatically from large amounts of image data. Compared with traditional feature extraction techniques (e.g., SIFT, HOG, GIST), a CNN performs better and does not need hand-designed image features. However, how to further improve its performance remains an active research topic. In this paper, we therefore propose a method that fuses the latent features extracted from the middle layers of a CNN to train a more robust classifier. First, we use pretrained CNN models in Caffe to extract visual features from three layers. Then, we train an SVM classifier on the features of each layer, which gives three trained classifiers. Finally, we combine the three trained classifiers into a single classifier and compare it with the three per-layer SVM classifiers. The experiment is performed on the Caltech-256 dataset. The experimental results show that the combined classifier performs well compared with the conventional CNN.
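The final combination step can be illustrated with a minimal sketch. The abstract does not state how the three per-layer classifiers are combined, so the rule used here (a weighted average of per-class decision scores, followed by an argmax) is an assumption and not necessarily the authors' method. The function names `fuse_scores` and `predict`, and the example scores, are hypothetical.

```python
from typing import List, Optional, Sequence


def fuse_scores(
    per_layer_scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of per-class scores from several classifiers.

    per_layer_scores[k][c] is the score that classifier k (trained on the
    features of one CNN layer) assigns to class c for a single image.
    Assumption: the combination is a simple score average; the source
    does not specify the actual rule.
    """
    if not per_layer_scores:
        raise ValueError("need the scores of at least one classifier")
    n_classes = len(per_layer_scores[0])
    if any(len(scores) != n_classes for scores in per_layer_scores):
        raise ValueError("all classifiers must score the same classes")
    if weights is None:
        weights = [1.0] * len(per_layer_scores)
    if len(weights) != len(per_layer_scores):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = [0.0] * n_classes
    for weight, scores in zip(weights, per_layer_scores):
        for class_index, score in enumerate(scores):
            fused[class_index] += (weight / total) * score
    return fused


def predict(
    per_layer_scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(per_layer_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical SVM decision scores for one image over three classes,
# one list per CNN layer whose features were used for training.
layer_a = [0.2, 0.9, -0.4]
layer_b = [0.8, 0.1, -0.2]
layer_c = [0.7, 0.3, -0.5]

fused_label = predict([layer_a, layer_b, layer_c])
single_label = predict([layer_a])
```

In this made-up example the classifier for the first layer alone picks class 1, while the fused scores favor class 0, which shows how the combined decision can differ from any single per-layer classifier.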
