Learning Discriminative Hierarchical Features for Object Recognition

Hierarchical feature learning methods have demonstrated substantial improvements over the conventional hand-designed local features. However, recent approaches mainly perform feature learning in an unsupervised manner, where subtle differences between different classes can hardly be captured. In this letter, we propose a discriminative hierarchical feature learning method, which learns a non-linear transformation to encode discriminative information in the feature space. We apply our features on two general image classification benchmarks: Caltech 101, STL-10, and a new fine-grained image classification dataset: NTU Tree-51. The results show that by employing discriminative constraint, our method consistently improves the performance with 3% to 7% in classification accuracy.

[1]  Shenghuo Zhu,et al.  Deep Learning of Invariant Features via Simulated Fixations in Video , 2012, NIPS.

[2]  Honglak Lee,et al.  An Analysis of Single-Layer Networks in Unsupervised Feature Learning , 2011, AISTATS.

[3]  David G. Lowe,et al.  Local Naive Bayes Nearest Neighbor for image classification , 2011, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[6]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[7]  Aapo Hyvärinen,et al.  Natural Image Statistics - A Probabilistic Approach to Early Computational Vision , 2009, Computational Imaging and Vision.

[8]  Quoc V. Le,et al.  ICA with Reconstruction Cost for Efficient Overcomplete Feature Learning , 2011, NIPS.

[9]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[10]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[11]  Alfred O. Hero,et al.  Efficient learning of sparse, distributed, convolutional feature representations for object recognition , 2011, 2011 International Conference on Computer Vision.

[12]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[13]  Pedro M. Domingos,et al.  Discriminative Learning of Sum-Product Networks , 2012, NIPS.

[14]  John D. Lafferty,et al.  Learning image representations from the pixel level via hierarchical sparse coding , 2011, CVPR 2011.

[15]  Quoc V. Le,et al.  Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis , 2011, CVPR 2011.

[16]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[17]  Honglak Lee,et al.  Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.

[18]  Eli Shechtman,et al.  In defense of Nearest-Neighbor based image classification , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.