Discriminative dictionary learning via shared latent structure for object recognition and activity recognition

We propose Latent Structure based Discriminative Dictionary Learning (LS-DDL), a low-dimensional discriminative dictionary learning approach for multi-class classification tasks. LS-DDL first projects features and class labels onto a shared latent structure space, and then uses the resulting discriminative, low-dimensional representation as input to a discriminative dictionary learning framework. The dictionary it learns is more discriminative and lower-dimensional than those learned by existing dictionary learning methods, so high recognition accuracy is obtained with a small number of low-dimensional dictionary atoms. The low dimensionality also reduces storage cost and testing time. In addition, the latent structure projection eliminates the classifier weighting parameter required by existing discriminative dictionary learning approaches. We validate the effectiveness and efficiency of the proposed approach through a series of experiments on image-based face recognition and video-based activity recognition. The results show that, with a small number of dictionary atoms, the proposed approach achieves much higher recognition accuracy and requires much less computation time than state-of-the-art methods.
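The two-stage idea described above (project features and labels onto a shared latent space, then learn a dictionary on the projected data) can be illustrated with a small sketch. This is not the LS-DDL algorithm itself: the abstract does not specify the projection or the dictionary update, so the sketch assumes a PLS-SVD-style projection (leading singular vectors of the feature/label cross-covariance) and a toy dictionary learner with one atom per sample. The function names `latent_projection` and `learn_dictionary`, the synthetic data, and all sizes are hypothetical.

```python
import numpy as np

def latent_projection(X, Y, k):
    """Assumed PLS-SVD-style step: top-k left singular vectors of the
    centered cross-covariance between features X and one-hot labels Y."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - Y.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, :k], x_mean

def learn_dictionary(Z, n_atoms, n_iter=10, seed=0):
    """Toy dictionary learning on projected data Z (n_samples, k).
    Each sample is coded with a single atom (1-sparse); each atom is then
    refit as the leading singular direction of its assigned samples."""
    rng = np.random.default_rng(seed)
    # Initialize atoms from randomly chosen samples, normalized to unit length.
    D = Z[rng.choice(len(Z), size=n_atoms, replace=False)].T
    D = D / (np.linalg.norm(D, axis=0, keepdims=True) + 1e-12)
    for _ in range(n_iter):
        # Sparse coding: pick the atom with the largest absolute correlation.
        idx = np.abs(Z @ D).argmax(axis=1)
        # Dictionary update: refit every atom that has assigned samples.
        for j in range(n_atoms):
            members = Z[idx == j]
            if len(members) == 0:
                continue  # keep an unused atom unchanged
            _, _, Vt = np.linalg.svd(members, full_matrices=False)
            D[:, j] = Vt[0]
    return D

# Synthetic three-class data (hypothetical sizes, for illustration only).
rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 3, 30, 50
labels = np.repeat(np.arange(n_classes), n_per_class)
class_means = rng.normal(size=(n_classes, dim))
X = class_means[labels] + 0.5 * rng.normal(size=(labels.size, dim))
Y = np.eye(n_classes)[labels]  # one-hot class labels

# Stage 1: project the 50-dimensional features to a 2-dimensional latent space.
W, x_mean = latent_projection(X, Y, k=n_classes - 1)
Z = (X - x_mean) @ W

# Stage 2: learn a small dictionary in the latent space.
D = learn_dictionary(Z, n_atoms=6)
print(W.shape, Z.shape, D.shape)
```

In this sketch the dictionary atoms live in the 2-dimensional latent space rather than the original 50-dimensional feature space, which is the source of the storage and testing savings the abstract refers to. A test sample would be projected with the same `W` and `x_mean` before being coded against `D`.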
