Self-paced dictionary learning for image classification

Image classification is an important research task in multimedia content analysis and processing. Learning a compact dictionary from which sparse representations can be readily derived is a central issue in state-of-the-art image classification frameworks. Most existing dictionary learning approaches assign equal importance to all training samples, even though the samples differ in how difficult they are to represent sparsely. Meanwhile, the contextual information "hidden" in different samples is ignored as well. In this paper, we propose a self-paced dictionary learning algorithm that accommodates this hidden information in the learning procedure: it trains the dictionary on the easy samples first and then iteratively introduces more complex samples until all training samples are treated as easy. The algorithm adaptively chooses the easy samples in each iteration, and the dictionary learned in the previous iteration serves in turn as the basis for the current one. This strategy implicitly takes advantage of the contextual relationships among training samples. The number of samples chosen in each iteration is determined by an adaptive threshold function proposed in this paper. Experimental results on benchmark datasets, including Caltech-101 and 15-Scene, show that our algorithm yields better dictionary representations and classification performance than the baseline methods.
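The easy-first loop described above can be sketched in plain NumPy. This is a minimal illustration, not the paper's algorithm: the paper's adaptive threshold function is not specified here, so a simple geometrically growing threshold `tau` stands in for it, the lasso subproblem is solved with generic ISTA iterations, and the dictionary update is an unconstrained least-squares fit with unit-norm columns. All function names and parameters (`sparse_code`, `update_dictionary`, `tau`, `growth`) are hypothetical.

```python
import numpy as np

def sparse_code(D, X, lam=0.1, n_iter=50):
    """Solve min_A ||X - D A||^2 + lam ||A||_1 by ISTA (generic solver, stand-in)."""
    A = np.zeros((D.shape[1], X.shape[1]))
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    for _ in range(n_iter):
        G = D.T @ (D @ A - X)                              # gradient step
        A = A - G / L
        A = np.sign(A) * np.maximum(np.abs(A) - lam / L, 0)  # soft threshold
    return A

def update_dictionary(X, A):
    """Least-squares dictionary update, columns renormalized to unit norm."""
    D = X @ np.linalg.pinv(A)
    return D / np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)

def self_paced_dictionary_learning(X, n_atoms=16, lam=0.1,
                                   tau=1.0, growth=1.5, n_rounds=5):
    """Easy-first sketch: fit on samples whose reconstruction loss <= tau,
    warm-start each round from the previous dictionary, then relax tau."""
    rng = np.random.default_rng(0)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_rounds):
        A = sparse_code(D, X, lam)
        losses = np.sum((X - D @ A) ** 2, axis=0)  # per-sample reconstruction error
        easy = losses <= tau                       # current "easy" subset
        if easy.any():
            A_easy = sparse_code(D, X[:, easy], lam)
            D = update_dictionary(X[:, easy], A_easy)  # previous D is the warm start
        tau *= growth  # stand-in for the paper's adaptive threshold function
    return D
```

The key structural point the sketch captures is that each round's dictionary is refit only on the samples currently deemed easy, so hard samples influence the dictionary only after the easier structure has been learned.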
