Paper information - Large scale multi-class classification with truncated nuclear norm regularization


Abstract: In this paper, we consider the problem of multi-class image classification when the class behaviour has a low-rank structure, that is, when the classes can be embedded into a low-dimensional space. Traditional multi-class classification algorithms usually use the nuclear norm to approximate the rank of the weight matrix. Given the limited ability of the nuclear norm to approximate the rank accurately, we propose a new scalable multi-class classification algorithm that uses the recently proposed truncated nuclear norm as a better surrogate for the matrix rank, together with the multinomial logistic loss. To solve the resulting non-convex and non-smooth optimization problem, we further develop an efficient iterative procedure. In each iteration, the non-smooth convex subproblem is lifted into an infinite-dimensional ℓ1-norm regularized problem, and a simple and efficient accelerated coordinate descent algorithm is applied to find the optimal solution. We conduct a series of evaluations on several public large-scale image datasets; the experimental results show that the proposed algorithm improves classification accuracy over state-of-the-art multi-class classification algorithms.
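To make the regularizer in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual definition of the truncated nuclear norm from the matrix-completion literature: the sum of all singular values except the r largest. The function names, the truncation parameter `r`, the weight `lam`, and the layout of `W` (features by classes) are illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def nuclear_norm(W):
    """Sum of all singular values of W."""
    return float(np.linalg.svd(W, compute_uv=False).sum())


def truncated_nuclear_norm(W, r):
    """Sum of the singular values of W after dropping the r largest.

    Assumed definition; np.linalg.svd returns singular values in
    descending order, so s[r:] holds the smallest min(m, n) - r values.
    """
    s = np.linalg.svd(W, compute_uv=False)
    return float(s[r:].sum())


def multinomial_logistic_loss(W, X, y):
    """Mean multinomial logistic (softmax cross-entropy) loss.

    X: (n_samples, n_features), W: (n_features, n_classes),
    y: integer class labels of shape (n_samples,).
    """
    scores = X @ W
    # Subtract the row-wise maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(y)), y].mean())


def objective(W, X, y, r, lam):
    """Illustrative objective: loss plus lam times the truncated nuclear norm."""
    return multinomial_logistic_loss(W, X, y) + lam * truncated_nuclear_norm(W, r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A weight matrix of exact rank 2: 6 features, 4 classes.
    W = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
    X = rng.standard_normal((5, 6))
    y = rng.integers(0, 4, size=5)
    print("nuclear norm:          ", nuclear_norm(W))
    print("truncated norm (r = 2):", truncated_nuclear_norm(W, 2))
    print("objective:             ", objective(W, X, y, r=2, lam=0.1))
```

For a matrix of rank at most r the truncated norm is numerically zero, whereas the nuclear norm still grows with the magnitude of the leading singular values. This is one way to read the abstract's claim that the truncated variant is a closer surrogate for the rank.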
