Global Optimization Methods for Extended Fisher Discriminant Analysis

Fisher discriminant analysis (FDA) is a common technique for binary classification. A parametrized extension, which we call the extended FDA, has been introduced from the viewpoint of robust optimization. In this work, we first give a new probabilistic interpretation of the extended FDA. We then develop algorithms for solving an optimization problem that arises from the extended FDA: computing the distance between a point and the surface of an ellipsoid. We solve this problem via the KKT points, which we show are obtained by solving a generalized eigenvalue problem. We speed up the algorithm by taking advantage of the matrix structure and proving that a globally optimal solution is a KKT point with the smallest Lagrange multiplier, which can be computed efficiently as the leftmost eigenvalue. Numerical experiments illustrate the efficiency and effectiveness of the extended FDA model combined with our algorithm.
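To illustrate the subproblem the abstract refers to, the following is a minimal sketch of computing the closest point to a query point on an ellipsoid surface {y : yᵀAy = 1} with A symmetric positive definite. The KKT stationarity condition gives y = (I + λA)⁻¹x, and λ is found from the resulting scalar secular equation. Note this is the classical secular-equation approach (bisection in the interval containing the nearest-point multiplier), not the generalized-eigenvalue algorithm developed in the paper; the sign convention for λ and the bracketing interval are assumptions of this sketch, and degenerate inputs (e.g. x at the center) are not handled.

```python
import numpy as np

def closest_point_on_ellipsoid(A, x, tol=1e-12, max_iter=200):
    """Closest point to x on the surface {y : y^T A y = 1}, A SPD.

    KKT stationarity for min ||x - y||^2 s.t. y^T A y = 1 gives
    y = (I + lam*A)^{-1} x.  In the eigenbasis A = Q diag(d) Q^T with
    z = Q^T x, the constraint becomes the secular equation
        g(lam) = sum_i d_i z_i^2 / (1 + lam d_i)^2 = 1,
    which is strictly decreasing on (-1/d_max, inf), the interval that
    contains the multiplier of the nearest point (assumed generic x != 0).
    """
    d, Q = np.linalg.eigh(A)                 # A = Q diag(d) Q^T, d > 0
    z = Q.T @ x
    g = lambda lam: np.sum(d * z**2 / (1.0 + lam * d) ** 2)

    lo = -1.0 / d.max() + 1e-9               # just right of the pole: g large
    hi = 1.0
    while g(hi) > 1.0:                       # expand until g(hi) < 1
        hi *= 2.0
    for _ in range(max_iter):                # bisection on g(lam) = 1
        mid = 0.5 * (lo + hi)
        if g(mid) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    y = Q @ (z / (1.0 + lam * d))            # recover the KKT point
    return y, lam
```

For example, with A = diag(4, 1) (an ellipse with semi-axes 1/2 and 1) and x = (2, 0), the closest surface point is (1/2, 0) at distance 3/2, with λ = 3/4. The paper's contribution is to find the globally optimal KKT point directly through an eigenvalue computation rather than root-finding of this kind.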
