Global Optimization Methods for Extended Fisher Discriminant Analysis

Fisher discriminant analysis (FDA) is a common technique for binary classification. A parametrized extension, which we call the extended FDA, has been introduced from the viewpoint of robust optimization. In this work, we first give a new probabilistic interpretation of the extended FDA. We then develop algorithms for solving an optimization problem that arises from the extended FDA: computing the distance between a point and the surface of an ellipsoid. We solve this problem via the KKT points, which we show are obtained by solving a generalized eigenvalue problem. We speed up the algorithm by taking advantage of the matrix structure and proving that a globally optimal solution is a KKT point with the smallest Lagrange multiplier, which can be computed efficiently as the leftmost eigenvalue. Numerical experiments illustrate the efficiency and effectiveness of the extended FDA model combined with our algorithm.
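To illustrate the subproblem the abstract refers to, the following is a minimal sketch of computing the closest point to a query point on an ellipsoid surface {y : yᵀAy = 1} with A symmetric positive definite. The KKT stationarity condition gives y = (I + λA)⁻¹x, and λ is found from the resulting scalar secular equation. Note this is the classical secular-equation approach (bisection in the interval containing the nearest-point multiplier), not the generalized-eigenvalue algorithm developed in the paper; the sign convention for λ and the bracketing interval are assumptions of this sketch, and degenerate inputs (e.g. x at the center) are not handled.

```python
import numpy as np

def closest_point_on_ellipsoid(A, x, tol=1e-12, max_iter=200):
    """Closest point to x on the surface {y : y^T A y = 1}, A SPD.

    KKT stationarity for min ||x - y||^2 s.t. y^T A y = 1 gives
    y = (I + lam*A)^{-1} x.  In the eigenbasis A = Q diag(d) Q^T with
    z = Q^T x, the constraint becomes the secular equation
        g(lam) = sum_i d_i z_i^2 / (1 + lam d_i)^2 = 1,
    which is strictly decreasing on (-1/d_max, inf), the interval that
    contains the multiplier of the nearest point (assumed generic x != 0).
    """
    d, Q = np.linalg.eigh(A)                 # A = Q diag(d) Q^T, d > 0
    z = Q.T @ x
    g = lambda lam: np.sum(d * z**2 / (1.0 + lam * d) ** 2)

    lo = -1.0 / d.max() + 1e-9               # just right of the pole: g large
    hi = 1.0
    while g(hi) > 1.0:                       # expand until g(hi) < 1
        hi *= 2.0
    for _ in range(max_iter):                # bisection on g(lam) = 1
        mid = 0.5 * (lo + hi)
        if g(mid) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    y = Q @ (z / (1.0 + lam * d))            # recover the KKT point
    return y, lam
```

For example, with A = diag(4, 1) (an ellipse with semi-axes 1/2 and 1) and x = (2, 0), the closest surface point is (1/2, 0) at distance 3/2, with λ = 3/4. The paper's contribution is to find the globally optimal KKT point directly through an eigenvalue computation rather than root-finding of this kind.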
