Generalized iterative RELIEF for supervised distance metric learning

The RELIEF algorithm is a popular approach to feature weighting. Many extensions of RELIEF have been developed, and I-RELIEF is one of the best known. In this paper, I-RELIEF is generalized to supervised distance metric learning so that it yields a Mahalanobis distance function. The proposed approach is justified by showing that the objective function of the generalized I-RELIEF is closely related to the expected leave-one-out nearest-neighbor classification rate. In addition, the relationships among the generalized I-RELIEF, neighbourhood components analysis, and graph embedding are pointed out. Experimental results on various data sets demonstrate the superiority of the proposed approach.
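To make the target of the learning problem concrete, the following is a minimal sketch of a Mahalanobis distance, d_M(x, y) = sqrt((x - y)^T M (x - y)), for a symmetric positive semidefinite matrix M. It is not the paper's algorithm: the function names `mahalanobis` and `diag` and the example matrices are illustrative assumptions. The sketch only shows that a diagonal M reduces to per-feature weighting (the setting of RELIEF-style methods), while a full M also accounts for interactions between features.

```python
import math

def mahalanobis(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M (list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    # Quadratic form d^T M d, accumulated row by row.
    q = sum(d[i] * sum(M[i][j] * d[j] for j in range(n)) for i in range(n))
    # Clamp tiny negative values caused by rounding before taking the root.
    return math.sqrt(max(q, 0.0))

def diag(weights):
    """Build a diagonal matrix from per-feature weights (hypothetical helper)."""
    n = len(weights)
    return [[weights[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

x, y = [1.0, 2.0], [4.0, 6.0]

# Identity matrix: plain Euclidean distance.
d_euclid = mahalanobis(x, y, diag([1.0, 1.0]))

# Diagonal matrix: feature weighting; here the second feature is ignored.
d_weighted = mahalanobis(x, y, diag([1.0, 0.0]))

# Full symmetric PSD matrix: off-diagonal terms couple the two features.
d_full = mahalanobis(x, y, [[2.0, 1.0], [1.0, 2.0]])

print(d_euclid, d_weighted, d_full)
```

In this sketch, moving from the diagonal case to the full matrix corresponds to the step from feature weighting to general distance metric learning that the abstract describes.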
