Chernoff distance and Relief feature selection

In classification, a large number of features often makes the design of a classifier difficult and degrades its performance. In such situations, feature selection or dimensionality reduction methods play an important role in building classifiers by significantly reducing the number of features. Many dimensionality reduction techniques for classification exist in the literature; the most popular is Fisher's linear discriminant analysis (LDA). For two-class problems, LDA simply tries to separate the class means as much as possible. In the multi-class case, however, a linear reduction is not guaranteed to capture all the information relevant to the classification task. To address this problem, a multi-class problem is cast as a binary problem, and the objective becomes finding a subspace in which the two classes are well separated. This formulation not only simplifies the problem but also works well in practice; however, it lacks theoretical justification. In this paper we show the connection between this formulation and RELIEF, thereby providing a sound basis for the observed benefits of the formulation. Experimental results that corroborate our analysis are provided.
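To make the two techniques discussed above concrete, the following is a minimal sketch (not the paper's actual method) of (i) the two-class Fisher LDA direction, which separates the class means relative to within-class scatter, and (ii) the basic two-class RELIEF weight update of Kira and Rendell, where a feature's weight grows with its distance to the nearest miss and shrinks with its distance to the nearest hit. All function names and the toy data are illustrative assumptions.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Two-class Fisher LDA: w = Sw^{-1}(m1 - m2), the direction that
    maximizes the separation of the projected class means relative to
    the within-class scatter Sw."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(Sw, m1 - m2)  # solve instead of inverting Sw
    return w / np.linalg.norm(w)

def relief_weights(X, y):
    """Basic two-class RELIEF: for each instance, increase each feature's
    weight by its distance to the nearest miss (opposite class) and
    decrease it by its distance to the nearest hit (same class)."""
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dists = np.abs(X - X[i]).sum(axis=1)  # L1 distances to all points
        dists[i] = np.inf                     # exclude the instance itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dists, np.inf))   # nearest hit
        miss = np.argmin(np.where(~same, dists, np.inf))  # nearest miss
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n

# Toy example (assumed data): feature 0 separates the classes, feature 1 is noise
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
X2 = rng.normal(loc=[4.0, 0.0], scale=1.0, size=(100, 2))
X = np.vstack([X1, X2])
y = np.concatenate([np.zeros(100), np.ones(100)])

w_lda = fisher_direction(X1, X2)
w_rel = relief_weights(X, y)
```

On this toy data both criteria agree: the LDA direction is dominated by the informative first feature, and RELIEF assigns it a larger weight than the noise feature, which is the kind of correspondence the paper analyzes.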