A Feature Selection Wrapper for Mixtures

We propose a feature selection approach for clustering that extends Koller and Sahami’s mutual-information-based criterion to the unsupervised case. The extension relies on a mixture model and the corresponding expectation-maximization (EM) algorithm. The result is a backward search scheme that sorts the features in order of relevance. A minimum description length (MDL) criterion is then used to prune the sorted list of features, yielding a feature selection criterion. The proposed approach can be classified as a wrapper, since it wraps the mixture estimation algorithm in an outer layer that performs feature selection. Preliminary experimental results indicate that the proposed method is promising.
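The backward search described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact procedure, and it makes several assumptions: the mixture is a diagonal-covariance Gaussian mixture whose parameters have already been estimated (for example by EM, which is not shown), the cost of dropping a feature is a Koller–Sahami-style average Kullback-Leibler divergence between the component posteriors computed with and without that feature, and the MDL pruning step is omitted. The names `posteriors`, `kl`, and `backward_rank` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def posteriors(x, feats, weights, means, variances):
    """P(component | x restricted to feats) for a diagonal Gaussian mixture.

    With an empty feature set this reduces to the mixing weights.
    """
    logs = [
        math.log(w) + sum(log_gauss(x[f], m[f], v[f]) for f in feats)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) between discrete distributions."""
    return sum(
        pi * math.log(pi / max(qi, 1e-300)) for pi, qi in zip(p, q) if pi > 0.0
    )


def backward_rank(data, weights, means, variances):
    """Return feature indices in removal order (least relevant first).

    Assumption: mixture parameters are fixed; they are not re-estimated
    after each removal, which a full wrapper might do.
    """
    remaining = list(range(len(data[0])))
    removed = []
    while remaining:
        def cost(f):
            # Average change in component posteriors when feature f is dropped.
            reduced = [g for g in remaining if g != f]
            return sum(
                kl(posteriors(x, remaining, weights, means, variances),
                   posteriors(x, reduced, weights, means, variances))
                for x in data
            ) / len(data)

        least_relevant = min(remaining, key=cost)
        remaining.remove(least_relevant)
        removed.append(least_relevant)
    return removed


# Toy example (made-up numbers): feature 0 separates the two components
# strongly, feature 1 weakly, and feature 2 not at all.
toy_weights = [0.5, 0.5]
toy_means = [[-2.0, -0.5, 0.0], [2.0, 0.5, 0.0]]
toy_variances = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
toy_data = [
    [-2.0, -0.5, 0.3],
    [2.0, 0.5, -0.1],
    [-1.5, -0.2, 1.0],
    [1.8, 0.7, -0.8],
]
removal_order = backward_rank(toy_data, toy_weights, toy_means, toy_variances)
```

Reversing `removal_order` gives the features sorted from most to least relevant. A pruning rule such as MDL would then decide how many of the top-ranked features to keep.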

[1]  Gerard V. Trunk,et al.  A Problem of Dimensionality: A Simple Example , 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Daphne Koller,et al.  Toward Optimal Feature Selection , 1996, ICML.

[3]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[4]  Carla E. Brodley,et al.  Feature Subset Selection and Order Identification for Unsupervised Learning , 2022 .

[5]  Anil K. Jain,et al.  Feature Selection: Evaluation, Application, and Small Sample Performance , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Pat Langley,et al.  Editorial: On Machine Learning , 1986, Machine Learning.

[7]  Anil K. Jain,et al.  Algorithms for Clustering Data , 1988 .

[8]  Geoffrey J. McLachlan,et al.  Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.

[9]  Ron Kohavi,et al.  Wrappers for Feature Subset Selection , 1997, Artif. Intell..

[10]  Filippo Menczer,et al.  Feature selection in unsupervised learning via evolutionary search , 2000, KDD '00.

[11]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[12]  Ashwin Ram,et al.  Efficient Feature Selection in Conceptual Clustering , 1997, ICML.

[13]  Huan Liu,et al.  Feature Selection for Clustering , 2000, Encyclopedia of Database Systems.

[14]  Geoffrey J. McLachlan,et al.  Mixture models : inference and applications to clustering , 1989 .

[15]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[16]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .

[17]  Carla E. Brodley,et al.  Feature Subset Selection and Order Identification for Unsupervised Learning , 2000, ICML.

[18]  Shivakumar Vaithyanathan,et al.  Generalized Model Selection for Unsupervised Learning in High Dimensions , 1999, NIPS.

[19]  Anil K. Jain,et al.  Unsupervised Learning of Finite Mixture Models , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[20]  J. Rissanen.  Stochastic Complexity in Statistical Inquiry Theory , 1989 .

[21]  G. McLachlan,et al.  The EM algorithm and extensions , 1996 .