A Unified Probabilistic Model for Global and Local Unsupervised Feature Selection

Existing algorithms for joint clustering and feature selection can be categorized as either global or local approaches. Global methods select a single, cluster-independent subset of features, whereas local methods select a separate subset of features for each cluster. In this paper, we present a unified probabilistic model that can perform both global and local feature selection for clustering. Our approach combines a hierarchical beta-Bernoulli prior with a Dirichlet process mixture model. We obtain global or local feature selection by adjusting the variance of the beta prior. We provide a variational inference algorithm for our model. Besides learning the clusters and features simultaneously, this Bayesian formulation allows us to learn both the number of clusters and the number of features to retain. Experiments on synthetic and real data show that our unified model recovers both global and local features, and that it clusters data as well as competing methods of each type.
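The abstract does not spell out the generative model, so the following is only an illustrative sketch of how the variance of a beta prior could move a beta-Bernoulli construction between global and local selection. It assumes (our assumption, not a statement of the paper's model) that each feature j has a relevance probability nu_j drawn from a beta distribution, and that each cluster k draws a binary indicator y_kj from a Bernoulli with that probability. The names `sample_feature_indicators`, `beta_variance`, and `disagreement`, and the mean/concentration parameterization, are hypothetical.

```python
import numpy as np


def beta_variance(mean, concentration):
    """Variance of Beta(a, b) with a = mean * c and b = (1 - mean) * c."""
    return mean * (1.0 - mean) / (concentration + 1.0)


def sample_feature_indicators(n_clusters, n_features, mean=0.5,
                              concentration=1.0, seed=0):
    """Draw per-cluster feature indicators from a beta-Bernoulli prior.

    Assumed (hypothetical) construction:
        nu_j  ~ Beta(mean * c, (1 - mean) * c)   one draw per feature
        y_kj  ~ Bernoulli(nu_j)                  one draw per cluster and feature
    """
    rng = np.random.default_rng(seed)
    a = mean * concentration
    b = (1.0 - mean) * concentration
    nu = rng.beta(a, b, size=n_features)
    # Broadcasting compares each row of uniforms against the same nu vector.
    y = rng.random((n_clusters, n_features)) < nu
    return nu, y


def disagreement(y):
    """Fraction of features selected by some clusters but not by all."""
    return float(np.mean(y.any(axis=0) & ~y.all(axis=0)))


if __name__ == "__main__":
    for c in (0.05, 1000.0):
        _, y = sample_feature_indicators(5, 200, concentration=c)
        print(f"variance={beta_variance(0.5, c):.4f} "
              f"disagreement={disagreement(y):.3f}")
```

Under these assumptions, a high-variance beta pushes each nu_j toward 0 or 1, so all clusters tend to agree on a feature (global behaviour), while a low-variance beta keeps nu_j near its mean, so the indicators vary from cluster to cluster (local behaviour).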
