Simultaneous Bayesian Clustering and Feature Selection Through Student's $t$ Mixtures Model

In this paper, we propose a generative model for feature selection in the unsupervised learning setting. The model assumes that data are independently and identically sampled from a finite mixture of Student's $t$ distributions, which reduces sensitivity to outliers. Latent random variables representing feature saliency are included in the model to indicate the relevance of each feature. As a result, the model simultaneously performs clustering, feature selection, and outlier detection. Inference is carried out by a tree-structured variational Bayes algorithm, and a fully Bayesian treatment enables automatic model selection. Controlled experimental studies showed that the model accurately fits data sets contaminated with outliers. Furthermore, experimental results showed that the algorithm compares favorably against existing unsupervised probabilistic model-based Bayesian feature selection algorithms on both artificial and real data sets. Finally, applying the algorithm to real leukemia gene expression data showed that it successfully identifies the discriminating genes.
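To make the model concrete, the following is a minimal sketch of the data density in a feature-saliency Student's $t$ mixture: each feature is modeled as a blend of a cluster-specific $t$ density (weight $\rho_d$, the saliency) and a shared, cluster-independent $t$ density for irrelevant features. All parameter names here are illustrative, not the paper's notation, and this evaluates the density only; it does not implement the tree-structured variational Bayes inference.

```python
import numpy as np
from scipy.stats import t as student_t

def saliency_t_mixture_pdf(x, weights, locs, scales, dofs, rho,
                           c_loc, c_scale, c_dof):
    """Density of a feature-saliency Student's t mixture at one point.

    x       : (D,) data point
    weights : (K,) mixing proportions
    locs, scales, dofs : (K, D) per-component, per-feature t parameters
    rho     : (D,) feature saliency in [0, 1]
    c_loc, c_scale, c_dof : (D,) parameters of the common t density
                            shared by all components (irrelevant features)
    """
    # Per-feature density under each component's t distribution: shape (K, D)
    comp = student_t.pdf(x[None, :], dofs, loc=locs, scale=scales)
    # Per-feature density under the shared "irrelevant-feature" t: shape (D,)
    common = student_t.pdf(x, c_dof, loc=c_loc, scale=c_scale)
    # Blend by saliency, multiply over (conditionally independent) features,
    # then mix over components
    per_feature = rho[None, :] * comp + (1.0 - rho)[None, :] * common[None, :]
    return float(np.dot(weights, per_feature.prod(axis=1)))
```

When $\rho_d = 1$ for every feature the common density drops out and the expression reduces to an ordinary Student's $t$ mixture; when $\rho_d = 0$ feature $d$ contributes the same factor to every component and so carries no clustering information, which is what allows saliency and cluster assignments to be inferred jointly.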
