A DCA Based Algorithm for Feature Selection in Model-Based Clustering

Gaussian mixture model (GMM) clustering is a model-based clustering approach that has been used in many applications thanks to its flexibility and effectiveness. On high-dimensional data, however, GMM-based clustering loses its advantages because of over-parameterization and noise features. To deal with this issue, we incorporate feature selection into GMM clustering. For the first time, a nonconvex sparsity-inducing regularization is considered for feature selection in GMM clustering. The resulting optimization problem is nonconvex, and we develop a DCA (Difference of Convex functions Algorithm) to solve it. Numerical experiments on several benchmark and synthetic datasets illustrate the efficiency of our algorithm and its superiority over an EM method for solving the GMM clustering problem with \(l_1\) regularization.
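To illustrate the general DCA scheme mentioned above, the following is a minimal sketch on a toy problem, not the GMM algorithm of the paper. It assumes a capped-\(l_1\) penalty, \(\min(|x_i|, a)\), as the nonconvex sparsity-inducing regularizer, and a simple least-squares fit term in place of the GMM likelihood; the names `dca_capped_l1`, `lam`, and `a` are hypothetical. The objective is written as \(g - h\) with \(g(x) = \tfrac{1}{2}\|x - y\|^2 + \lambda\|x\|_1\) and \(h(x) = \lambda \sum_i \max(|x_i| - a, 0)\), both convex. Each DCA iteration linearizes \(h\) at the current point and solves the remaining convex subproblem, which here reduces to soft-thresholding.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |.| applied to a scalar z."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def capped_l1_objective(x, y, lam, a):
    """0.5 * ||x - y||^2 + lam * sum_i min(|x_i|, a)."""
    fit = 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    penalty = lam * sum(min(abs(xi), a) for xi in x)
    return fit + penalty


def dca_capped_l1(y, lam=1.0, a=2.0, max_iter=100, tol=1e-10):
    """Toy DCA for the capped-l1 denoising problem (illustrative assumption).

    DC decomposition: objective = g - h, where
      g(x) = 0.5 * ||x - y||^2 + lam * ||x||_1
      h(x) = lam * sum_i max(|x_i| - a, 0)
    """
    x = list(y)  # start from the observed vector
    history = [capped_l1_objective(x, y, lam, a)]
    for _ in range(max_iter):
        # Step 1: pick a subgradient v of h at the current iterate.
        v = [lam * (1.0 if xi > 0 else -1.0) if abs(xi) > a else 0.0
             for xi in x]
        # Step 2: minimize g(x) - <v, x>, which has the closed form
        # x_i = soft_threshold(y_i + v_i, lam).
        x_new = [soft_threshold(yi + vi, lam) for yi, vi in zip(y, v)]
        history.append(capped_l1_objective(x_new, y, lam, a))
        converged = max(abs(p - q) for p, q in zip(x_new, x)) < tol
        x = x_new
        if converged:
            break
    return x, history


if __name__ == "__main__":
    y = [5.0, 0.3, -4.0, 0.8, -0.2]
    x_hat, history = dca_capped_l1(y, lam=1.0, a=2.0)
    print(x_hat)    # small entries are set to zero, large ones are kept
    print(history)  # objective values, non-increasing across iterations
```

In this toy setting the large entries of `y` are left unshrunk while the small ones are zeroed, which is the qualitative difference from plain \(l_1\) soft-thresholding that motivates nonconvex penalties. The objective sequence recorded in `history` is non-increasing, a standard property of DCA.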
