MAP Clustering under the Gaussian Mixture Model via Mixed Integer Nonlinear Optimization.

We present a global optimization approach for solving the maximum a posteriori (MAP) clustering problem under the Gaussian mixture model. Our approach can accommodate side constraints, and it preserves the combinatorial structure of the MAP clustering problem by formulating it as a mixed-integer nonlinear optimization problem (MINLP). We approximate the MINLP through a mixed-integer quadratic program (MIQP) transformation that improves computational performance while guaranteeing $\epsilon$-global optimality. An important benefit of our approach is that it explicitly quantifies the degree of suboptimality, via the optimality gap, en route to finding the globally optimal MAP clustering. Numerical experiments comparing our method to other approaches show that it finds better solutions than standard clustering methods. Finally, we cluster a real breast cancer gene expression data set incorporating intrinsic subtype information; the induced constraints substantially improve computational performance and produce more coherent and biologically meaningful clusters.
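To make the combinatorial structure concrete, the following is a minimal sketch and not the MINLP/MIQP formulation itself. It assumes one-dimensional data, two components with equal mixing weights, and a known unit variance; under those assumptions the MAP objective reduces to half the within-cluster sum of squares, and a tiny instance can be solved to global optimality by enumerating every assignment. The function names (`clustering_cost`, `global_map_clustering`, `relative_gap`) and the data are hypothetical, and `relative_gap` only illustrates the idea of an optimality gap by comparing a candidate clustering against the enumerated optimum, whereas a solver would compute the gap from bounds.

```python
from itertools import product


def clustering_cost(data, labels, k):
    """Negative complete-data log-likelihood up to a constant, assuming
    equal mixing weights and unit variance: half the within-cluster
    sum of squares, with each mean set to its cluster average."""
    total = 0.0
    for c in range(k):
        members = [x for x, z in zip(data, labels) if z == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((x - mean) ** 2 for x in members)
    return 0.5 * total


def global_map_clustering(data, k):
    """Enumerate all k**n assignments and keep the cheapest one.
    Only feasible for very small n; shown purely for illustration."""
    best_labels, best_cost = None, float("inf")
    for labels in product(range(k), repeat=len(data)):
        cost = clustering_cost(data, labels, k)
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels, best_cost


def relative_gap(candidate_cost, optimal_cost):
    """Relative distance of a candidate objective from the optimum."""
    return (candidate_cost - optimal_cost) / max(abs(candidate_cost), 1e-12)


# Hypothetical data: two well-separated groups.
data = [0.0, 0.2, 0.1, 5.0, 5.1, 4.9]
labels, cost = global_map_clustering(data, 2)

# A deliberately suboptimal candidate that misplaces the third point.
candidate = (0, 0, 1, 1, 1, 1)
gap = relative_gap(clustering_cost(data, candidate, 2), cost)
print(labels, cost, gap)
```

Enumeration grows as $K^n$, which is why a branch-and-bound formulation with bounds and an explicit gap is needed beyond toy sizes.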
