COMBINATORIAL MIXTURES OF MULTIPARAMETER DISTRIBUTIONS

We introduce combinatorial mixtures, a flexible class of models for inference on mixture distributions whose components have multidimensional parameters. The key idea is to allow each element of the component-specific parameter vectors to be shared by a subset of other components. This approach allows for mixtures that range from very flexible to very parsimonious, and unifies inference on component-specific parameters with inference on the number of components. We develop Bayesian inference and computation approaches for this class of distributions, and illustrate them in an application. This work was originally motivated by the analysis of cancer subtypes: in terms of biological measures of interest, subtypes may be characterized by differences in location, scale, correlations, or any combination of these. We illustrate our approach using data on molecular subtypes of lung cancer.

Some key words: Bayesian inference, Markov chain Monte Carlo, Clustering.

Hosted by The Berkeley Electronic Press
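To make the sharing idea concrete, the following minimal sketch enumerates the ways in which each parameter element could be shared across components. It is an illustration under stated assumptions, not the inference procedure developed in the paper: the function names (`set_partitions`, `sharing_patterns`, `n_distinct_components`), the choice of three components, and the parameter labels `"location"` and `"scale"` are all hypothetical. A sharing pattern is represented as one partition of the component labels per parameter element, and two components are assumed to count as a single component when they share every element.

```python
from itertools import product


def set_partitions(n):
    """Yield every partition of labels 0..n-1 as a restricted-growth tuple.

    Entry i is the block index of component i; components with equal
    entries share the same value of the parameter element.
    """
    def extend(prefix, n_blocks):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # A new label may join any existing block or open one new block.
        for b in range(n_blocks + 1):
            yield from extend(prefix + [b], max(n_blocks, b + 1))

    yield from extend([], 0)


def sharing_patterns(n_components, param_names):
    """Yield dicts mapping each parameter element to a sharing partition."""
    partitions = list(set_partitions(n_components))
    for combo in product(partitions, repeat=len(param_names)):
        yield dict(zip(param_names, combo))


def n_distinct_components(pattern):
    """Count components that differ in at least one parameter element.

    Assumption of this sketch: components sharing every element are
    treated as one component.
    """
    per_component = zip(*pattern.values())
    return len(set(per_component))


if __name__ == "__main__":
    # Hypothetical setting: 3 components, each with a location and a scale.
    for pattern in sharing_patterns(3, ["location", "scale"]):
        print(pattern, "->", n_distinct_components(pattern), "distinct")
```

Under these assumptions, the pattern in which every element is fully shared collapses to a single component, while the pattern with no sharing gives the most flexible mixture. This is one way to read the claim that the approach ties inference on component-specific parameters to inference on the number of components.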
