Modeling Cellular Processes with Variational Bayesian Cooperative Vector Quantizer

Gene expression in a cell is controlled by sophisticated cellular processes. The ability to infer the states of these processes would provide insight into the mechanisms of the gene expression control system. In this paper, we propose and investigate the cooperative vector quantizer (CVQ) model for the analysis of microarray data. The CVQ model is capable of decomposing observed microarray data into multiple regulatory subprocesses. To make CVQ analysis tractable, we develop and apply variational approximations. Bayesian model selection is employed so that the optimal number of processes is determined purely from the observed microarray data. We test the model and algorithms on two datasets: (1) simulated gene expression data and (2) real-world yeast cell-cycle microarray data. The results illustrate the ability of the CVQ approach to recover and characterize regulatory gene expression subprocesses, indicating its potential for advanced gene expression data analysis.
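
To make the decomposition idea concrete, the following is a minimal sketch of how data of the simulated kind might be generated. It assumes a common CVQ formulation that the abstract itself does not spell out: each hidden process is either on or off in a sample, each active process adds its own weight vector to the expression profile, and the sum is corrupted by Gaussian noise. The function name `simulate_cvq` and the parameters `p_on` and `noise_sd` are hypothetical and chosen only for illustration; this is not the authors' simulation procedure, and it covers only the generative side, not the variational inference.

```python
import random

def simulate_cvq(n_samples, n_genes, n_processes, noise_sd=0.1, p_on=0.5, seed=0):
    """Draw samples from an assumed CVQ-style generative model.

    Assumption (not stated in the abstract): process states are binary and
    contributions combine additively with Gaussian noise.
    """
    rng = random.Random(seed)
    # weights[k][g]: contribution of process k to gene g when process k is active
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)]
               for _ in range(n_processes)]
    states, data = [], []
    for _ in range(n_samples):
        # Hidden state: each process is independently on (1) or off (0)
        s = [1 if rng.random() < p_on else 0 for _ in range(n_processes)]
        # Observed expression: sum of active process contributions plus noise
        x = [sum(s[k] * weights[k][g] for k in range(n_processes))
             + rng.gauss(0.0, noise_sd)
             for g in range(n_genes)]
        states.append(s)
        data.append(x)
    return weights, states, data

# Example: 20 samples, 50 genes, 3 hidden processes
weights, states, data = simulate_cvq(20, 50, 3)
```

Under these assumptions, inference runs in the opposite direction: given only `data`, the task is to recover `states` and `weights`, and model selection chooses `n_processes`.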
