Fusing microarray experiments with multivariate regression

MOTIVATION: It is widely acknowledged that microarray data are subject to high noise levels and that results are often platform dependent. Microarray experiments should therefore be replicated several times, and in several laboratories, before the results can be relied upon. To make the best use of such extensive datasets, methods for microarray data fusion are required. Ideally, the fused data should distil the important aspects of the data while suppressing unwanted sources of variation, and should be amenable to further informal and formal methods of analysis. The fusion should also take into account the variability in the quality of experimentation.

RESULTS: We present such an approach to data fusion, based on multivariate regression. We apply our methodology to data from a previous study on cell-cycle control in Schizosaccharomyces pombe.

AVAILABILITY: The algorithm, implemented in R, is freely available from the authors on request.
