Joint analysis of multiple cDNA microarray studies via multivariate mixed models applied to genetic improvement of beef cattle.

In functional genomics laboratories, the same microarray slide is commonly used across several studies, each addressing a distinct biological question. These studies are usually analyzed separately, either because of computational limitations or because samples from different studies are never hybridized on the same slide. Analyzing them jointly is nevertheless a major current issue in microarray data analysis, because exploiting the covariance structure of gene expression across studies can improve the accuracy of estimated effects. We propose an approach for combining multiple studies using multivariate mixed models, assuming a nonzero correlation among genes across experiments while imposing a null residual covariance. We applied this method to jointly analyze three cattle genetics experiments comprising a total of 54 arrays, each with 19,200 spots and 7,638 elements. The resulting seven-variate model contains 752,476 equations and 56 covariances. To identify differentially expressed genes, we applied model-based clustering to a linear combination of the random gene × variety interaction effects. To aid biological interpretation, we applied an iterative algorithm that identifies the gene ontology classes that changed significantly in each experiment. We found 118 elements with coordinate expression that clustered into distinct biological functions such as adipogenesis and protein turnover. These results contribute to our understanding of the mechanistic processes involved in adipogenesis and nutrient partitioning.
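The step of clustering a linear combination of gene × variety effects can be illustrated with a minimal sketch. It is not the authors' implementation: the effect values are invented, the contrast coefficients (1, -1) are an assumption, the number of mixture components is fixed at two rather than chosen by a model-selection criterion, and the 0.95 posterior cutoff is arbitrary. All function and variable names are hypothetical. The sketch forms one contrast per gene, fits a two-component normal mixture by EM, and flags genes assigned to the component whose mean is farthest from zero.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(values, n_iter=200, var_floor=1e-6):
    """Fit a 1-D two-component normal mixture by EM.

    Returns (weights, means, variances, posteriors), where posteriors come
    from the final E-step.
    """
    n = len(values)
    grand_mean = sum(values) / n
    grand_var = sum((v - grand_mean) ** 2 for v in values) / n
    # Crude initialisation (an assumption): means at the data extremes.
    weights = [0.5, 0.5]
    means = [min(values), max(values)]
    variances = [grand_var, grand_var]
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        post = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            post.append([d / total for d in dens])
        # M-step: update weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(p[k] for p in post)
            weights[k] = nk / n
            means[k] = sum(p[k] * v for p, v in zip(post, values)) / nk
            spread = sum(p[k] * (v - means[k]) ** 2 for p, v in zip(post, values)) / nk
            variances[k] = max(spread, var_floor)
    return weights, means, variances, post


# Invented gene x variety effects for two varieties (illustration only).
effects = {}
for i in range(40):
    effects[f"null_{i}"] = (0.0, (i - 19.5) / 40.0)
for j in range(5):
    effects[f"de_{j}"] = (1.5 + 0.1 * j, -1.5)

# Linear combination of the two variety effects: variety A minus variety B.
coefficients = (1.0, -1.0)
genes = list(effects)
contrasts = [sum(c * e for c, e in zip(coefficients, effects[g])) for g in genes]

weights, means, variances, post = fit_two_component_mixture(contrasts)

# Treat the component with the largest absolute mean as "differentially expressed".
extreme = max(range(2), key=lambda k: abs(means[k]))
flagged = [g for g, p in zip(genes, post) if p[extreme] > 0.95]
print(flagged)
```

In practice the effects would be the predictions from the fitted mixed model, and the number of components would typically be compared across candidate values instead of being fixed in advance.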
