Integrative correlation: Properties and relation to canonical correlations

The integrative correlation coefficient was developed to facilitate the validation of expression microarray results in public datasets by identifying genes that are reproducibly measured across studies, and even across microarray platforms. In this study, we derive several mathematical and statistical properties of the integrative correlation coefficient, including a permutation-based null distribution with the unusual property that its variance does not shrink as the sample size increases. We discuss how these findings affect the use and interpretation of the coefficient, and what they imply for any method of identifying reproducible genes in a meta-analysis.
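The abstract does not restate how the coefficient is computed. As a rough illustration only, the sketch below assumes the commonly described "correlation of correlations" formulation: for a given gene, compute its Pearson correlation with every other gene within each study, then correlate those two correlation profiles across studies. The function names `pearson` and `integrative_correlation`, and the list-of-rows data layout, are hypothetical and not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def integrative_correlation(study_a, study_b, gene):
    """Assumed formulation: correlation of within-study correlation profiles.

    study_a, study_b: lists of per-gene expression rows, with genes in the
    same order in both studies; the number of samples may differ.
    gene: index of the gene whose reproducibility is being scored.
    """
    others = [g for g in range(len(study_a)) if g != gene]
    # Correlation of the target gene with every other gene, within each study.
    profile_a = [pearson(study_a[gene], study_a[g]) for g in others]
    profile_b = [pearson(study_b[gene], study_b[g]) for g in others]
    # Agreement of the two profiles across studies.
    return pearson(profile_a, profile_b)


# Toy data (made up for illustration): 4 genes, 4 samples in A, 3 in B.
study_a = [[1, 2, 3, 4], [2, 4, 5, 9], [4, 3, 2, 1], [1, 3, 2, 4]]
study_b = [[1, 2, 3], [2, 1, 3], [3, 2, 1], [1, 3, 2]]
score = integrative_correlation(study_a, study_b, gene=0)
```

Under this assumed definition, the score depends on the samples only through the within-study gene-gene correlations, so the two studies need matching genes but not matching samples or platforms.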
