Empowering Multi-Cohort Gene Expression Analysis to Increase Reproducibility

A major contributor to the scientific reproducibility crisis has been that results from homogeneous, single-center studies do not generalize to heterogeneous, real-world populations. Multi-cohort gene expression analysis has helped to increase reproducibility by aggregating data from diverse populations into a single analysis. To make multi-cohort analysis more feasible, we have assembled an analysis pipeline that implements rigorously studied meta-analysis best practices. We have also compiled the results of our own multi-cohort gene expression analysis of 103 diseases, spanning 615 studies and 36,915 samples, and made them publicly available through a novel, interactive web application. As a result, both the process and the results of multi-cohort gene expression analysis are now more approachable for non-technical users.
