integRATE: a desirability-based data integration framework for the prioritization of candidate genes across heterogeneous omics and its application to preterm birth

Background: The integration of high-quality, genome-wide analyses offers a robust approach to elucidating genetic factors involved in complex human diseases. Although several methods exist to integrate heterogeneous omics data, most biologists still select candidate genes manually by intersecting lists of candidates from analyses of different types of omics data. Because these lists are generated by imposing hard (strict) thresholds on quantitative variables, such as P-values and fold changes, this practice increases the chance of missing potentially important candidates.

Methods: To better facilitate the unbiased integration of heterogeneous omics data collected from diverse platforms and samples, we propose a desirability function framework for identifying candidate genes with strong evidence across data types as targets for follow-up functional analysis. Our approach is targeted towards disease systems with sparse, heterogeneous omics data, so we tested it on one such pathology: spontaneous preterm birth (sPTB).

Results: We developed the software integRATE, which uses desirability functions to rank genes both within and across studies, identifying well-supported candidate genes according to the cumulative weight of biological evidence rather than by imposing hard thresholds on key variables. Integrating 10 sPTB omics studies identified both genes in pathways previously suspected to be involved in sPTB and novel genes never before linked to this syndrome. integRATE is available as an R package on GitHub (https://github.com/haleyeidem/integRATE).

Conclusions: Desirability-based data integration is a solution most applicable in biological research areas where omics data are especially heterogeneous and sparse, allowing for the prioritization of candidate genes that can be used to inform more targeted downstream functional analyses.
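
The contrast between hard thresholds and desirability-based ranking can be illustrated with a minimal Python sketch. It is not the integRATE implementation or its API (integRATE is an R package). The function names (`d_low`, `d_high`, `rank_genes`), the cut points, the use of an unweighted arithmetic mean, the scoring of missing genes as 0, and the toy gene labels and values are all assumptions made purely for illustration.

```python
def d_low(x, lo, hi, scale=1.0):
    """Desirability for 'smaller is better' values such as P-values.

    Returns 1 at or below `lo`, 0 at or above `hi`, and a smooth
    value in between (cut points are illustrative assumptions).
    """
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return ((hi - x) / (hi - lo)) ** scale


def d_high(x, lo, hi, scale=1.0):
    """Desirability for 'larger is better' values such as |log2 fold change|."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return ((x - lo) / (hi - lo)) ** scale


def rank_genes(studies):
    """Rank genes by mean desirability within, then across, studies.

    `studies` maps study name -> {gene: (p_value, abs_log2_fold_change)}.
    Assumption: a gene absent from a study scores 0 for that study.
    """
    genes = sorted({g for s in studies.values() for g in s})
    scores = {}
    for gene in genes:
        per_study = []
        for measurements in studies.values():
            if gene in measurements:
                p, fc = measurements[gene]
                # Within-study score: average the two variable desirabilities.
                per_study.append((d_low(p, 0.001, 0.2) + d_high(fc, 0.0, 2.0)) / 2)
            else:
                per_study.append(0.0)
        # Across-study score: average over all studies.
        scores[gene] = sum(per_study) / len(per_study)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical genes and values, not from the paper).
toy_studies = {
    "study_1": {"GENE_A": (0.04, 1.0), "GENE_B": (0.001, 0.2), "GENE_C": (0.2, 2.0)},
    "study_2": {"GENE_A": (0.06, 1.2), "GENE_B": (0.5, 0.1)},
}

if __name__ == "__main__":
    for gene, score in rank_genes(toy_studies):
        print(f"{gene}\t{score:.3f}")
```

In this toy example, intersecting lists filtered at P < 0.05 would drop GENE_A because of its value of 0.06 in the second study, whereas the desirability score ranks it first on the strength of moderate evidence in both studies.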
