Systems biology analysis of human genomes points to key pathways conferring spina bifida risk

Significance

Genetic investigations of most structural birth defects, including spina bifida (SB), congenital heart disease, and craniofacial anomalies, have been underpowered for genome-wide association studies because of their rarity, genetic heterogeneity, incomplete penetrance, and environmental influences. Our systems biology strategy to investigate SB predisposition controls for population stratification and avoids much of the bias inherent in the candidate gene searches that are pervasive in the field. We examine both protein-coding and noncoding regions of whole genomes to analyze sequence variants, collapsed by gene or regulatory region, and apply machine learning, gene enrichment, and pathway analyses to elucidate molecular pathways and genes contributing to human SB.

Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging because its relative rarity, genetic heterogeneity, incomplete penetrance, and environmental influences hamper genome-wide association study approaches to untargeted discovery. Thus, SB genetic studies may suffer from population substructure and/or selection bias introduced by typical candidate gene searches. We report a population-based, ancestry-matched whole-genome sequence analysis of SB genetic predisposition using a systems biology strategy to interrogate 298 case-control subject genomes (149 pairs). Genes that were enriched in likely gene-disrupting (LGD), rare protein-coding variants were subjected to machine learning analysis to identify genes in which LGD variants occur with a different frequency in cases versus controls and so discriminate between these groups. Those genes with high discriminatory potential for SB significantly enriched pathways pertaining to carbon metabolism, inflammation, innate immunity, cytoskeletal regulation, and essential transcriptional regulation, consistent with their having an impact on the pathogenesis of human SB. Additionally, an interrogation of conserved noncoding sequences identified robust variant enrichment in regulatory regions of several transcription factors critical to embryonic development. This genome-wide perspective offers an effective approach to the interrogation of coding and noncoding sequence variant contributions to rare complex genetic disorders.
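To make the idea of collapsing variants by gene concrete, the following is a minimal sketch and not the study's pipeline. It assumes a hypothetical list of (subject, gene) pairs for qualifying rare LGD variants, counts carriers per gene in cases and controls, and substitutes a plain two-sided Fisher exact test for the machine learning step described above. All subject IDs, gene names, and function names are illustrative assumptions.

```python
from collections import defaultdict
from math import comb


def collapse_by_gene(variant_calls):
    """Map each gene to the set of subjects carrying >= 1 qualifying variant."""
    carriers = defaultdict(set)
    for subject_id, gene in variant_calls:
        carriers[gene].add(subject_id)  # repeated variants in one subject count once
    return carriers


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, total)


def gene_burden(variant_calls, cases, controls):
    """Return gene -> (case carriers, control carriers, p-value)."""
    results = {}
    for gene, subjects in collapse_by_gene(variant_calls).items():
        a = len(subjects & cases)
        c = len(subjects & controls)
        p = fisher_exact_two_sided(a, len(cases) - a, c, len(controls) - c)
        results[gene] = (a, c, p)
    return results


# Hypothetical toy data: four cases, four controls, two made-up genes
cases = {"case1", "case2", "case3", "case4"}
controls = {"ctrl1", "ctrl2", "ctrl3", "ctrl4"}
calls = [
    ("case1", "GENE_A"), ("case1", "GENE_A"), ("case2", "GENE_A"),
    ("case3", "GENE_A"), ("case4", "GENE_A"),
    ("case1", "GENE_B"), ("ctrl1", "GENE_B"),
]
results = gene_burden(calls, cases, controls)
for gene, (n_case, n_ctrl, p) in sorted(results.items()):
    print(gene, n_case, n_ctrl, round(p, 4))
```

A per-gene test of this kind ignores the matched-pair design and any multiple-testing correction; it is meant only to illustrate how collapsing turns individual variant calls into gene-level carrier counts that can then be compared between cases and controls.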
