Artificial intelligence-based approaches for the detection and prioritization of genomic mutations in congenital surgical diseases

Genetic mutations are critical contributors to congenital surgical diseases and can be identified through genomic analysis. Early and accurate identification of the mutations underlying these conditions is vital for clinical diagnosis and effective treatment. In recent years, artificial intelligence (AI) has been widely applied to the analysis of genomic data in various clinical settings, including congenital surgical diseases. This review summarizes current state-of-the-art AI-based approaches to genomic analysis and highlights successful applications that have deepened our understanding of the etiology of several congenital surgical diseases. We focus on AI methods designed to detect different variant types and to prioritize deleterious variants located in different genomic regions, with the aim of uncovering susceptibility mutations that contribute to congenital surgical disorders.
