Identifying Patients with Atrioventricular Septal Defect in Down Syndrome Populations by Using Self-Normalizing Neural Networks and Feature Selection

Atrioventricular septal defect (AVSD) is a clinically significant subtype of congenital heart disease (CHD) that severely affects the health of babies at birth and is associated with Down syndrome (DS). Exploring the differences in functional genes between DS samples with and without AVSD is therefore a critical way to investigate the complex association between AVSD and DS. In this study, we present a computational method to distinguish DS patients with AVSD from those without AVSD using the recently proposed self-normalizing neural network (SNN). First, each patient was encoded by the copy numbers of probes on chromosome 21. The encoded features were then ranked with the Monte Carlo feature selection (MCFS) method to obtain a ranked feature list. Based on this list, we used a two-stage incremental feature selection procedure to construct two series of feature subsets and applied SNNs to build classifiers on them, from which the optimal features were identified. In total, 2737 optimal features were obtained, and the optimal SNN classifier built on them yielded a Matthews correlation coefficient (MCC) of 0.748. For comparison, random forest was also used to build classifiers and uncover optimal features; it achieved its best MCC of 0.582 when the top 132 features were used. Finally, we analyzed some key features among the optimal features identified by the SNNs and found literature support that further reveals their essential roles.
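To make the selection procedure more concrete, the following is a minimal sketch of two-stage incremental feature selection scored by MCC. It is an illustration under stated assumptions, not the implementation used in the study: a nearest-centroid classifier stands in for the SNN, leave-one-out evaluation stands in for whatever validation scheme was applied, the feature ranking is taken as given rather than computed by MCFS, and the names `loo_mcc`, `best_subset`, `two_stage_ifs`, and `coarse_step` are hypothetical. The toy data are invented solely to exercise the code.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when the coefficient is undefined (zero denominator).
    return (tp * tn - fp * fn) / denom if denom else 0.0


def loo_mcc(X, y, feats):
    """Leave-one-out MCC of a nearest-centroid classifier.

    Assumption: this simple classifier is only a stand-in for the SNN.
    Each class must contain at least two samples.
    """
    preds = []
    for i, x in enumerate(X):
        dists = {}
        for label in (0, 1):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            center = [sum(r[f] for r in rows) / len(rows) for f in feats]
            dists[label] = sum((x[f] - m) ** 2 for f, m in zip(feats, center))
        preds.append(1 if dists[1] < dists[0] else 0)
    return mcc(y, preds)


def best_subset(X, y, ranked, sizes):
    """Evaluate the top-k ranked features for each k in sizes; keep the first best k."""
    best_score, best_k = -2.0, 0
    for k in sizes:
        score = loo_mcc(X, y, ranked[:k])
        if score > best_score:
            best_score, best_k = score, k
    return best_score, best_k


def two_stage_ifs(X, y, ranked, coarse_step):
    """Stage 1: coarse grid of subset sizes. Stage 2: step 1 around the stage-1 optimum."""
    n = len(ranked)
    _, k1 = best_subset(X, y, ranked, range(coarse_step, n + 1, coarse_step))
    lo, hi = max(1, k1 - coarse_step + 1), min(n, k1 + coarse_step - 1)
    return best_subset(X, y, ranked, range(lo, hi + 1))


# Toy data (invented): feature 0 separates the classes, the rest are noise.
X = [
    [2.0, 0.1, 0.5, 0.3],
    [2.1, 0.9, 0.4, 0.7],
    [1.9, 0.5, 0.6, 0.1],
    [0.0, 0.2, 0.5, 0.9],
    [0.1, 0.8, 0.4, 0.2],
    [-0.1, 0.4, 0.6, 0.6],
]
y = [1, 1, 1, 0, 0, 0]
ranked = [0, 1, 2, 3]  # assumed to come from a ranking method such as MCFS

score, k = two_stage_ifs(X, y, ranked, coarse_step=2)
print(f"best MCC = {score:.3f} with top {k} feature(s)")
```

The coarse first stage keeps the number of classifiers to train manageable when the ranked list is long, and the second stage refines the subset size only in the neighbourhood of the coarse optimum. The width of that neighbourhood here is an arbitrary choice made for the sketch.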
