Inference of Autism-Related Genes by Integrating Protein-Protein Interactions and miRNA-Target Interactions

Autism spectrum disorders (ASD) are a group of conditions characterized by impairments in social interaction and the presence of repetitive behavior. These complex neurological disorders are among the fastest-growing developmental disorders and cause varying degrees of lifelong disability. Considerable research is under way to unravel the pathogenic mechanism of autism, and computational methods have emerged as a promising way to help physicians study it. In this paper, we present an efficient method to predict autism-related candidate genes (autism genes for short) by integrating a protein-protein interaction (PPI) network with a miRNA-target interaction network. We combine the two networks using a new technique based on shortest-path calculation. To evaluate the method, we run several experiments on three different PPI networks extracted from the BioGRID, HINT, and HPRD databases. We employ three supervised learning algorithms: the Bayesian network, the random tree, and the random forest. Among them, the random forest performs best in terms of precision, recall, and F-measure, which suggests that it has potential for inferring autism genes. Experiments with five different shortest-path lengths in the PPI networks show the advantage of the method for studying autism genes on large-scale networks. In conclusion, the proposed method is beneficial in deciphering the pathogenic mechanism of autism.
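The abstract does not spell out how the two networks are combined, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a feature set invented here for the example: for a candidate gene, it counts the known autism genes reachable within a bounded shortest-path length in the PPI network and the miRNAs that target both the candidate and some known autism gene. The function names, the feature names, and the toy gene and miRNA labels are all hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into an adjacency map of sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_path_lengths(adj, source, max_len):
    """Breadth-first search from source, exploring at most max_len edges."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_len:
            continue  # do not expand beyond the length cutoff
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def gene_features(gene, ppi_adj, mirna_targets, seed_genes, max_len):
    """Hypothetical feature vector for one candidate gene (an assumption)."""
    other_seeds = set(seed_genes) - {gene}
    dist = shortest_path_lengths(ppi_adj, gene, max_len)
    near = [dist[s] for s in other_seeds if s in dist]
    # miRNAs that target this gene and at least one other known autism gene
    shared = [m for m, targets in mirna_targets.items()
              if gene in targets and targets & other_seeds]
    return {
        "n_near_seeds": len(near),
        "min_seed_dist": min(near) if near else max_len + 1,
        "n_shared_mirnas": len(shared),
    }


# Toy data: placeholder labels, not real genes or miRNAs.
ppi = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
mirnas = {"miR-x": {"A", "C"}, "miR-y": {"E"}, "miR-z": {"C", "D"}}
seeds = {"A", "D"}

for length in (1, 2, 3):
    print(length, gene_features("C", ppi, mirnas, seeds, length))
```

Feature vectors of this kind, computed for several path-length cutoffs, could then be passed to a supervised classifier such as a random forest. Which features the paper actually uses is not stated in the abstract.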
