Functional Link Artificial Neural Network-based disease gene prediction

Genes that contribute to complex traits pose special challenges that make the discovery of candidate disease-associated genes more difficult. In this work, we investigated topological features derived from a protein-protein interaction (PPI) network to identify the causal genes of four complex diseases: cancer, type 1 diabetes, type 2 diabetes, and ageing. We used 10-fold cross-validation to evaluate the predictive capacity of all possible combinations of these features and identified the combination with the best predictive ability. We assessed the performance of three classifiers: Multi-layer Perceptron (MLP), Functional Link Artificial Neural Network (FLANN), and Support Vector Machines (SVM). SVM achieved higher accuracy than MLP and FLANN. However, FLANN required significantly less computation time, while its accuracy remained comparable to that of SVM and MLP.
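The evaluation procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the trigonometric expansion, the logistic output trained by gradient descent, the hyperparameters, and the synthetic data are all assumptions made for illustration, and the three columns only stand in for PPI-derived topological features. The sketch shows why a FLANN is cheap to train (it has no hidden layer, only a fixed functional expansion followed by a single layer of weights) and how every feature combination can be scored with 10-fold cross-validation.

```python
from itertools import combinations

import numpy as np


def expand(X, order=2):
    """Trigonometric functional expansion (an assumed choice):
    each x becomes [x, sin(k*pi*x), cos(k*pi*x)] for k = 1..order."""
    parts = [X]
    for k in range(1, order + 1):
        parts.append(np.sin(k * np.pi * X))
        parts.append(np.cos(k * np.pi * X))
    return np.hstack(parts)


def add_bias(Phi):
    return np.hstack([Phi, np.ones((Phi.shape[0], 1))])


def train_flann(X, y, epochs=200, lr=0.1):
    """Single layer of weights on the expanded inputs, logistic output,
    trained by batch gradient descent (hypothetical hyperparameters)."""
    Phi = add_bias(expand(X))
    w = np.zeros(Phi.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Phi @ w)))
        w -= lr * Phi.T @ (p - y) / len(y)
    return w


def predict_flann(w, X):
    return (add_bias(expand(X)) @ w > 0).astype(int)


def kfold_accuracy(X, y, k=10, seed=0):
    """Mean test accuracy over k folds of a shuffled index."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = train_flann(X[train], y[train])
        accs.append(np.mean(predict_flann(w, X[test]) == y[test]))
    return float(np.mean(accs))


def best_feature_subset(X, y):
    """Score every non-empty feature combination; keep the most accurate."""
    best_acc, best_cols = -1.0, ()
    for r in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), r):
            acc = kfold_accuracy(X[:, list(cols)], y)
            if acc > best_acc:
                best_acc, best_cols = acc, cols
    return best_acc, best_cols


# Synthetic stand-in data: 200 "genes", 3 features scaled to [0, 1].
# Only column 0 carries signal; this is NOT real PPI data.
rng = np.random.default_rng(42)
X = rng.random((200, 3))
y = (X[:, 0] > 0.5).astype(int)

best_acc, best_cols = best_feature_subset(X, y)
print("best subset:", best_cols, "accuracy:", round(best_acc, 3))
```

An exhaustive search of this kind evaluates 2^n - 1 subsets for n features, so it is only practical when the feature set is small, which is where a classifier with a low training cost becomes attractive.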
