A hybrid network-based method for the detection of disease-related genes

Abstract

Detecting disease-related genes is crucial in disease diagnosis and drug design. The accepted view is that neighbors of a disease-causing gene in a molecular network tend to cause the same or similar diseases, and network-based methods have recently been developed to identify novel hereditary disease genes in available biomedical networks. Despite the steady increase in the discovery of disease-associated genes, a large fraction of disease genes remains undiscovered. In this paper we exploit the topological properties of the protein–protein interaction (PPI) network to detect disease-related genes. We compute, analyze, and compare the topological properties of disease genes and non-disease genes in PPI networks. We also design an improved random forest classifier based on these network topological features, and a cross-validation test indicates that our method outperforms previous similar methods.
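As a rough illustration of what comparing topological properties of disease and non-disease genes can look like, the sketch below computes three common node-level measures (degree, local clustering coefficient, and closeness centrality) on a tiny undirected network and averages them per group. The abstract does not say which features the paper uses, so this choice of measures is an assumption, and the gene identifiers, edges, and disease labels are hypothetical toy data rather than anything taken from a real PPI database. The classifier step is not shown.

```python
from collections import deque
from statistics import mean

# Hypothetical toy PPI network: undirected edges between made-up gene IDs.
EDGES = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G4", "G5"), ("G5", "G6"),
]
# Hypothetical labels: which genes are treated as known disease genes.
DISEASE_GENES = {"G1", "G4"}


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness(adj, node):
    """Reachable-node count divided by the sum of BFS distances from `node`."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def topological_features(adj):
    """Return {node: {feature_name: value}} for every node in the network."""
    return {
        node: {
            "degree": len(adj[node]),
            "clustering": clustering_coefficient(adj, node),
            "closeness": closeness(adj, node),
        }
        for node in adj
    }


def group_means(features, disease_genes):
    """Average each feature separately over disease and non-disease genes."""
    names = ["degree", "clustering", "closeness"]
    groups = {
        "disease": [f for g, f in features.items() if g in disease_genes],
        "non_disease": [f for g, f in features.items() if g not in disease_genes],
    }
    return {
        label: {name: mean(f[name] for f in rows) for name in names}
        for label, rows in groups.items()
    }


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    feats = topological_features(adjacency)
    for label, averages in group_means(feats, DISEASE_GENES).items():
        print(label, averages)
```

In a pipeline like the one the abstract outlines, per-gene feature vectors of this kind would form the input to a classifier such as a random forest, evaluated by cross-validation. On a real PPI network, an established graph library would be a more practical choice than these hand-written functions.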
