Refine gene functional similarity network based on interaction networks

Background: In recent years, biological interaction networks have become the basis of many essential studies and have been applied successfully in a range of applications. Some typical networks, such as protein-protein interaction networks, have already been investigated systematically. However, little work has been done so far on the construction of gene functional similarity networks. In this research, we aim to build a highly reliable gene functional similarity network to promote its further application.

Results: We propose a novel method to construct and refine the gene functional similarity network. It consists of three main steps. First, we establish an integrated gene functional similarity network based on different functional similarity calculation methods. Then, we construct a reference gene-gene association network based on protein-protein interaction networks. Finally, we use the reference gene-gene association network to refine spurious edges in the integrated gene functional similarity network. Experimental results indicate that the refined gene functional similarity network (RGFSN) exhibits a scale-free, small-world and modular architecture, with a degree distribution that is best fit by a power law. In addition, we conduct a protein complex prediction experiment for human based on RGFSN and achieve outstanding results, which suggests that the network is highly reliable and broadly applicable.

Conclusions: Our work offers insight into constructing and refining gene functional similarity networks, and the approach can be applied to build other high-quality biological networks.
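The three steps described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the similarity scores are integrated or how spurious edges are handled, so averaging the per-method scores, dropping unsupported low-scoring edges, the `keep_threshold` parameter, the helper names (`edge`, `integrate`, `refine`) and the toy gene identifiers are all assumptions made for illustration only.

```python
def edge(gene_a, gene_b):
    """Undirected gene pair used as a dictionary key (hypothetical helper)."""
    return frozenset((gene_a, gene_b))


def integrate(similarity_maps):
    """Step 1 (assumed rule): average each pair's score over all methods.

    A pair missing from one method contributes 0.0 for that method.
    """
    pairs = set().union(*similarity_maps)
    return {
        pair: sum(m.get(pair, 0.0) for m in similarity_maps) / len(similarity_maps)
        for pair in pairs
    }


def refine(integrated, reference_edges, keep_threshold=0.75):
    """Step 3 (assumed rule): keep an edge if the reference network
    contains it or if its integrated score reaches keep_threshold."""
    return {
        pair: score
        for pair, score in integrated.items()
        if pair in reference_edges or score >= keep_threshold
    }


# Toy functional similarity scores from two hypothetical methods.
method_a = {edge("G1", "G2"): 0.9, edge("G1", "G3"): 0.4}
method_b = {edge("G1", "G2"): 0.7, edge("G2", "G3"): 0.6}

# Step 2 stand-in: a reference gene-gene association network, here just a
# set of edges assumed to come from protein-protein interaction data.
reference = {edge("G2", "G3")}

integrated = integrate([method_a, method_b])
refined = refine(integrated, reference)

for pair, score in sorted(refined.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), round(score, 3))
```

In this toy example the G1-G3 edge is dropped because it has a low integrated score and no support in the reference network, while G2-G3 is kept despite a low score because the reference network contains it.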
