Protein interaction network constructing based on text mining and reinforcement learning with application to prostate cancer.

Constructing interaction networks from biomedical texts is an important task. The authors combine text mining and reinforcement learning to build a protein interaction network. Co-occurrence-based extraction is computationally efficient, whereas extraction based on linguistic patterns is more precise. To exploit both, the authors propose an interaction extraction algorithm that first applies frequently used linguistic patterns to extract interactions from the texts, and then applies the basic idea of the co-occurrence approach to find further interactions in the extended, unprocessed texts. Interactions extracted from the extended texts are discounted. The authors also put forward a reinforcement learning-based algorithm to build the protein interaction network, in which nodes represent proteins and edges denote interactions. During the evolutionary process, a node selects another node, and the reward obtained determines which predicted interaction should be reinforced. The agent updates the topology of the network until an optimal network is formed. The authors applied the proposed methods to texts downloaded from PubMed to construct a prostate cancer protein interaction network. The results show that the method achieves a good matching rate. Network topology analysis also shows that the node degree distribution, node degree probability and probability distribution curves of the constructed network agree well with those of a scale-free network.
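The two-stage extraction idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the protein list, the pattern verbs, the discount value of 0.5 and the function name `extract_interactions` are all assumptions made for illustration, and the co-occurrence fallback is simplified to pairs that no pattern matched within the same sentence.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical inputs: a tiny protein dictionary and a few pattern verbs.
PROTEINS = {"AR", "AKT1", "PTEN", "MYC"}
PATTERN_VERBS = ("interacts with", "binds", "activates", "inhibits", "phosphorylates")
DISCOUNT = 0.5  # assumed weight for co-occurrence-only evidence


def extract_interactions(sentences, proteins=PROTEINS, discount=DISCOUNT):
    """Score protein pairs: 1.0 per pattern match, `discount` per bare co-occurrence."""
    verbs = "|".join(re.escape(v) for v in PATTERN_VERBS)
    # Longer names first so that the alternation prefers the longest match.
    names = "|".join(re.escape(p) for p in sorted(proteins, key=len, reverse=True))
    pattern = re.compile(rf"\b({names})\b\s+(?:{verbs})\s+\b({names})\b")

    scores = defaultdict(float)
    for sentence in sentences:
        # Stage 1: linguistic patterns of the form "<protein> <verb> <protein>".
        matched = set()
        for a, b in pattern.findall(sentence):
            if a != b:
                pair = tuple(sorted((a, b)))
                matched.add(pair)
                scores[pair] += 1.0

        # Stage 2: discounted co-occurrence for pairs no pattern captured.
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
        present = sorted(tokens & proteins)
        for pair in combinations(present, 2):
            if pair not in matched:
                scores[pair] += discount
    return dict(scores)


if __name__ == "__main__":
    example = [
        "AR interacts with AKT1.",
        "PTEN and AKT1 were both detected in the tumour samples.",
        "AKT1 phosphorylates AR in PTEN deficient cells.",
    ]
    for pair, score in sorted(extract_interactions(example).items()):
        print(pair, score)
```

In this sketch, pattern-supported pairs accumulate full weight, while pairs seen only through co-occurrence accumulate the discounted weight, so they rank lower unless they recur often. The resulting scores could serve as candidate edge weights for a later network construction step.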
