Multiple kernels learning-based biological entity relationship extraction method

Background: Automatically extracting protein entity interaction information from the biomedical literature can help to build protein relation networks and design new drugs. MEDLINE, the most authoritative textual database in the field of biomedicine, contains more than 20 million literature abstracts, and that number is growing exponentially over time. Literature expanding at this rate is difficult to absorb or analyze manually. Efficient, automated search engines based on text mining techniques are therefore necessary to explore the biomedical literature.

Results: On the AIMed corpus, the P, R, and F values of the tag graph kernel method are 50.82, 69.76, and 58.61%, respectively. On the other four evaluation corpora, the P, R, and F values of the tag graph kernel method are 2–5% higher than those of the all-paths graph kernel. The P, R, and F values of the fused feature kernel and tag graph kernel method are 53.43, 71.62, and 61.30%, respectively. P, R, and F values of 55.47, 70.29, and 60.37%, respectively, are also reported for the fused feature kernel and tag graph kernel method. These results indicate that both two-kernel fusion methods perform better than a single kernel.

Conclusion: Compared with the all-paths graph kernel method, the tag graph kernel method is superior in overall performance. Experiments show that the multi-kernel method performs better than the three individual single-kernel methods and the pairwise fused-kernel methods used here on all five corpora.
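The abstract does not say how the kernels are fused. As a minimal sketch, assuming a weighted sum of normalized Gram matrices (one common way to combine kernels, not necessarily the authors' method), the fusion step could look like the following. The function names, the equal weights, and the random toy data are all hypothetical stand-ins; the real feature and tag graph kernels are not reproduced here.

```python
import numpy as np

def normalize_kernel(K):
    """Cosine-normalize a Gram matrix so that every diagonal entry becomes 1."""
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

def fuse_kernels(kernels, weights):
    """Convex weighted sum of normalized Gram matrices.

    Assumption: fusion is a weighted sum; the weights are illustrative only.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * normalize_kernel(K) for w, K in zip(weights, kernels))

# Toy stand-ins for a feature kernel and a graph kernel over 6 candidate pairs.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(6, 4))
X_b = rng.normal(size=(6, 3))
K_feature = X_a @ X_a.T   # linear kernel on hypothetical feature vectors
K_graph = X_b @ X_b.T     # placeholder for a graph kernel Gram matrix

K_fused = fuse_kernels([K_feature, K_graph], [0.5, 0.5])
```

A non-negative weighted sum of positive semidefinite matrices is itself positive semidefinite, so a fused matrix of this form can be passed to any learner that accepts a precomputed kernel.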