PIMiner: a web tool for extraction of protein interactions from biomedical literature

Information on protein interactions (PIs) is valuable for biomedical research, but it often lies buried in the scientific literature and cannot be readily retrieved. Although computational methods for extracting PIs from the literature have progressed considerably over the years, free, public, user-friendly tools for discovering PIs remain scarce. We developed PIMiner, an online tool for extracting PI relationships from PubMed abstracts. PIMiner reports protein pairs together with the words that describe their interactions, so that new interactions can be easily detected within the text. It also reports interaction likelihood levels and offers the option to extract only specific types of interactions. The PIMiner server can be accessed through a web browser or remotely through a client's command line. PIMiner can process 50,000 PubMed abstracts in approximately 7 min and thus appears suitable for large-scale processing of biological/biomedical literature.
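
As a rough illustration of what the reported speed implies, the following sketch converts the figures stated in the abstract (50,000 abstracts in approximately 7 min) into an average throughput. The function names are hypothetical, and the extrapolation to a larger corpus assumes that runtime scales linearly with the number of abstracts, which the abstract does not state.

```python
# Figures taken from the abstract; the timing is approximate.
ABSTRACTS = 50_000
MINUTES = 7


def throughput_per_second(n_abstracts: int, minutes: float) -> float:
    """Average number of abstracts processed per second."""
    return n_abstracts / (minutes * 60)


def estimated_minutes(n_abstracts: int, rate_per_second: float) -> float:
    """Estimated minutes for a corpus, ASSUMING linear scaling (not claimed by the source)."""
    return n_abstracts / rate_per_second / 60


rate = throughput_per_second(ABSTRACTS, MINUTES)
print(f"{rate:.0f} abstracts per second")
print(f"{estimated_minutes(1_000_000, rate):.0f} minutes for one million abstracts")
```

Under these assumptions the average rate is roughly 119 abstracts per second, so one million abstracts would take on the order of 140 min; actual performance would depend on server load and abstract length.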
