GenCLiP 3: mining human genes' functions and regulatory networks from PubMed based on co-occurrences and natural language processing

SUMMARY: We present GenCLiP 3, a web server that updates GenCLiP 2.0 to enhance the analysis of human gene functions and regulatory networks. It adds the following improvements: (i) accurate recognition of molecular interactions, with polarity and directionality, from the entire PubMed database; (ii) support for Boolean search to customize multiple-term searches and quickly retrieve function-related genes; (iii) a stronger association between genes and keywords through a new scoring method; and (iv) daily updates that follow literature releases on the PubMed FTP site.

AVAILABILITY: The server is freely available for academic use at http://ci.smu.edu.cn/genclip3/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
