CoPub update: CoPub 5.0 a text mining system to answer biological questions

In this article, we present CoPub 5.0, a publicly available text mining system that uses Medline abstracts to calculate robust statistics for keyword co-occurrences. CoPub was initially developed for the analysis of microarray data, but we have broadened its scope by implementing new technology and new thesauri. In CoPub 5.0, we integrated existing CoPub technology with new features and provided a new advanced interface, which can be used to answer a variety of biological questions. CoPub 5.0 allows users to search for keywords of interest and their relations to curated thesauri, and it provides highlighting and sorting mechanisms, based on its co-occurrence statistics, to retrieve the most important abstracts in which the terms co-occur. It also provides a way to search for indirect relations between genes, drugs, pathways and diseases, following an ABC principle, in which A and C have no direct connection but are connected via shared B intermediates. With CoPub 5.0, it is possible to create, annotate and analyze networks using the layout and highlight options of Cytoscape Web, allowing for literature-based systems biology. Finally, the operations of the CoPub 5.0 Web service make it possible to integrate CoPub technology into bioinformatics workflows. CoPub 5.0 can be accessed through the CoPub portal at http://www.copub.org.
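The ABC principle described above can be illustrated with a minimal sketch. It assumes a toy list of direct co-occurrence pairs; the term names, the `direct_pairs` data and the function names are hypothetical and are not CoPub output or part of the CoPub API. The sketch only shows the idea of linking A to C through shared B intermediates when A and C never co-occur directly; it does not apply the co-occurrence statistics that CoPub uses to rank such links.

```python
from collections import defaultdict

# Hypothetical direct co-occurrence pairs (toy data, not CoPub output).
direct_pairs = [
    ("drugA", "gene1"),
    ("drugA", "gene2"),
    ("gene1", "diseaseX"),
    ("gene2", "diseaseX"),
    ("gene2", "diseaseY"),
    ("drugA", "diseaseY"),
]


def build_neighbors(pairs):
    """Map each term to the set of terms it directly co-occurs with."""
    neighbors = defaultdict(set)
    for left, right in pairs:
        neighbors[left].add(right)
        neighbors[right].add(left)
    return neighbors


def indirect_relations(pairs, a):
    """Return {C: set of shared B terms} for every C that has no direct link to A."""
    neighbors = build_neighbors(pairs)
    direct = neighbors.get(a, set())
    shared = defaultdict(set)
    for b in direct:
        for c in neighbors[b]:
            # Keep only terms that are neither A itself nor directly linked to A.
            if c != a and c not in direct:
                shared[c].add(b)
    return dict(shared)


result = indirect_relations(direct_pairs, "drugA")
print(result)
```

In this toy example, `drugA` and `diseaseX` never co-occur directly but share the intermediates `gene1` and `gene2`, so `diseaseX` is reported as an indirect relation. `diseaseY` is excluded because it already has a direct link to `drugA`.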
