From Text to Pathway: Corpus Annotation for Knowledge Acquisition from Biomedical Literature

We present a new direction of research that applies text mining technologies to construct and maintain databases organized as pathways, by associating parts of papers with the relevant portions of a pathway and vice versa. To realize this scenario, we present two annotated corpora. The first, Event Annotation, identifies the spans of text in which biological events are reported; the second, Pathway Annotation, associates portions of papers with specific parts of a pathway.
