Discovering Paradigm Shift Patterns in Biomedical Abstracts: Application to Neurodegenerative Diseases

Millions of facts are stored within the biological literature. Most of these facts represent small advances in knowledge within an established theory, but a small fraction offer new insight into a biological phenomenon. We propose a method based on computational linguistic tools for identifying this small fraction of facts (extraction) and for exposing knowledge that may be important in future developments (prediction). The method relies on finding linguistic cues which indicate that the authors of biological articles have identified a problem with, or a break from, conventional knowledge.
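The cue-finding step can be illustrated with a minimal sketch. It assumes a simple regular-expression matcher over sentences; the abstract does not list the actual cues or name the linguistic tools used, so the phrases in `CUE_PATTERNS` and the helper names `split_sentences` and `find_cue_sentences` are hypothetical placeholders, not the authors' method.

```python
import re

# Hypothetical cue phrases chosen for illustration only; the source does not
# specify which linguistic cues the method actually uses.
CUE_PATTERNS = [
    r"\bcontrary to\b",
    r"\bin contrast (?:to|with) (?:previous|earlier|current)\b",
    r"\bchalleng(?:e|es|ed|ing) the\b",
    r"\bunexpected(?:ly)?\b",
    r"\bpreviously (?:thought|believed|assumed)\b",
]
CUE_RE = re.compile("|".join(CUE_PATTERNS), re.IGNORECASE)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_cue_sentences(abstract):
    """Return (sentence, matched cue) pairs for sentences containing a cue."""
    hits = []
    for sentence in split_sentences(abstract):
        match = CUE_RE.search(sentence)
        if match:
            hits.append((sentence, match.group(0).lower()))
    return hits
```

Calling `find_cue_sentences` on the text of an abstract returns only those sentences that contain one of the placeholder cues, paired with the cue that matched. A surface matcher like this would likely need parsing and a curated cue lexicon to approach what the abstract describes.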
