Natural Language Processing and Systems Biology

This chapter outlines the basic families of natural language processing applications that address questions of interest to systems biologists, and describes publicly available resources for such applications.
