The Semantics of a Definiendum Constrains both the Lexical Semantics and the Lexicosyntactic Patterns in the Definiens

Most current definitional question answering systems apply one-size-fits-all lexicosyntactic patterns to identify definitions. By analyzing a large set of online definitions, this study shows that the semantic type of a definiendum constrains both the lexical semantics and the lexicosyntactic patterns of its definiens. For example, "heart" has the semantic type [Body Part, Organ, or Organ Component], and its definition (e.g., "heart locates between the lungs") incorporates semantic-type-dependent lexicosyntactic patterns (e.g., "TERM locates ...") and terms (e.g., "lung", which has the same semantic type [Body Part, Organ, or Organ Component]). In contrast, "AIDS" has a different semantic type, [Disease or Syndrome]; its definition (e.g., "An infectious disease caused by human immunodeficiency virus") uses different lexicosyntactic patterns (e.g., "...causes by...") and terms (e.g., "infectious disease", which has the semantic type [Disease or Syndrome]). The semantic types are those defined in the Unified Medical Language System (UMLS), a widely used biomedical knowledge resource.
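The idea of semantic-type-dependent patterns can be illustrated with a minimal sketch. The pattern table, the regular expressions, and the function name below are hypothetical and hand-written for illustration; they are not the patterns learned in the study, and the semantic type is assumed to be supplied by the caller rather than looked up in the UMLS.

```python
import re

# Hypothetical pattern table keyed by UMLS semantic type.
# "{term}" is a placeholder for the definiendum; these regexes are
# illustrative assumptions, not patterns reported by the study.
PATTERNS_BY_SEMANTIC_TYPE = {
    "Body Part, Organ, or Organ Component": [
        r"\b{term}\b\s+(?:is\s+located|locates|lies)\b",
    ],
    "Disease or Syndrome": [
        r"\b{term}\b.*\bcaused\s+by\b",
    ],
}


def matches_definition_pattern(term, semantic_type, sentence):
    """Return True if the sentence matches a pattern registered for the
    given semantic type, with the term substituted into the pattern."""
    for template in PATTERNS_BY_SEMANTIC_TYPE.get(semantic_type, []):
        pattern = template.format(term=re.escape(term))
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            return True
    return False
```

Under these assumptions, a candidate sentence for "heart" is checked only against the location-style pattern, while a candidate for "AIDS" is checked only against the causation-style pattern, which is the contrast with a one-size-fits-all pattern set.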
