Ontology-Based Named Entity Recognizer for Behavioral Health

Named-entity recognizers (NERs) are an important part of information extraction systems in annotation tasks. Although substantial progress has been made in recognizing domain-independent named entities (e.g. location, organization and person), there is a need to recognize named entities for domain-specific applications in order to extract relevant concepts. Because of the growing need for smart health applications that address some of the latest worldwide epidemics of behavioral issues (e.g. overeating, lack of exercise, alcohol and drug consumption), we focus on the domain of behavior change, especially lifestyle change. To the best of our knowledge, there is no named-entity recognizer designed for the lifestyle change domain that enables applications to recognize relevant concepts. We describe the design of an ontology for behavioral health, on the basis of which we developed a NER augmented with lexical resources. Our NER automatically tags words and phrases in sentences with relevant lifestyle-specific tags (e.g. [un/]healthy food, potentially-risky/healthy activity, drug, tobacco and alcoholic beverage). We discuss the evaluation that we conducted with manually collected test data. In addition, we discuss how our ontology enables systems to acquire further information about the recognized named entities by using semantic reasoners.
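
As a rough illustration of the tagging step described above, the following minimal sketch assigns lifestyle tags to words and phrases by longest-match lookup in a small lexicon. It is not the system described here: the `LEXICON` entries, the `tag` function and the `"O"` label for untagged tokens are all hypothetical, and a real implementation would draw its terms and classes from the ontology and lexical resources rather than from a hand-written dictionary.

```python
# Hypothetical toy lexicon standing in for ontology classes and their terms.
# Keys are token tuples so that multi-word phrases can be matched.
LEXICON = {
    ("french", "fries"): "unhealthy food",
    ("salad",): "healthy food",
    ("jogging",): "healthy activity",
    ("beer",): "alcoholic beverage",
    ("cigarettes",): "tobacco",
}

# Longest phrase length in the lexicon, used to bound the match window.
MAX_LEN = max(len(key) for key in LEXICON)


def tag(sentence):
    """Return (phrase, tag) pairs, preferring the longest lexicon match."""
    # Very simple tokenization: lowercase, drop commas and periods, split on spaces.
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    tagged = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in LEXICON:
                tagged.append((" ".join(key), LEXICON[key]))
                i += n
                break
        else:
            # No lexicon entry starts here: mark the token as outside any entity.
            tagged.append((tokens[i], "O"))
            i += 1
    return tagged


if __name__ == "__main__":
    for phrase, label in tag("I had french fries and beer after jogging."):
        print(f"{phrase}\t{label}")
```

Matching the longest phrase first keeps multi-word terms such as "french fries" together instead of tagging their words separately.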
