Using Linked Data for polarity classification of patients' experiences

Polarity classification is the main subtask of sentiment analysis and opinion mining, well-known problems in natural language processing that have attracted increasing attention in recent years. Existing approaches rely mainly on the subjective part of a text, in which sentiment is expressed explicitly through specific words, called sentiment words. These approaches, however, still perform poorly on patients' experiences, which are often expressed without any explicit sentiment; instead, a desirable or undesirable effect of the experience implicitly indicates a positive or negative sentiment, respectively. This paper presents a method for polarity classification of patients' experiences of drugs using domain knowledge. We first build a knowledge base of polar facts about drugs, called FactNet, using patterns extracted from Linked Data sources and relation extraction techniques. Then, we extract generalized semantic patterns of polar facts and organize them into a hierarchy in order to overcome the missing knowledge issue. Finally, we apply the extracted knowledge, i.e., polar fact instances and generalized patterns, to the polarity classification task. Unlike previous approaches to personal experience classification, the proposed method exploits polar facts in domain knowledge to improve polarity classification performance, especially for indirect implicit experiences, i.e., experiences that express the effect of one entity on others without any sentiment words. Using our approach, we extracted 9703 triplets of polar facts at a precision of 92.26 percent. In addition, experiments on drug reviews demonstrate that our approach achieves 79.78 percent precision in the polarity classification task and outperforms state-of-the-art sentiment analysis and opinion mining methods.
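
The lookup idea summarized above, trying a known polar fact instance first and then falling back to a generalized pattern higher in a hierarchy, can be illustrated with a minimal Python sketch. It is not the FactNet implementation: the triplets, the relation names, the type hierarchy, and the function names `ancestors` and `classify` are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional

# Hypothetical polar fact instances: (drug, relation, effect) -> polarity.
POLAR_FACTS = {
    ("aspirin", "treats", "headache"): "positive",
    ("aspirin", "causes", "stomach bleeding"): "negative",
}

# Hypothetical semantic hierarchy (child -> parent), used to generalize terms.
PARENT = {
    "aspirin": "drug",
    "ibuprofen": "drug",
    "headache": "symptom",
    "nausea": "symptom",
    "stomach bleeding": "symptom",
    "symptom": "disorder",
}

# Hypothetical generalized patterns over semantic types.
PATTERNS = {
    ("drug", "treats", "disorder"): "positive",
    ("drug", "causes", "disorder"): "negative",
}


def ancestors(term: str) -> List[str]:
    """Return the term followed by its ancestors, most specific first."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def classify(subject: str, relation: str, obj: str) -> Optional[str]:
    """Return a polarity for a triplet, or None if no knowledge applies."""
    # Step 1: exact match against known polar fact instances.
    polarity = POLAR_FACTS.get((subject, relation, obj))
    if polarity is not None:
        return polarity
    # Step 2: back off to generalized patterns, most specific types first.
    for s in ancestors(subject):
        for o in ancestors(obj):
            polarity = PATTERNS.get((s, relation, o))
            if polarity is not None:
                return polarity
    return None
```

In this sketch, `classify("ibuprofen", "causes", "nausea")` has no stored fact instance, so it is resolved through the generalized pattern for a drug causing a disorder; this is the kind of missing-knowledge case the pattern hierarchy is meant to cover.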
