Stance classification of multi-perspective consumer health information

Search engines are effective at answering direct factual questions such as 'What are the symptoms of disease X?', but they are less effective at addressing complex consumer health queries that have no single definitive answer, such as 'Is treatment X effective for disease Y?'. For such queries, users are presented with a vast number of search results that often offer contradictory perspectives. We call these Multi-Perspective Consumer Health Information (MPCHI) queries: queries for which there is no single 'Yes or No' answer. While ascertaining the credibility of the claims requires domain expertise, efficiently categorizing the search results according to their stance toward the query (support or oppose) can help the searcher in decision making. This paper therefore focuses on the problem of sentence-level stance classification for MPCHI data and presents a new data set for MPCHI queries. The linguistic characteristics of MPCHI text are quite different from those of typical debate or argumentative text, with extensive use of formal scientific language and an absence of opinion-bearing words. These characteristics require going beyond traditional Bag of Words (BoW) features for stance classification. We propose a rich set of non-traditional features, including medical semantic relations, stance vectors, sentiment polarity, and textual entailment, and study their impact on MPCHI stance classification using an SVM classifier and a neural network classifier. We find that, for the best feature combination, these features improve MPCHI stance classification performance over the traditional BoW model by 24% for the SVM classifier and by 44% for the neural network classifier.
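As a minimal illustration of what going beyond BoW features can look like, the sketch below appends one extra feature (a lexicon-based sentiment polarity score) to a BoW count vector for a sentence. The lexicon, the function names, and the example sentences are hypothetical and are not taken from the paper; the paper's actual feature extractors, such as those for medical semantic relations and textual entailment, are not reproduced here.

```python
from collections import Counter

# Hypothetical toy lexicon, for illustration only; not the paper's sentiment resource.
POSITIVE = {"effective", "improves", "reduces", "beneficial"}
NEGATIVE = {"ineffective", "no", "not", "fails", "harmful"}

PUNCT = ".,?!'\""


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (raw.strip(PUNCT).lower() for raw in text.split())
    return [tok for tok in tokens if tok]


def build_vocab(sentences):
    """Map each distinct token to a column index (sorted for determinism)."""
    vocab = sorted({tok for s in sentences for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(vocab)}


def bow_vector(sentence, vocab):
    """Traditional Bag of Words counts over a fixed vocabulary."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(sentence)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def polarity_feature(sentence):
    """One extra feature: positive minus negative lexicon hits."""
    toks = tokenize(sentence)
    pos = sum(tok in POSITIVE for tok in toks)
    neg = sum(tok in NEGATIVE for tok in toks)
    return [float(pos - neg)]


def featurize(sentence, vocab):
    """Concatenate the BoW counts with the additional feature."""
    return bow_vector(sentence, vocab) + polarity_feature(sentence)


sentences = [
    "Vitamin C reduces cold duration.",
    "Vitamin C is not effective for colds.",
]
vocab = build_vocab(sentences)
features = [featurize(s, vocab) for s in sentences]
```

Each resulting vector has one column per vocabulary token plus one polarity column, and could be passed to any standard classifier. A lexicon count of this kind is only a stand-in: because MPCHI text largely lacks opinion-bearing words, a naive polarity score is likely to be a weak signal on its own.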
