Vote Prediction on Comments in Social Polls

A poll consists of a question and a set of predefined answers from which voters can select. We present the new problem of vote prediction on comments: determining which of these answers a voter selected, given a comment she wrote after voting. To address this task, we exploit not only information extracted from the comments but also extra-textual information such as user demographics and inter-comment constraints. In an evaluation involving nearly one million comments collected from the popular SodaHead social polling website, we show that a vote prediction system that exploits only textual information can be improved significantly when extended with extra-textual information.
