A user-centric model of voting intention from Social Media

Social Media contain a multitude of user opinions which can be used to predict real-world phenomena in many domains including politics, finance and health. Most existing methods treat these problems as linear regression, learning to relate word frequencies and other simple features to a known response variable (e.g., voting intention polls or financial indicators). These techniques require very careful filtering of the input texts, as most Social Media posts are irrelevant to the task. In this paper, we present a novel approach which performs high-quality filtering automatically by modelling not just words but also users, framed as a bilinear model with a sparse regulariser. We also consider the problem of modelling groups of related output variables, using a structured multi-task regularisation method. Our experiments on voting intention prediction demonstrate strong performance over large-scale input from Twitter on two distinct case studies, outperforming competitive baselines.
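To make the bilinear idea concrete, the following is a minimal sketch rather than the paper's actual implementation. It assumes each time step t is summarised by a user-by-word frequency matrix X_t, and that the prediction is u^T X_t w + b, with one weight per user (u) and one per word (w). The L1 penalty, the alternating proximal-gradient (ISTA) updates, and all function and parameter names (fit_bilinear, lam_u, lam_w, and so on) are assumptions chosen for illustration; the abstract only states that the model is bilinear with a sparse regulariser. The multi-task extension is not covered here.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of the L1 norm (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def predict(X, u, w, b):
    """Bilinear prediction u^T X_t w + b for every time step t.

    X has shape (n_steps, n_users, n_words).
    """
    return np.einsum("p,tpm,m->t", u, X, w) + b


def ista(A, r, x, lam, n_iter):
    """Approximately minimise (1/2n)||A x - r||^2 + lam * ||x||_1."""
    n = A.shape[0]
    lipschitz = np.linalg.norm(A, 2) ** 2 / n  # spectral norm squared / n
    if lipschitz == 0.0:
        return x  # design matrix is all zeros; nothing to update
    for _ in range(n_iter):
        grad = A.T @ (A @ x - r) / n
        x = soft_threshold(x - grad / lipschitz, lam / lipschitz)
    return x


def fit_bilinear(X, y, lam_u=0.01, lam_w=0.01, n_outer=20, n_inner=50):
    """Alternate between sparse word weights and sparse user weights.

    Hypothetical training loop: with u fixed the problem is an ordinary
    L1-penalised regression in w, and vice versa.
    """
    n_steps, n_users, n_words = X.shape
    u = np.full(n_users, 1.0 / n_users)  # start by trusting all users equally
    w = np.zeros(n_words)
    b = float(np.mean(y))
    for _ in range(n_outer):
        # Fix u: collapse the user axis, then update the word weights.
        A = np.einsum("p,tpm->tm", u, X)
        w = ista(A, y - b, w, lam_w, n_inner)
        # Fix w: collapse the word axis, then update the user weights.
        B = np.einsum("tpm,m->tp", X, w)
        u = ista(B, y - b, u, lam_u, n_inner)
        # The unpenalised intercept has a closed-form update.
        b = float(np.mean(y - predict(X, u, w, 0.0)))
    return u, w, b
```

Under these assumptions, a user whose weight in u shrinks to zero is effectively filtered out, which is one way to read the abstract's claim that filtering happens automatically. Calling `fit_bilinear(X, y)` on an array of shape (n_steps, n_users, n_words) and a vector of poll values returns the two weight vectors and the intercept.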
