Bringing Representativeness into Social Media Monitoring and Analysis

The opinions, expectations and behavior of citizens are increasingly reflected online; mining the Internet for such data can therefore enhance decision-making in public policy, communications, marketing, finance and other fields. However, online opinion mining cannot yet approach the representativeness of classic opinion surveys, because little is known about the socio-demographic characteristics of those voicing opinions on the Internet. This paper proposes to calibrate online opinions aggregated from multiple heterogeneous data sources against traditional surveys enriched with socio-demographic information, yielding insights into which opinions are expressed on the Internet by specific segments of society. The goal of this research is to provide professionals in citizen- and consumer-centered domains with more concise, near real-time intelligence on online opinions. To become effective, the methodologies presented in this paper must be integrated into a coherent decision support system.
