Paper information - Users Are Known by the Company They Keep: Topic Models for Viewpoint Discovery in Social Networks


Social media platforms such as weblogs and social networking sites provide Internet users with an unprecedented means to express their opinions and debate on a wide range of issues. Concurrently with their growing importance in public communication, social media platforms may foster echo chambers and filter bubbles: homophily and content personalization lead users to be increasingly exposed to conforming opinions. There is therefore a need for unbiased systems able to identify and provide access to varied viewpoints. To address this task, we propose in this paper a novel unsupervised topic model, the Social Network Viewpoint Discovery Model (SNVDM). Given a specific issue (e.g., U.S. policy) as well as the text and social interactions from the users discussing this issue on a social networking site, SNVDM jointly identifies the issue's topics, the users' viewpoints, and the discourse pertaining to the different topics and viewpoints. In order to overcome the potential sparsity of the social network (i.e., some users interact with only a few other users), we propose an extension to SNVDM based on the Generalized Pólya Urn sampling scheme (SNVDM-GPU) to leverage "acquaintances of acquaintances" relationships. We benchmark the different proposed models against three baselines, namely TAM, SN-LDA, and VODUM, on a viewpoint clustering task using two real-world datasets. We thereby provide evidence that our model SNVDM and its extension SNVDM-GPU significantly outperform state-of-the-art baselines, and we show that utilizing social interactions greatly improves viewpoint clustering performance.
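The abstract describes the Generalized Pólya Urn extension only at a high level. As a rough illustration of the general idea (not the paper's actual sampler), the sketch below shows how a Generalized Pólya Urn style count update could spread part of an observed interaction to the acquaintances of the user involved. The function name `gpu_increment`, the `promotion` weight, and the toy network are all assumptions introduced here for illustration.

```python
from collections import defaultdict


def gpu_increment(counts, viewpoint, recipient, neighbors, promotion=0.3):
    """Generalized Polya urn style update (illustrative assumption).

    A plain Polya urn would only add one unit to the observed
    (viewpoint, recipient) pair. The generalized scheme also adds a
    smaller, hypothetical `promotion` weight to each acquaintance of the
    recipient, so sparse users borrow evidence from their neighborhood.
    """
    counts[viewpoint][recipient] += 1.0
    for other in neighbors.get(recipient, ()):
        if other != recipient:
            counts[viewpoint][other] += promotion


# Toy interaction network (hypothetical data).
neighbors = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob"},
}

# counts[viewpoint][user] holds fractional interaction counts.
counts = defaultdict(lambda: defaultdict(float))

# An interaction directed at "bob" is assigned to viewpoint 0.
gpu_increment(counts, 0, "bob", neighbors, promotion=0.3)
```

In this toy run, "bob" receives a full count under viewpoint 0, while "alice" and "carol" each receive the smaller promotion weight. A real sampler would also need to subtract the same amounts when an assignment is resampled.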
