Automated inference of point of view from user interactions in collective intelligence venues

Empirical evaluation of trust and manipulation in large-scale collective intelligence processes is challenging. The datasets involved are too large for thorough manual study, and current automated options are limited. We introduce a statistical framework that classifies point of view based on user interactions. The framework works on Web-scale datasets and applies to a wide variety of collective intelligence processes. It enables principled study of issues such as manipulation, trustworthiness of information, and potential bias. We demonstrate the model's effectiveness in determining point of view on both synthetic data and a dataset of Wikipedia user interactions. We build a combined model of topics and points of view on the entire history of English Wikipedia, and show how it can be used to find potentially biased articles and to visualize user interactions at a high level.
