Automatic Reputation Assessment in Wikipedia

The online encyclopedia Wikipedia is written predominantly by anonymous or pseudonymous authors whose knowledge and motivations are unknown, so the quality of their contributions is uncertain. Automatic reputation systems, which have emerged as a new research area in recent years, offer one approach to this problem. Previous research has proposed various metrics for automatic reputation assessment. However, these metrics have been evaluated insufficiently and only in isolation, so their significance remains unclear. In this paper, we compare and assess seven metrics, some taken from the literature and some newly proposed. Additionally, we combine these metrics by means of a discriminant analysis to derive a significant reputation function. The analysis reveals that our newly proposed metric, editing efficiency, is particularly effective. We validate our reputation function through an analysis of Wikipedia user groups.
