Wikipedia vandalism detection

Wikipedia is an online encyclopedia that anyone can access and edit. It has become one of the most important sources of knowledge online, and many third-party projects rely on it for a wide range of purposes. Its open model, however, allows pranksters, lobbyists and spammers to attack the integrity of the encyclopedia, which endangers it as a public resource. The community calls such edits vandalism. Many methods have been developed within the Wikipedia and scientific communities to tackle this problem. We have participated in this effort and developed one of the leading approaches. Our research aims to build a fully working anti-vandalism system and deploy it in the real world.
