Social networks of Wikipedia

Wikipedia, the free online encyclopedia anyone can edit, is a live social experiment: millions of individuals volunteer their knowledge and time to collectively create it. It is therefore interesting to understand how they do it. While most scholarly attention has focused on article pages, a less investigated share of activity happens on user talk pages, the Wikipedia pages where a message can be left for a specific user. These public conversations can be studied from a Social Network Analysis perspective in order to highlight the structure of the "talk" network. Before any such analysis, the network must be extracted from the pages; in this paper we focus on this preliminary extraction step and propose different algorithms for it. We then empirically compare the networks they generate on the Venetian Wikipedia against the real network of conversations, which we extracted manually by coding every message left on all user talk pages. The comparisons show that both the algorithms and the manual process contain inaccuracies that are intrinsic to the freedom and unpredictability of Wikipedia syntax and practices. Nevertheless, a precise description of the issues involved allows researchers to make informed decisions and to base empirical findings on reproducible evidence. Our goal is to lay the foundation for a solid computational sociology of wikis. For this reason we release the scripts encoding our algorithms as open source, together with some datasets extracted from Wikipedia conversations, so that other researchers can replicate and improve on our initial effort.
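
To make the extraction step concrete, the following is a minimal sketch of one possible signature-based heuristic. It is not the algorithm proposed in the paper. It assumes that each message on a user talk page sits on one line and ends with a signature linking to the author's user page, and that the namespace prefix is the English `User:` (a localized wiki such as the Venetian one would use a different prefix). The names `USER_LINK` and `talk_edges` are hypothetical.

```python
import re
from collections import Counter

# Assumption: signatures contain a wiki link to the author's user page,
# e.g. [[User:Bob]] or [[User:Bob|Bob]]. The "User:" prefix is the English
# one; localized wikis use a different namespace name.
USER_LINK = re.compile(r"\[\[\s*User:([^\]|#/]+)", re.IGNORECASE)


def talk_edges(pages):
    """Build weighted directed edges (author -> talk page owner).

    `pages` maps the owner of each user talk page to its raw wikitext.
    The last user link on a line is treated as the signature, since
    earlier links may only mention other users.
    """
    edges = Counter()
    for owner, wikitext in pages.items():
        for line in wikitext.splitlines():
            linked_users = USER_LINK.findall(line)
            if not linked_users:
                continue  # unsigned line: this heuristic cannot attribute it
            author = linked_users[-1].strip().replace("_", " ")
            if author != owner:  # skip owners replying on their own page
                edges[(author, owner)] += 1
    return edges


if __name__ == "__main__":
    pages = {
        "Alice": (
            "Welcome! [[User:Bob|Bob]] 10:00, 1 Jan 2010 (UTC)\n"
            ":Thanks. [[User:Alice|Alice]] 11:00, 1 Jan 2010 (UTC)\n"
            "Another note. [[User:Bob|Bob]] 12:00, 1 Jan 2010 (UTC)"
        ),
        "Bob": "See [[User:Carol]]'s comment. [[User:Alice|Alice]] 09:00, 2 Jan 2010 (UTC)",
    }
    for (author, owner), weight in sorted(talk_edges(pages).items()):
        print(f"{author} -> {owner}: {weight}")
```

Unsigned messages, customized signatures, and multi-line messages would all be misattributed or dropped by a heuristic of this kind, which illustrates the sort of inaccuracy the comparison with manual coding is meant to quantify.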
