Peer-production system or collaborative ontology engineering effort: what is Wikidata?

Wikidata promises to reduce factual inconsistencies across all Wikipedia language versions. It will enable dynamic data reuse and complex fact queries within the world's largest knowledge database. Studies of the participation patterns emerging in Wikidata are only just beginning. What characterises most contributions to the system has not yet been investigated: are they shaped by Wikidata's inheritance from the Wikipedia peer-production system, or by the proximity of its tasks to those studied in collaborative ontology engineering? As a first step towards answering this question, we performed a cluster analysis of participants' content editing activities. This allowed us to relate our results to the typical roles found in peer-production and collaborative ontology engineering projects. Our results suggest that a majority of users make very specialised contributions. Only a minority, which is also the most active group, participates across the whole project. These users are chiefly responsible for developing the conceptual knowledge of Wikidata. We also show how existing algorithmic participation patterns align with these human patterns of participation. In summary, our results suggest that Wikidata currently supports peer-production activities rather than ontology engineering, a consequence of its present focus on data collection. We hope that our study informs future analyses and developments and, as a result, helps build better tools to support contributors in peer-production-based ontology engineering.
