Dynamics of Content Quality in Collaborative Knowledge Production

We explore the dynamics of user performance in collaborative knowledge production by studying the quality of answers to questions posted on Stack Exchange. We propose four indicators of answer quality: an answer's length, the number of code lines and hyperlinks to external web content it contains, and whether the asker accepts it as the most helpful answer to the question. Analyzing millions of answers posted between 2008 and 2014, we uncover regular short-term and long-term changes in quality. In the short term, quality deteriorates over the course of a single session: each successive answer is shorter, contains fewer code lines and links, and is less likely to be accepted. In contrast, performance improves over the long term: more experienced users produce higher-quality answers. These trends are not a consequence of data heterogeneity but have a behavioral origin. Our findings highlight the complex interplay between short-term performance deterioration, potentially due to mental fatigue or attention depletion, and long-term performance improvement due to learning and skill acquisition, and its impact on the quality of user-generated content.
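The four quality indicators can be computed directly from an answer's body and acceptance flag. A minimal sketch is shown below, assuming answer bodies are stored as HTML with code inside `<pre><code>` blocks (as in the Stack Exchange data dumps); the function name and return structure are illustrative, not part of the paper.

```python
import re

def quality_indicators(answer_body_html, is_accepted):
    """Compute the four answer-quality indicators: length, code lines,
    external links, and acceptance. Parsing via regex is a simplification;
    a real pipeline would use an HTML parser."""
    # Indicator 1: answer length in characters of the raw body.
    length = len(answer_body_html)

    # Indicator 2: number of lines inside <pre><code>...</code></pre> blocks.
    code_blocks = re.findall(r"<pre><code>(.*?)</code></pre>",
                             answer_body_html, re.DOTALL)
    code_lines = sum(block.count("\n") + 1
                     for block in code_blocks if block.strip())

    # Indicator 3: hyperlinks to external web content (http/https anchors).
    links = len(re.findall(r'<a\s+href="https?://', answer_body_html))

    # Indicator 4: whether the asker accepted the answer.
    return {"length": length, "code_lines": code_lines,
            "links": links, "accepted": bool(is_accepted)}
```

For example, an answer containing a two-line code snippet and one link to external documentation would yield `code_lines == 2` and `links == 1`.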
