Mining Stack Exchange: Expertise Is Evident from Initial Contributions

Stack Exchange is a popular question-and-answer (Q&A) web community. Users post questions on a wide variety of topics, and other users provide answers, often within minutes. Participants are not compensated for their services, and anyone can freely benefit from their efforts; Stack Exchange is therefore a gift economy. Users do, however, gain reputation points when other users "upvote" their questions and answers. Stack Exchange thus functions as a learning community with a strong reputation-seeking element that creates a valuable public good, viz. the question-and-answer archive. The community's incentive structure suggests that the quality of the product (viz., the delivered answers) should steadily improve over time, and furthermore, that any individual who participates in the community for an extended period should also see the quality of their own answers improve. We investigate the validity of these widely held beliefs using data downloaded from Stack Exchange. Our analysis indicates that these intuitions are not supported by the data: overall answer scores in fact decrease, and a person's tenure with the community is unrelated to the quality of their answers. Most interestingly, we show that answering skill, i.e., earning high average answer scores (a measure distinct from reputation), is evident from a user's first contributions and persists throughout their tenure with the community. Conversely, users who provide low-rated answers are likely to have done so from the start.
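The distinction between reputation and answering skill is central here: reputation accumulates with the total votes a user receives and so rewards volume and tenure, whereas the paper's skill measure is the average score per answer. A minimal sketch of this distinction, using invented toy data and assuming the column names (OwnerUserId, Score) of the public Stack Exchange data-dump Posts table:

```python
import pandas as pd

# Toy stand-in for the answers in the Stack Exchange data dump's Posts
# table (in the real dump, rows with PostTypeId == 2 are answers).
answers = pd.DataFrame({
    "OwnerUserId": [1, 1, 2, 2, 2, 2, 2],
    "Score":       [4, 4, 2, 2, 2, 2, 2],
})

# Reputation grows roughly with the SUM of votes a user accumulates,
# so it rewards volume and tenure as much as per-answer quality.
total_score = answers.groupby("OwnerUserId")["Score"].sum()

# Answering skill, in the paper's sense, is the AVERAGE answer score:
# a per-contribution quality measure insensitive to answer volume.
mean_score = answers.groupby("OwnerUserId")["Score"].mean()

print(total_score)  # user 2 leads (10 vs. 8): more answers, more votes
print(mean_score)   # user 1 leads (4.0 vs. 2.0): higher quality per answer
```

In this toy example the prolific user 2 would accrue more reputation, while user 1 has the higher answering skill, which is why the two measures can tell different stories about expertise.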
