Vote calibration in community question-answering systems

User votes are important signals in community question-answering (CQA) systems. Many features of typical CQA systems, e.g., the best answer to a question or the status of a user, depend on ratings or votes cast by the community. In a popular CQA site, Yahoo! Answers, users vote for the best answers to their questions and can also thumb each individual answer up or down. Prior work has shown that these votes are useful predictors of content quality and user expertise, where each vote is usually assumed to carry the same weight as every other. In this paper, we analyze a set of possible factors that indicate bias in user voting behavior -- these factors encompass various gaming behaviors as well as other eccentricities, e.g., votes cast to show appreciation of answerers. These observations suggest that votes need to be calibrated before being used to identify good answers or experts. To address this problem, we propose a general machine learning framework to calibrate such votes. Through extensive experiments on an editorially judged CQA dataset, we show that our supervised, content-agnostic vote calibration method can significantly improve the performance of both answer ranking and expert ranking.
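To make the idea of calibration concrete, the sketch below shows one way such a framework could operate: a logistic model maps each vote's behavioral features to a weight in (0, 1), and answers are ranked by the sum of their calibrated vote weights instead of their raw vote counts. The feature names and coefficients here are illustrative assumptions, not the paper's actual feature set or learned parameters.

```python
import math

# Hypothetical per-vote features (illustrative, not from the paper):
# [voter_reputation, is_frequent_co_voter_with_answerer, fraction_of_recent_votes_to_same_user]
COEFS, BIAS = [1.2, -2.0, -0.5], 0.0  # stand-ins for parameters a supervised learner would fit

def vote_weight(features, coefs=COEFS, bias=BIAS):
    """Logistic model mapping a vote's features to a calibrated weight in (0, 1)."""
    z = bias + sum(c * f for c, f in zip(coefs, features))
    return 1.0 / (1.0 + math.exp(-z))

def rank_answers(votes_by_answer):
    """Rank answer ids by the sum of calibrated weights of their positive votes."""
    scored = [(sum(vote_weight(f) for f in votes), aid)
              for aid, votes in votes_by_answer.items()]
    return [aid for score, aid in sorted(scored, reverse=True)]

votes_by_answer = {
    "a1": [[0.9, 0, 0.1], [0.8, 0, 0.0]],                  # two votes from reputable voters
    "a2": [[0.1, 1, 0.9], [0.2, 1, 0.8], [0.1, 1, 0.7]],   # three likely appreciation/gaming votes
}
print(rank_answers(votes_by_answer))
```

Under these illustrative weights, "a1" outranks "a2" despite receiving fewer raw votes, because the model discounts votes whose features suggest bias.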
