Competition-based user expertise score estimation

In this paper, we consider the problem of estimating the relative expertise scores of users in community question answering (CQA) services. Previous approaches typically use only the explicit question-answering relationship between askers and answerers and apply link analysis to address this problem. They ignore the implicit pairwise comparison between two users that is implied by best answer selection. Given a question-answering thread, it is likely that the expertise score of the best answerer is higher than that of the asker and of every non-best answerer. The goal of this paper is to exploit such pairwise comparisons, inferred from best answer selections, to estimate the relative expertise scores of users. Formally, we treat each pairwise comparison between two users as a two-player competition with one winner and one loser. Two competition models are proposed to estimate user expertise from pairwise comparisons. Experiments on the NTCIR-8 CQA task data, which contains 3 million questions, with evaluation metrics based on answer quality prediction, show that the pairwise comparison based competition model significantly outperforms link analysis based approaches (PageRank and HITS) and pointwise approaches (number of best answers and best answer ratio) for estimating the expertise of active users. Furthermore, the pairwise comparison based competition models are shown to have better discriminative power than the other methods. We also find that answer quality (best answer selection) is an important factor in estimating user expertise.
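The reduction from best answer selections to two-player competitions can be illustrated with a minimal sketch. The abstract does not name the two competition models, so the Elo-style update below is only an assumed stand-in for them; the function names, the user names, and the parameters `k` and `initial` are all hypothetical.

```python
from collections import defaultdict


def thread_to_competitions(asker, best_answerer, other_answerers):
    """Return the (winner, loser) pairs implied by one best answer selection.

    The best answerer is assumed to beat the asker and every non-best answerer.
    """
    losers = [asker, *other_answerers]
    return [(best_answerer, loser) for loser in losers if loser != best_answerer]


def elo_scores(competitions, k=32.0, initial=1500.0):
    """Estimate scores with a standard Elo update (illustrative model choice)."""
    scores = defaultdict(lambda: initial)
    for winner, loser in competitions:
        # Probability that the winner was expected to win, given current scores.
        expected_win = 1.0 / (1.0 + 10 ** ((scores[loser] - scores[winner]) / 400.0))
        # The winner gains what the loser gives up, so the total score is conserved.
        delta = k * (1.0 - expected_win)
        scores[winner] += delta
        scores[loser] -= delta
    return dict(scores)


# Hypothetical threads: (asker, best answerer, non-best answerers).
threads = [
    ("alice", "bob", ["carol"]),
    ("carol", "bob", ["dave"]),
    ("dave", "carol", ["alice"]),
]
competitions = [
    pair for thread in threads for pair in thread_to_competitions(*thread)
]
scores = elo_scores(competitions)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy data `bob` wins every competition he takes part in, so he ends up ranked first. Unlike the pointwise baselines, the size of each update depends on the current score of the opponent, which is the property a competition model is meant to capture.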
