Predicting information seeker satisfaction in community question answering

Question answering communities such as Naver and Yahoo! Answers have emerged as popular, and often effective, means of information seeking on the web. By posting questions for other participants to answer, information seekers can obtain specific answers to their questions. Users of popular portals such as Yahoo! Answers have already submitted millions of questions and received hundreds of millions of answers from other participants. However, it may also take hours -- and sometimes days -- until a satisfactory answer is posted. In this paper we introduce the problem of predicting information seeker satisfaction in collaborative question answering communities, where we attempt to predict whether a question author will be satisfied with the answers submitted by the community participants. We present a general prediction model, and develop a variety of content, structure, and community-focused features for this task. Our experimental results, obtained from a large-scale evaluation over thousands of real questions and user ratings, demonstrate the feasibility of modeling and predicting asker satisfaction. We complement our results with a thorough investigation of the interactions and information seeking patterns in question answering communities that correlate with information seeker satisfaction. Our models and predictions could be useful for a variety of applications such as user intent inference, answer ranking, interface design, and query suggestion and routing.
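The task described above can be framed as binary classification over per-question features. The sketch below is a minimal illustration of that framing, not the paper's method: the three features (answer count, minutes until the first answer, the top answerer's past acceptance rate) and the synthetic labels are hypothetical assumptions standing in for the content, structure, and community features the abstract mentions.

```python
# Illustrative sketch of asker-satisfaction prediction as binary
# classification. Features and labels are synthetic assumptions, not the
# paper's actual feature set or data.
import random
from sklearn.linear_model import LogisticRegression

random.seed(0)

def make_example():
    # Hypothetical per-question features.
    n_answers = random.randint(0, 10)          # number of answers received
    mins_to_first = random.uniform(1, 600)     # minutes until first answer
    answerer_rate = random.random()            # top answerer's past acceptance rate
    # Synthetic label: assume askers tend to be satisfied when answers
    # arrive quickly, are numerous, and come from strong answerers.
    score = 0.3 * n_answers - 0.005 * mins_to_first + 2.0 * answerer_rate
    satisfied = 1 if score > 1.0 else 0
    return [n_answers, mins_to_first, answerer_rate], satisfied

data = [make_example() for _ in range(1000)]
X = [features for features, _ in data]
y = [label for _, label in data]

# Train on the first 800 questions, evaluate on the held-out 200.
clf = LogisticRegression(max_iter=1000).fit(X[:800], y[:800])
accuracy = clf.score(X[800:], y[800:])
```

In practice the learner and features would be richer (the paper evaluates several classifiers and feature families), but the train/evaluate structure over labeled question outcomes is the same.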
