Exploiting User Feedback to Learn to Rank Answers in Q&A Forums: A Case Study with Stack Overflow

Collaborative web sites, such as collaborative encyclopedias, blogs, and forums, are characterized by loose edit control, which allows anyone to freely edit their content. As a consequence, the quality of this content raises much concern. To deal with this, many sites adopt manual quality control mechanisms. However, given the size and change rate of these sites, manual assessment strategies do not scale, and content that is new or unpopular is seldom reviewed. This has a negative impact on the services that depend on content quality, such as ranking and recommendation. To tackle this problem, we propose a learning to rank (L2R) approach for ranking answers in Q&A forums. In particular, we adopt an approach based on Random Forests and represent query and answer pairs using eight different groups of features. Some of these features are used in the Q&A domain for the first time. Our L2R method was trained to learn the answer rating, based on the feedback users give to answers in Q&A forums. Using the proposed method, we were able to outperform a state-of-the-art baseline with gains of up to 21% in NDCG, a metric used to evaluate rankings. We also conducted a comprehensive study of the features, which showed that review and user features are the most important in the Q&A domain, although text features are useful for assessing the quality of new answers, and that the best set of new features we proposed yielded the best quality rankings.
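
As an illustration of the evaluation metric, the following is a minimal sketch of NDCG for a single question. The abstract does not state the exact gain and discount functions used, so the linear gain and log2 discount below are assumptions, as are the function names and the use of non-negative user ratings as relevance gains.

```python
import math


def dcg(gains, k=None):
    """Discounted cumulative gain of a ranked list of relevance gains.

    Assumption: linear gain with a log2(rank + 1) discount, ranks starting at 1.
    """
    if k is not None:
        gains = gains[:k]
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains_in_predicted_order, k=None):
    """DCG of the predicted order divided by the DCG of the ideal order.

    Assumption: gains are non-negative (e.g. clipped answer ratings).
    """
    ideal_order = sorted(gains_in_predicted_order, reverse=True)
    ideal = dcg(ideal_order, k)
    if ideal == 0:
        # No answer has positive gain, so every ordering is equally good.
        return 0.0
    return dcg(gains_in_predicted_order, k) / ideal


# Hypothetical ratings of three answers, listed in the order a ranker returned them.
perfect = ndcg([3, 2, 1])   # ranker placed the best-rated answer first
reversed_ = ndcg([1, 2, 3])  # ranker placed the best-rated answer last
```

A ranking that matches the order of the user ratings scores 1, and any other ordering scores lower, which is what makes the metric suitable for comparing rankers on the same set of answers.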
