Evaluating answer quality across knowledge domains: Using textual and non‐textual features in social Q&A

As an increasingly important source of information and knowledge, social questioning and answering (social Q&A) sites have attracted significant attention from industry and academia, both of which face the challenge of evaluating and predicting the quality of answers posted on such sites. However, few previous studies have examined answer quality with knowledge domains or topics as a potential factor. To fill this gap, this study developed a model comprising 24 textual and non-textual features of answers to evaluate and predict answer quality in social Q&A, and applied the model to identify and compare features useful for predicting high-quality answers across four knowledge domains: science, technology, art, and recreation. The findings indicate that review and user features are the most powerful indicators of high-quality answers regardless of knowledge domain, whereas the usefulness of textual features (length, structure, and writing style) varies across domains. In the future, these findings could be applied to automatic answer-quality assessment and quality control in social Q&A.
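As a rough illustration of what the abstract's textual features might look like in practice, the sketch below computes a few length, structure, and writing-style signals from an answer's text. This is a hypothetical minimal example, not the paper's actual feature set: the function name, the specific features, and the proxies chosen (e.g., paragraph breaks for structure) are assumptions for illustration only.

```python
import re

def extract_textual_features(answer: str) -> dict:
    """Compute simple textual features of an answer text.

    Hypothetical feature set loosely inspired by the length, structure,
    and writing-style categories mentioned in the abstract; not the
    study's actual 24-feature model.
    """
    words = answer.split()
    # Crude sentence segmentation on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", answer) if s.strip()]
    return {
        "word_count": len(words),                                   # length
        "sentence_count": len(sentences),                           # length
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        "paragraph_count": answer.count("\n\n") + 1,                # structure proxy
        "question_marks": answer.count("?"),                        # style proxy
    }
```

In a full pipeline, features like these would be combined with non-textual review and user features and fed to a classifier trained on labeled high-quality answers.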
