Evaluating the quality of educational answers in community question-answering

Community Question-Answering (CQA), where questions and answers are generated by peers, has become a popular method of information seeking in online environments. While the content repositories created through CQA sites have been widely used to support general-purpose tasks, using them as online digital libraries that support educational needs is an emerging practice. Horizontal CQA services, such as Yahoo! Answers, and vertical CQA services, such as Brainly, aim to help students improve their learning process by answering their educational questions. In these services, receiving high-quality answers to a question is critical not only for user satisfaction but also for supporting learning. However, questions are not necessarily answered by experts, and askers may lack the knowledge and skill to evaluate the quality of the answers they receive. This becomes problematic when students build their own knowledge base on inaccurate information or knowledge acquired from online sources. Using moderators could alleviate this problem, but a moderator's evaluation of answer quality may be inconsistent because it rests on subjective judgment, and employing human assessors does not scale to the large amount of content available on a CQA site. To address these issues, we propose a framework for automatically assessing the quality of answers. It integrates four groups of features (personal, community-based, textual, and contextual) to build a classification model and determine what constitutes answer quality. To test this evaluation framework, we collected more than 10 million educational answers posted by more than 3 million users on Brainly's United States and Poland sites. Experiments on these datasets show that a model using Random Forest (RF) achieves more than 83% accuracy in identifying high-quality answers. The findings also indicate that personal and community-based features carry more predictive power in assessing answer quality. Our approach likewise achieves high values on other key metrics, such as F1-score and Area Under the ROC Curve (AUC). The work reported here can be useful in many other contexts where automatic quality assessment in a digital repository of textual information is paramount.
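To make the framework concrete, the following is a minimal sketch, not the authors' implementation, of how such a classifier might be built and evaluated with scikit-learn. The feature names, the synthetic data, the train/test split, and all hyperparameters are illustrative assumptions; only the overall pipeline (a Random Forest trained on the four groups of answer features and scored with accuracy, F1-score, and ROC AUC) follows the abstract.

```python
# Sketch of an answer-quality classifier in the spirit of the paper:
# a Random Forest over four hypothetical feature groups, evaluated with
# the three metrics the abstract reports (accuracy, F1, ROC AUC).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Hypothetical features mirroring the paper's four groups; the real
# feature set is described in the paper, not reproduced here.
PERSONAL = ["answerer_points", "answerer_rank"]
COMMUNITY = ["thanks_count", "friend_count"]
TEXTUAL = ["answer_length", "readability_score"]
CONTEXTUAL = ["subject_id", "time_to_answer"]
FEATURES = PERSONAL + COMMUNITY + TEXTUAL + CONTEXTUAL

def train_and_evaluate(X, y):
    """Train a Random Forest on answer features and report accuracy,
    F1-score, and ROC AUC on a held-out test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)
    clf = RandomForestClassifier(n_estimators=200, random_state=42)
    clf.fit(X_train, y_train)
    pred = clf.predict(X_test)
    prob = clf.predict_proba(X_test)[:, 1]  # P(high-quality answer)
    print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
    print(f"F1-score: {f1_score(y_test, pred):.3f}")
    print(f"ROC AUC:  {roc_auc_score(y_test, prob):.3f}")
    # Feature importances hint at which groups carry the most predictive
    # power (the paper finds personal and community-based features do).
    for name, imp in sorted(zip(FEATURES, clf.feature_importances_),
                            key=lambda t: -t[1]):
        print(f"{name:>20s}: {imp:.3f}")
    return clf

if __name__ == "__main__":
    # Synthetic stand-in data so the sketch runs end to end; real inputs
    # would be features extracted from CQA answers with quality labels.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, len(FEATURES)))
    y = (X[:, 0] + X[:, 2] + rng.normal(size=1000) > 0).astype(int)
    train_and_evaluate(X, y)
```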
