Educational Question Answering based on Social Media Content

We analyze the requirements for an educational Question Answering (QA) system operating on social media content. As a result, we identify a set of advanced natural language processing (NLP) technologies to address the challenges in educational QA. We conducted an inter-annotator agreement study on subjective question classification in the Yahoo! Answers social Q&A site and propose a simple but effective approach to automatically identify subjective questions. We also developed a two-stage QA architecture for answering learners' questions. In the first stage, we aim to reuse human answers to already answered questions by employing question paraphrase identification [1]. In the second stage, we apply information retrieval techniques to perform answer retrieval from social media content. We show that elaborate techniques for question preprocessing are crucial.
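The two-stage architecture described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes token-overlap (Jaccard) similarity as a stand-in for question paraphrase identification, a simple TF-IDF-weighted term overlap as a stand-in for the information retrieval component, and that retrieval runs only when no paraphrase is found. All function names, the stopword list, and the threshold are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "is", "are", "do", "does", "i",
             "how", "what", "to", "of", "in"}


def preprocess(text):
    """Lowercase, tokenize, and drop stopwords (a stand-in for richer preprocessing)."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def jaccard(a, b):
    """Jaccard overlap between two token lists."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def find_paraphrase(question, archive, threshold=0.5):
    """Stage 1: reuse the answer of the most similar already-answered question."""
    query = preprocess(question)
    best_score, best_answer = 0.0, None
    for stored_question, answer in archive:
        score = jaccard(query, preprocess(stored_question))
        if score > best_score:
            best_score, best_answer = score, answer
    # Assumed cut-off: below the threshold, no stored question counts as a paraphrase.
    return best_answer if best_score >= threshold else None


def retrieve_answer(question, documents):
    """Stage 2: return the document with the highest TF-IDF-weighted term overlap."""
    docs = [preprocess(d) for d in documents]
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    query = preprocess(question)
    best_score, best_doc = 0.0, None
    for raw, tokens in zip(documents, docs):
        tf = Counter(tokens)
        score = sum(tf[t] * math.log(1 + n / df[t]) for t in query if t in tf)
        if score > best_score:
            best_score, best_doc = score, raw
    return best_doc  # None if no document shares a term with the question


def answer_question(question, archive, documents):
    """Try paraphrase reuse first; fall back to retrieval (assumed control flow)."""
    reused = find_paraphrase(question, archive)
    if reused is not None:
        return "paraphrase", reused
    return "retrieval", retrieve_answer(question, documents)
```

In this sketch both stages share `preprocess`, so any change to question preprocessing affects paraphrase matching and retrieval alike.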

[1] Ellen M. Voorhees, et al. The TREC-8 Question Answering Track Report, 1999, TREC.

[2] Vibhu O. Mittal, et al. Bridging the lexical chasm: statistical approaches to answer-finding, 2000, SIGIR '00.

[3] Eduard H. Hovy, et al. Question Answering in Webclopedia, 2000, TREC.

[4] Jennifer Chu-Carroll, et al. Statistical answer-type identification in open-domain question answering, 2002.

[5] Otis Gospodnetic, et al. Lucene in Action, 2004.

[6] W. Bruce Croft, et al. Finding similar questions in large question and answer archives, 2005, CIKM '05.

[7] Valentin Jijkoun, et al. Retrieving answers from frequently asked questions pages on the web, 2005, CIKM '05.

[8] Claire Cardie, et al. Multi-Perspective Question Answering Using the OpQA Corpus, 2005, HLT.

[9] Jihie Kim, et al. An intelligent discussion-bot for answering student queries in threaded discussions, 2006, IUI '06.

[10] Max Mühlhäuser, et al. Analyzing and accessing Wikipedia as a lexical semantic resource, 2007.

[11] Max Mühlhäuser, et al. Automatically Assessing the Post Quality in Online Discussions on Software, 2007, ACL.

[12] Arthur C. Graesser, et al. Experiments on Generating Questions About Facts, 2009, CICLing.

[13] Yi Liu, et al. Statistical Machine Translation for Query Expansion in Answer Retrieval, 2007, ACL.

[14] Iryna Gurevych, et al. Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary, 2008, LREC.

[15] Delphine Bernhard, et al. Generating High Quality Questions from Low Quality Questions, 2008.

[16] Eugene Agichtein, et al. Exploring question subjectivity prediction in community QA, 2008, SIGIR '08.

[17] W. Bruce Croft, et al. Retrieval models for question and answer archives, 2008, SIGIR '08.

[18] Yong Yu, et al. Searching Questions by Identifying Question Topic and Question Focus, 2008, ACL.

[19] Iryna Gurevych, et al. Answering Learners’ Questions by Retrieving Question Paraphrases from Social Q&A Sites, 2008.

[20] Gilad Mishne, et al. Finding high-quality content in social media, 2008, WSDM '08.

[21] Iryna Gurevych, et al. Annotating Question Types in Social Q&A Sites, 2009.