This paper describes the two QA systems we developed to participate in the TREC LiveQA 2016 shared task. The first run represents an improvement of our fully automatic real-time QA system from LiveQA 2015, Emory-QA. The second run, Emory-CRQA, which stands for Crowd-powered Real-time Question Answering, incorporates human feedback in real time to improve answer candidate generation and ranking. The base Emory-QA system uses the title and the body of a question to query Yahoo! Answers, Answers.com, WikiHow, and general web search, and retrieves a set of candidate answers along with their topics and contexts. This information is used to represent each candidate by a set of features, rank the candidates with a trained LambdaMART model, and return the top-ranked candidate as the answer to the question. The second run, Emory-CRQA, integrates a crowdsourcing module, which provides the system with additional answer candidates and quality ratings obtained in near real time (under one minute) from a crowd of workers. When Emory-CRQA receives a question, it is forwarded to the crowd, who can start working on an answer in parallel with the automatic pipeline. When the automatic pipeline has finished generating and ranking candidates, a subset of them is immediately sent to the same workers who have been answering the question. The workers then rate the quality of all human- or system-generated candidate answers. The resulting ratings, as well as the original system scores, are used as features for the final re-ranking module, which returns the highest-scoring answer. The official run results indicate promising improvements for both runs compared to the best-performing system from LiveQA 2015. Additionally, they demonstrate the effectiveness of the introduced crowdsourcing module, which allowed us to achieve an improvement of ∼20% in average answer score over the fully automatic Emory-QA system.
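To make the Emory-CRQA re-ranking step concrete, the sketch below illustrates how automatic system scores and crowd quality ratings could be combined into a final ranking. This is not the authors' implementation: the class names, the 1–5 rating scale, and the weighted-sum combination (standing in for the trained re-ranking model) are all simplifying assumptions made for illustration only.

```python
# Illustrative sketch (not the authors' code) of combining automatic ranker
# scores with crowd quality ratings for final answer re-ranking.
# All names and the linear blend below are hypothetical simplifications;
# the actual system uses these signals as features of a trained re-ranker.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    text: str                       # answer candidate text
    system_score: float             # score from the automatic ranker (e.g. LambdaMART), assumed in [0, 1]
    crowd_ratings: List[float] = field(default_factory=list)  # worker ratings, assumed on a 1-5 scale

    def crowd_score(self) -> float:
        # Average worker rating normalized to [0, 1]; 0.0 if no ratings arrived in time.
        if not self.crowd_ratings:
            return 0.0
        return (sum(self.crowd_ratings) / len(self.crowd_ratings)) / 5.0


def rerank(candidates: List[Candidate], crowd_weight: float = 0.5) -> List[Candidate]:
    """Sort candidates by a blend of the system score and the crowd score.

    The real system feeds both signals into a trained re-ranking model;
    a weighted sum is used here only to keep the sketch self-contained.
    """
    def blended(c: Candidate) -> float:
        return (1.0 - crowd_weight) * c.system_score + crowd_weight * c.crowd_score()

    return sorted(candidates, key=blended, reverse=True)


if __name__ == "__main__":
    candidates = [
        Candidate("Answer retrieved from Yahoo! Answers", system_score=0.71, crowd_ratings=[3.0, 4.0]),
        Candidate("Answer written by a crowd worker", system_score=0.40, crowd_ratings=[5.0, 4.0]),
        Candidate("Answer extracted from web search", system_score=0.85, crowd_ratings=[2.0]),
    ]
    best = rerank(candidates)[0]
    print(best.text)
```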