Quantifying Human-Perceived Answer Utility in Non-factoid Question Answering
W. Bruce Croft | Mark Sanderson | Falk Scholer | B. Barla Cambazoglu | Valeria Bolotova-Baranova | Leila Tavakoli
[1] Jin Zhang,et al. Multidimensional relevance modeling via psychometrics and crowdsourcing , 2014, SIGIR.
[2] Kilian Q. Weinberger,et al. BERTScore: Evaluating Text Generation with BERT , 2019, ICLR.
[3] Tefko Saracevic,et al. The Notion of Relevance in Information Science: Everybody knows what relevance is. But, what is it really? , 2016, The Notion of Relevance in Information Science.
[4] Pnina Fichman,et al. A comparative assessment of answer quality on four question answering sites , 2011, J. Inf. Sci.
[5] W. Bruce Croft,et al. Performance Prediction for Non-Factoid Question Answering , 2019, ICTIR.
[6] Yunjie Calvin Xu,et al. Relevance judgment: What do information users consider beyond topicality? , 2006, J. Assoc. Inf. Sci. Technol.
[7] Philipp Koehn,et al. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , 2016 .
[8] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[9] Wenhan Xiong,et al. TWEETQA: A Social Media Focused Question Answering Dataset , 2019, ACL.
[10] Iryna Gurevych,et al. A Multi-Dimensional Model for Assessing the Quality of Answers in Social Q&A Sites , 2009, ICIQ.
[11] Chirag Shah,et al. Building a parsimonious model for identifying best answers using interaction history in community Q&A , 2015, ASIST.
[12] Sanghee Oh,et al. Evaluating answer quality across knowledge domains: Using textual and non‐textual features in social Q&A , 2015, ASIST.
[13] Andrew B. Whinston,et al. Is Best Answer Really the Best Answer? The Politeness Bias , 2019, MIS Q.
[14] W. Bruce Croft,et al. Answer Interaction in Non-factoid Question Answering Systems , 2019, CHIIR.
[15] Jeffrey Pomerantz,et al. Evaluating and predicting answer quality in community QA , 2010, SIGIR.
[16] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments , 2005, IEEvaluation@ACL.
[17] Pia Borlund,et al. The concept of relevance in IR , 2003, J. Assoc. Inf. Sci. Technol.
[18] Feng Xu,et al. Detecting high-quality posts in community question answering sites , 2015, Inf. Sci.
[19] Jimmy J. Lin,et al. What Makes a Good Answer? The Role of Context in Question Answering , 2003, INTERACT.
[20] L. Given. Proceedings of the 78th ASIS&T Annual Meeting: Information Science with Impact: Research in and for the Community , 2015 .
[21] Berkant Barla Cambazoglu,et al. Linguistic Benchmarks of Online News Article Quality , 2016, ACL.
[22] Chirag Shah,et al. Evaluating the quality of educational answers in community question-answering , 2016, 2016 IEEE/ACM Joint Conference on Digital Libraries (JCDL).
[23] Percy Liang,et al. Know What You Don’t Know: Unanswerable Questions for SQuAD , 2018, ACL.
[24] Anita Sarma,et al. Perceptions of answer quality in an online technical question and answer forum , 2014, CHASE.
[25] K. Haerling,et al. Making Sense of Methods and Measurement: Spearman-Rho Ranked-Order Correlation Coefficient , 2014 .
[26] Jianfeng Gao,et al. A Human Generated MAchine Reading COmprehension Dataset , 2018 .
[27] Berkant Barla Cambazoglu,et al. A large-scale sentiment analysis for Yahoo! answers , 2012, WSDM '12.
[28] Chin-Yew Lin,et al. ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.