Improving Spoken Question Answering Using Contextualized Word Representation

While question answering (QA) systems have achieved great breakthroughs on reading comprehension (RC) tasks, spoken question answering (SQA) remains a much less investigated area. Previous work shows that existing SQA systems are limited by the catastrophic impact of automatic speech recognition (ASR) errors [14] and by the lack of large-scale real SQA datasets [3]. In this paper, we propose using contextualized word representations to mitigate the effects of ASR errors, and pretraining on existing textual QA datasets to mitigate the data-scarcity issue. New state-of-the-art results are achieved with contextualized word representations on both artificially synthesized and real SQA benchmark datasets: a 21.5 EM / 18.96 F1 improvement over the subword-unit-based baseline on Spoken-SQuAD [14], and a 13.11 EM / 10.99 F1 improvement on ODSQA [3]. By further fine-tuning pretrained models on existing large-scale textual QA data, we obtain a 38.12 EM / 34.1 F1 improvement over the baseline fine-tuned only on the small-scale real SQA data.
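The EM and F1 figures above follow the standard SQuAD-style evaluation convention: exact match after answer normalization, and token-level overlap F1 between the predicted and gold answer spans. A minimal sketch of those metrics (function names are illustrative, not from the paper):

```python
import collections
import re
import string


def normalize_answer(s):
    """Normalize a SQuAD-style answer: lowercase, drop punctuation
    and English articles, and collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())


def exact_match(prediction, ground_truth):
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(ground_truth))


def f1_score(prediction, ground_truth):
    """Token-overlap F1 between normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(ground_truth).split()
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level EM/F1 are then averages of these per-question scores (taking the maximum over multiple gold answers, where available), which is how the reported improvements are measured.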

[1] Lin-Shan Lee, et al. Spoken question answering using tree-structured conditional random fields and two-layer random walk, 2014, INTERSPEECH.

[2] Sanja Fidler, et al. MovieQA: Understanding Stories in Movies through Question-Answering, 2016, CVPR.

[3] Shang-Ming Wang, et al. ODSQA: Open-Domain Spoken Question Answering Dataset, 2018, SLT.

[4] Jian Zhang, et al. SQuAD: 100,000+ Questions for Machine Comprehension of Text, 2016, EMNLP.

[5] Philip Bachman, et al. NewsQA: A Machine Comprehension Dataset, 2016, Rep4NLP@ACL.

[6] Ebru Arisoy, et al. Question Answering for Spoken Lecture Processing, 2019, ICASSP.

[7] Hung-yi Lee, et al. Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation, 2019, ICASSP.

[8] Yoshua Bengio, et al. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering, 2018, EMNLP.

[9] Kyunghyun Cho, et al. SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine, 2017, arXiv.

[10] Ming-Wei Chang, et al. Natural Questions: A Benchmark for Question Answering Research, 2019, TACL.

[11] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[12] Yuting Lai, et al. DRCD: A Chinese Machine Reading Comprehension Dataset, 2018, arXiv.

[13] Lin-Shan Lee, et al. Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine, 2016, INTERSPEECH.

[14] Hung-yi Lee, et al. Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension, 2018, INTERSPEECH.

[15] Ali Farhadi, et al. Bidirectional Attention Flow for Machine Comprehension, 2016, ICLR.

[16] Wei Fang, et al. Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[17] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.

[18] Eunsol Choi, et al. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, 2017, ACL.