Paper information

[Figure: pipeline overview. Labels: Massive Corpus (~10B sentences); Controversial Topic Queries; Retrieved Sentences; Ranking and Boundary Detection Models; High Precision Claims Set; Stance Detection Model; Opposing Claims; Opponent Speech; Claim Matching Model; Mentioned Claims; Rebuttal.]

Engaging in a live debate requires, among other things, the ability to effectively rebut the arguments made by your opponent. In particular, this requires identifying these arguments. Here, we suggest doing so by automatically mining claims from a corpus of news articles containing billions of sentences, and searching for them in a given speech. This raises the question of whether such claims indeed correspond to those made in spoken speeches. To this end, we collected a large dataset of 400 speeches in English discussing 200 controversial topics, mined claims for each topic, and asked annotators to identify the mined claims mentioned in each speech. Results show that in the vast majority of speeches debaters indeed make use of such claims. In addition, we present several baselines for the automatic detection of mined claims in speeches, forming the basis for future work. All collected data is freely available for research.
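To make the task of searching for mined claims in a speech concrete, here is a minimal sketch in Python. It is not one of the paper's baselines, which the abstract does not describe. It assumes a simple token-overlap (Jaccard) score between each mined claim and each sentence of a speech transcript. The function names, the example sentences, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in a piece of text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def mentioned_claims(speech_sentences, claims, threshold=0.5):
    """Return (claim, best_sentence, score) for each claim whose best-matching
    speech sentence reaches the threshold.

    Illustrative assumption: a claim counts as "mentioned" when its token
    overlap with some sentence of the speech is high enough.
    """
    results = []
    for claim in claims:
        claim_tokens = tokens(claim)
        best_score, best_sentence = 0.0, None
        for sentence in speech_sentences:
            score = jaccard(claim_tokens, tokens(sentence))
            if score > best_score:
                best_score, best_sentence = score, sentence
        if best_sentence is not None and best_score >= threshold:
            results.append((claim, best_sentence, best_score))
    return results


# Hypothetical data, for illustration only.
speech = [
    "Violent video games cause aggression in children",
    "We should invest in schools",
]
mined = ["video games cause aggression", "nuclear power is safe"]

for claim, sentence, score in mentioned_claims(speech, mined):
    print(f"{claim!r} matched {sentence!r} (score {score:.2f})")
```

Lexical overlap of this kind misses paraphrases and is sensitive to transcription errors, so a realistic matcher would likely need a semantic similarity model; the sketch only shows the shape of the input and output.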
