Utterance Pair Scoring for Noisy Dialogue Data Filtering

Filtering noisy training data is one of the key approaches to improving the quality of neural network-based language generation. The dialogue research community in particular suffers from a shortage of sufficiently large, low-noise datasets. In this work, we propose a scoring function specifically designed to identify low-quality utterance-response pairs so that noisy training data can be filtered out. Our scoring function, grounded in previous findings in dialogue response generation and linguistics, models both the naturalness of the connection between an utterance and its response and their content relatedness. We then demonstrate the effectiveness of the scoring function by confirming (i) the correlation between its automatic scores and human evaluation, and (ii) the performance of a dialogue response generator trained on the filtered data. Furthermore, we experimentally confirm that our scoring function can potentially serve as a language-independent method.
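
As a rough illustration of how such a filter might be applied in practice, the following sketch (in Python) scores each utterance-response pair by combining a connectivity signal with an embedding-based content-relatedness signal and keeps only the highest-scoring pairs. The abstract does not give the exact formulation: the mixing weight alpha, the keep_ratio threshold, and the use of averaged pre-trained word vectors are illustrative assumptions, and the connectivity term is left as an external input (it could come, for example, from conditional likelihoods or a PMI-style estimate).

import numpy as np

def sentence_vector(tokens, word_vectors, dim=300):
    # Average pre-trained word vectors; return a zero vector if no token is covered.
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def content_relatedness(utt_tokens, res_tokens, word_vectors):
    # Cosine similarity between the utterance and response sentence vectors.
    u = sentence_vector(utt_tokens, word_vectors)
    r = sentence_vector(res_tokens, word_vectors)
    denom = np.linalg.norm(u) * np.linalg.norm(r)
    return float(u @ r / denom) if denom > 0 else 0.0

def pair_score(connectivity, relatedness, alpha=0.5):
    # Hypothetical linear combination of the two signals (alpha is an assumed weight).
    return alpha * connectivity + (1.0 - alpha) * relatedness

def filter_pairs(pairs, scores, keep_ratio=0.9):
    # Keep the top keep_ratio fraction of utterance-response pairs by score.
    threshold = np.quantile(scores, 1.0 - keep_ratio)
    return [pair for pair, score in zip(pairs, scores) if score >= threshold]

In this sketch, filtering by a score quantile rather than a fixed cutoff keeps the size of the retained data predictable across corpora with different noise levels; a fixed threshold would be an equally plausible choice.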
