Relationship Between Coherence of Sequential Events and Dialogue Continuity in Conversational Response

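Dialogue continuity, the ability of a system to keep a user engaged in conversation, is one evaluation metric for chit-chat dialogue systems. Prior work on dialogue models holds that the coherence of system utterances is important for improving dialogue continuity. In this paper, we therefore propose a method that re-ranks response candidates generated by a dialogue model based on the coherence of the events contained in the dialogue. By taking the coherence of events in the dialogue into account (for example, "stress builds up" and "let off steam" are related events), the proposed method aims to improve both the coherence of the selected response and dialogue continuity. We devised two different methods. The first considers the coherence of events in the dialogue by matching them against statistically acquired causal relation pairs; the second considers the coherence of the dialogue using a Coherence Model. Automatic evaluation confirmed that these methods improve coherence in terms of word choice in the responses. Human evaluation, however, gave a seemingly contradictory result: the subjective coherence of responses did not clearly improve, yet the first method improved dialogue continuity. Based on this result, we examined the relationship between coherence and dialogue continuity through correlation analysis and case analysis of the human evaluation results. These analyses confirmed that, in human evaluation, an improvement in subjective coherence contributes little to an improvement in dialogue continuity. They also suggested that dialogue continuity improves when the selected response contains events that are coherent with the dialogue history.

Keywords: dialogue system, event, re-ranking, dialogue continuity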
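The first method, re-ranking by matching against causal relation pairs, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Candidate` type, the function names, the bidirectional matching, and the tie-breaking on the model score are all assumptions made for illustration, and events are assumed to be already extracted as plain strings.

```python
from typing import List, NamedTuple, Set, Tuple


class Candidate(NamedTuple):
    """Hypothetical container for one response candidate."""
    text: str            # surface form of the response
    events: List[str]    # events assumed to be pre-extracted from the response
    model_score: float   # score assigned by the underlying dialogue model


def count_causal_matches(
    history_events: List[str],
    candidate_events: List[str],
    causal_pairs: Set[Tuple[str, str]],
) -> int:
    """Count (history event, candidate event) pairs found in the causal pair set.

    Matching in both directions is an assumption of this sketch.
    """
    return sum(
        1
        for h in history_events
        for c in candidate_events
        if (h, c) in causal_pairs or (c, h) in causal_pairs
    )


def rerank(
    history_events: List[str],
    candidates: List[Candidate],
    causal_pairs: Set[Tuple[str, str]],
) -> List[Candidate]:
    """Sort candidates by causal match count, then by model score (both descending)."""
    return sorted(
        candidates,
        key=lambda cand: (
            count_causal_matches(history_events, cand.events, causal_pairs),
            cand.model_score,
        ),
        reverse=True,
    )


# Toy data built from the example in the abstract; all values are illustrative.
causal_pairs = {("stress builds up", "let off steam")}
history_events = ["stress builds up"]
candidates = [
    Candidate("I see.", [], 0.9),
    Candidate("You should let off steam somehow.", ["let off steam"], 0.6),
]

ranked = rerank(history_events, candidates, causal_pairs)
print(ranked[0].text)
```

In this toy case the candidate containing an event related to the dialogue history is promoted over a generic response that the model scored higher, which is the behaviour the re-ranking is meant to encourage.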
