Personalized Dialogue Generation with Diversified Traits

Endowing a dialogue system with particular personality traits is essential for delivering more human-like conversations. However, because personality is hard to embody through language and large-scale persona-labeled dialogue data is scarce, this research problem remains far from well studied. In this paper, we investigate how to incorporate explicit personality traits into dialogue generation to deliver personalized dialogues. To this end, we first construct PersonalDialog, a large-scale multi-turn dialogue dataset containing various traits from a large number of speakers. The dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers. Each utterance is associated with a speaker who is marked with traits such as Age, Gender, Location, and Interest Tags, among others. Several anonymization schemes are designed to protect the privacy of each speaker. This large-scale dataset will facilitate not only the study of personalized dialogue generation but also other research in sociolinguistics and social science. Second, to study how personality traits can be captured and addressed in dialogue generation, we propose persona-aware dialogue generation models within the sequence-to-sequence learning framework. Explicit personality traits (structured as key-value pairs) are embedded using a trait fusion module. During decoding, two techniques, persona-aware attention and persona-aware bias, are devised to capture and address trait-related information. Experiments demonstrate that our model is able to address proper traits in different contexts. Case studies also show interesting results for this challenging research problem.
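
To make the trait fusion and persona-aware bias ideas more concrete, the following is a minimal sketch in plain Python and not the authors' implementation. The abstract does not specify how traits are fused or how the bias enters decoding, so everything here is an assumption for illustration: fusion is taken to be a simple average of key-value embeddings, the bias is taken to be a gated mixture of two word distributions, and the names `make_embedding_table`, `fuse_traits`, `persona_aware_bias`, `proj`, and `gate`, as well as the trait values, are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def make_embedding_table(trait_values, dim, rng):
    """Hypothetical lookup table: one random vector per (key, value) pair."""
    return {
        (key, value): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for key, values in trait_values.items()
        for value in values
    }


def fuse_traits(persona, table):
    """Assumed fusion: average the embeddings of the persona's key-value pairs."""
    vecs = [table[(key, value)] for key, value in persona.items()]
    dim = len(vecs[0])
    return [sum(vec[i] for vec in vecs) / len(vecs) for i in range(dim)]


def persona_aware_bias(decoder_logits, persona_vec, proj, gate):
    """Assumed bias: mix the decoder's word distribution with a persona-derived one.

    proj is a (vocab_size x dim) matrix mapping the persona vector to logits;
    gate in [0, 1] is the weight given to the decoder's own distribution.
    """
    persona_logits = [
        sum(w * p for w, p in zip(row, persona_vec)) for row in proj
    ]
    p_decoder = softmax(decoder_logits)
    p_persona = softmax(persona_logits)
    return [gate * d + (1.0 - gate) * q for d, q in zip(p_decoder, p_persona)]


# Toy usage with made-up trait values and random parameters.
rng = random.Random(0)
dim, vocab_size = 4, 5
trait_values = {
    "Gender": ["Female", "Male"],
    "Age": ["Young", "Adult"],
    "Location": ["Beijing", "Shanghai"],
}
table = make_embedding_table(trait_values, dim, rng)
persona = {"Gender": "Female", "Age": "Young", "Location": "Beijing"}
persona_vec = fuse_traits(persona, table)

proj = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]
decoder_logits = [rng.uniform(-1.0, 1.0) for _ in range(vocab_size)]
dist = persona_aware_bias(decoder_logits, persona_vec, proj, gate=0.7)
print(dist)
```

In this sketch a gate of 1.0 ignores the persona entirely and a gate of 0.0 generates from the persona-derived distribution alone. In a trained model the gate and the projection would presumably be learned, and could depend on the decoder state at each step, which is one way a model might choose which trait to express in a given context.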
