Assessing the influence of personal preferences on the choice of vocabulary for natural language generation

Referring expression generation is the part of natural language generation that decides how to refer to the entities appearing in an automatically generated text. Lexicalization is the part of this process that chooses the vocabulary or expressions used to turn the conceptual content of a referring expression into natural language text. The choice becomes a real challenge when the available knowledge allows more than one alternative, because heuristics are then needed to decide which alternative is most appropriate in a given situation. Whereas most work on natural language generation has focused on generating language in a generic way, in this paper we explore personal preferences as a type of heuristic that has not been properly addressed. We empirically analyze the TUNA corpus, a corpus of referring expression lexicalizations, to investigate how language preferences influence the way people lexicalize new referring expressions in different situations. We then present two corpus-based approaches to referring expression lexicalization, one that takes preferences into account and one that does not. The results show a 50% decrease in the similarity error against the reference corpus when personal preferences are used to generate the final referring expression.
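
To make the contrast between the two kinds of approach concrete, the following Python sketch picks a word for a concept either from overall corpus frequencies or from the frequencies of one particular author. It is only an illustration under stated assumptions: the abstract does not describe how the two approaches work internally, so the frequency-based selection, the toy data, and the names `CORPUS`, `build_counts`, and `lexicalize` are all hypothetical and are not taken from the paper or from the TUNA corpus format.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (author_id, concept, word) triples.
# This is NOT the TUNA corpus format, only an illustrative stand-in.
CORPUS = [
    ("a1", "type:sofa", "sofa"),
    ("a1", "type:sofa", "sofa"),
    ("a2", "type:sofa", "couch"),
    ("a2", "type:sofa", "couch"),
    ("a3", "type:sofa", "sofa"),
]


def build_counts(corpus):
    """Count how often each word realizes each concept, overall and per author."""
    overall = defaultdict(Counter)
    per_author = defaultdict(Counter)
    for author, concept, word in corpus:
        overall[concept][word] += 1
        per_author[(author, concept)][word] += 1
    return overall, per_author


def lexicalize(concept, overall, per_author, author=None):
    """Return the most frequent word for a concept.

    If an author is given and has used this concept before, that author's
    own preference wins; otherwise fall back to the overall frequencies.
    Returns None when the concept was never seen.
    """
    if author is not None:
        own = per_author.get((author, concept))
        if own:
            return own.most_common(1)[0][0]
    candidates = overall.get(concept)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


if __name__ == "__main__":
    overall, per_author = build_counts(CORPUS)
    print(lexicalize("type:sofa", overall, per_author))               # generic choice
    print(lexicalize("type:sofa", overall, per_author, author="a2"))  # a2's preference
```

In this toy data the generic choice follows the majority usage, while the preference-aware choice for author `a2` follows that author's own past usage, which is the kind of divergence the abstract attributes to personal preferences.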
