GENERATION: A SURVEY AND CLASSIFICATION OF THE EMPIRICAL LITERATURE

Natural Language Generation (NLG) is the systematic production of human-understandable natural language text from non-textual data or from meaning representations. It is a significant research area that supports human-computer interaction and has given rise to a variety of theoretical and empirical approaches. This paper provides a detailed overview and classification of state-of-the-art approaches in NLG. It explores NLG architectures and the tasks grouped under the document planning, micro-planning, and surface realization modules. It also identifies gaps in NLG research that require further work before NLG can become a widely usable technology.
