Domain-Adaptable Hybrid Generation of RDF Entity Descriptions

RDF ontologies provide structured data on entities in many domains and continue to grow in size and diversity. While they can be useful as a starting point for generating descriptions of entities, they often miss important information about an entity that cannot be captured as simple relations. In addition, generic approaches to generation from RDF cannot capture the unique style and content of specific domains. We describe a framework for hybrid generation of entity descriptions, which combines generation from RDF data with text extracted from a corpus, and extracts unique aspects of the domain from the corpus to create domain-specific generation systems. We show that each component of our approach significantly increases the satisfaction of readers with the text across multiple applications and domains.
