Improving Grammaticality in Statistical Sentence Generation: Introducing a Dependency Spanning Tree Algorithm with an Argument Satisfaction Model

Abstract-like text summarisation requires a means of producing novel summary sentences. In order to improve the grammaticality of the generated sentence, we model a global (sentence) level syntactic structure. We couch statistical sentence generation as a spanning tree problem in order to search for the best dependency tree spanning a set of chosen words. We also introduce a new search algorithm for this task that models argument satisfaction to improve the linguistic validity of the generated tree. We treat the allocation of modifiers to heads as a weighted bipartite graph matching (or assignment) problem, a well studied problem in graph theory. Using BLEU to measure performance on a string regeneration task, we found an improvement, illustrating the benefit of the spanning tree approach armed with an argument satisfaction model.

[1]  Philipp Koehn,et al.  Clause Restructuring for Statistical Machine Translation , 2005, ACL.

[2]  Kevin Knight,et al.  The Practical Value of N-Grams Is in Generation , 1998, INLG.

[3]  Daniel Marcu,et al.  A Phrase-Based HMM Approach to Document/Abstract Alignment , 2004, EMNLP.

[4]  Raymond J. Mooney,et al.  Generation by Inverting a Semantic Parser that Uses Statistical Machine Translation , 2007, NAACL.

[5]  Michael Collins,et al.  A New Statistical Parser Based on Bigram Lexical Dependencies , 1996, ACL.

[6]  Daniel Marcu,et al.  Towards Developing Generation Algorithms for Text-to-Text Applications , 2005, ACL.

[7]  Srinivas Bangalore,et al.  Exploiting a Probabilistic Hierarchical Model for Generation , 2000, COLING.

[8]  Mirella Lapata,et al.  Using Semantic Roles to Improve Question Answering , 2007, EMNLP.

[9]  Stephen Wan,et al.  Global revision in summarisation : generating novel sentences with Prim's algorithm , 2007 .

[10]  Regina Barzilay,et al.  Information Fusion in the Context of Multi-Document Summarization , 1999, ACL.

[11]  Salim Roukos,et al.  Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.

[12]  Srinivas Bangalore,et al.  Evaluation Metrics for Generation , 2000, INLG.

[13]  Fernando Pereira,et al.  Non-Projective Dependency Parsing using Spanning Tree Algorithms , 2005, HLT.

[14]  Michael Strube,et al.  Generating Constituent Order in German Clauses , 2007, ACL.

[15]  Stephen Wan,et al.  GLEU: Automatic Evaluation of Sentence-Level Fluency , 2007, ACL.

[16]  Chris Callison-Burch,et al.  Paraphrasing with Bilingual Parallel Corpora , 2005, ACL.

[17]  Michael Strube,et al.  Sentence Fusion via Dependency Graph Compression , 2008, EMNLP.

[18]  Bob Carpenter,et al.  Head-Driven Parsing for Word Lattices , 2004, ACL.

[19]  Giorgio Satta,et al.  On the Complexity of Non-Projective Data-Driven Dependency Parsing , 2007, IWPT.

[20]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[21]  Mirella Lapata,et al.  Optimal Constituent Alignment with Edge Covers for Semantic Projection , 2006, ACL.

[22]  Mark Johnson,et al.  Transforming Projective Bilexical Dependency Grammars into efficiently-parsable CFGs with Unfold-Fold , 2007, ACL.