Discriminative Syntax-Based Word Ordering for Text Generation

Word ordering is a fundamental problem in text generation. In this article, we study word ordering using a syntax-based approach and a discriminative model. Two grammar formalisms are considered: Combinatory Categorial Grammar (CCG) and dependency grammar. Because the search is for both a likely string and its syntactic analysis, the search space is massive, making discriminative training challenging. We develop a learning-guided search framework, based on best-first search, and investigate several alternative training algorithms.

The framework we present is flexible in that it allows constraints to be imposed on output word orders. To demonstrate this flexibility, a variety of input conditions are considered. First, we investigate a “pure” word-ordering task in which the input is a multi-set of words, and the task is to order them into a grammatical and fluent sentence. This task has been tackled previously, and we report improved performance over existing systems on a standard Wall Street Journal test set. Second, we tackle the same reordering problem, but with a variety of input conditions, from the bare case with no dependencies or POS tags specified, to the extreme case where all POS tags and unordered, unlabeled dependencies are provided as input (and various conditions in between). When applied to the NLG 2011 shared task, our system gives competitive results compared with the best-performing systems, which further demonstrates the practical utility of our system.
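To make the "pure" word-ordering task concrete, the following is a minimal sketch of best-first search over orderings of an input multiset of words. It is not the system described in the article: the discriminative, syntax-based model is replaced here by a hypothetical table of bigram scores (`BIGRAM`, `UNSEEN`), and the names `score_step` and `order_words` are illustrative assumptions. Because every toy score is non-positive, the first complete hypothesis popped from the agenda is the best one under this toy model; the article's setting involves a far larger search space and does not come with that guarantee.

```python
import heapq

# Hypothetical bigram log-scores (higher is better). This toy table stands in
# for a learned model and is an assumption made purely for illustration.
BIGRAM = {
    ("<s>", "the"): -0.1,
    ("the", "dog"): -0.2,
    ("dog", "chased"): -0.3,
    ("chased", "the"): -0.2,
    ("the", "cat"): -0.2,
    ("cat", "</s>"): -0.1,
    ("dog", "</s>"): -0.5,
    ("cat", "chased"): -0.6,
}
UNSEEN = -5.0  # score assigned to any bigram missing from the table


def score_step(prev, word):
    """Score of placing `word` immediately after `prev`."""
    return BIGRAM.get((prev, word), UNSEEN)


def order_words(words):
    """Best-first search for an ordering of the input multiset of words."""
    # Agenda entries: (accumulated cost, words placed so far, words remaining).
    # heapq pops the smallest cost, i.e. the highest-scoring hypothesis.
    agenda = [(0.0, ("<s>",), tuple(sorted(words)))]
    while agenda:
        cost, prefix, remaining = heapq.heappop(agenda)
        if prefix[-1] == "</s>":
            # Complete hypothesis: strip the sentence boundary markers.
            return list(prefix[1:-1])
        if not remaining:
            # All words placed: close the sentence.
            step = score_step(prefix[-1], "</s>")
            heapq.heappush(agenda, (cost - step, prefix + ("</s>",), ()))
            continue
        # Expand with each distinct remaining word (the input is a multiset).
        for word in set(remaining):
            rest = list(remaining)
            rest.remove(word)  # removes a single occurrence
            step = score_step(prefix[-1], word)
            heapq.heappush(agenda, (cost - step, prefix + (word,), tuple(rest)))
    return []


if __name__ == "__main__":
    print(" ".join(order_words(["cat", "the", "chased", "dog", "the"])))
```

In this sketch the agenda is a plain priority queue keyed on accumulated cost. Input constraints of the kind mentioned in the abstract (for example, POS tags or unordered dependencies) would act as filters on which expansions are allowed, which is omitted here.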
