Joint Morphological Generation and Syntactic Linearization

There has been growing interest in stochastic methods for natural language generation (NLG). While most NLG pipelines separate morphological generation from syntactic linearization, the two tasks are closely related. In this paper, we study joint morphological generation and linearization, using word-order and inflection information for both tasks and thereby reducing error propagation. Experiments show that the joint method significantly outperforms a strong pipelined baseline (by 1.1 BLEU points). It also achieves the best reported result on the Generation Challenge 2011 shared task.
