Empirical Realization Ranking

This thesis develops a new approach to the problem of indeterminacy in grammar-based natural language generation (NLG). Indeterminacy arises because, for a given input semantic representation, the grammar may license many (sometimes thousands of) alternative surface realizations. The traditional approach to this problem is to rank the generated strings using a surface-oriented n-gram language model (LM). This thesis instead develops a linguistically informed approach based on features that are keyed to the internal structure of the realizations. The approach extends the methodology previously used for statistical parsing with unification-based grammars and adapts it to the context of generation. This allows us to train treebank-based discriminative realization rankers using modeling frameworks such as Maximum Entropy (MaxEnt) and Support Vector Machines (SVMs). The training data is based on the novel notion of a generation treebank, which we show how to create automatically from an existing parse-oriented treebank. For reference, we also develop an n-gram-based LM trained on a large corpus of raw text. Our experimental results show that a discriminative model trained on just a few thousand items in a generation treebank gives significantly better ranking performance than a traditional surface-oriented LM. Moreover, we show that even better results can be obtained by combining the two modeling approaches, which is done by including the LM as an additional feature in the discriminative model. Evaluation scores are reported for several data sets, using a range of automated metrics. We also include results from a manual evaluation carried out by a panel of external anonymous judges. The hybrid system for surface realization described in this thesis is currently integrated for target language generation in the Norwegian–English machine translation (MT) system LOGON. We also show how the realization ranker is used together with a global end-to-end reranking model for selecting the final output of the MT system.
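As a rough illustration of the combination described in the abstract, the following minimal Python sketch scores candidate realizations with a log-linear (MaxEnt-style) model in which the LM log-probability is simply one more feature next to the structural ones. The feature names, weights, and candidate strings are hypothetical and chosen only for the example; they are not taken from the thesis, and a real ranker would learn the weights from a generation treebank.

```python
import math
from typing import Dict, List, Tuple

Features = Dict[str, float]

# Hypothetical weights; in practice these would be estimated from training data.
WEIGHTS: Features = {
    "lm_logprob": 0.5,         # n-gram LM score used as an ordinary feature
    "rule:adv_final": 0.3,     # illustrative structural feature
    "rule:adv_fronted": -0.2,  # illustrative structural feature
}

# Hypothetical candidate realizations of one input semantic representation.
CANDIDATES: List[Tuple[str, Features]] = [
    ("The dog barked loudly.", {"lm_logprob": -12.0, "rule:adv_final": 1.0}),
    ("Loudly the dog barked.", {"lm_logprob": -15.0, "rule:adv_fronted": 1.0}),
]


def score(weights: Features, features: Features) -> float:
    """Linear score: sum of weight * feature value (unknown features count as 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank(weights: Features,
         candidates: List[Tuple[str, Features]]) -> List[Tuple[str, float]]:
    """Return candidates with conditional probabilities, best first."""
    scores = [score(weights, feats) for _, feats in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    strings = [text for text, _ in candidates]
    return sorted(zip(strings, probs), key=lambda pair: pair[1], reverse=True)


ranked = rank(WEIGHTS, CANDIDATES)
for text, prob in ranked:
    print(f"{prob:.3f}  {text}")
```

Normalizing over the candidate set for a single input, rather than over all strings, is what makes the model conditional: only the relative ordering of the competing realizations matters.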
