Stochastic Language Generation Using WIDL-Expressions and its Application in Machine Translation and Summarization

We propose WIDL-expressions as a flexible formalism that facilitates the integration of a generic sentence realization system within end-to-end language processing applications. WIDL-expressions compactly represent probability distributions over finite sets of candidate realizations, and admit optimal algorithms for realization via interpolation with language model probability distributions. We show the effectiveness of a WIDL-based NLG system in two sentence realization tasks: automatic translation and headline generation.
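
To make the idea of realization via interpolation concrete, the following is a minimal sketch, not the algorithm of this work. It assumes the candidate distribution has already been expanded into an explicit table (a WIDL-expression would represent it compactly instead), that the interpolation is log-linear with a single weight `lam`, and that the language model is a toy bigram table. All names, probabilities, and the weight are hypothetical.

```python
import math

# Hypothetical distribution over a finite set of candidate realizations.
# A WIDL-expression would encode this compactly; here it is listed explicitly.
candidates = {
    "released the prisoners were": 0.5,
    "the prisoners were released": 0.3,
    "the prisoners released were": 0.2,
}

# Toy bigram log-probabilities (made-up values, for illustration only).
BIGRAM_LOGP = {
    ("<s>", "the"): math.log(0.5),
    ("the", "prisoners"): math.log(0.4),
    ("prisoners", "were"): math.log(0.3),
    ("were", "released"): math.log(0.3),
    ("released", "</s>"): math.log(0.4),
}
UNSEEN_LOGP = math.log(1e-4)  # assumed floor for bigrams not in the table


def lm_logprob(sentence):
    """Sum bigram log-probabilities over the sentence, with boundary markers."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(BIGRAM_LOGP.get(bg, UNSEEN_LOGP) for bg in zip(tokens, tokens[1:]))


def realize(candidates, lam=0.5):
    """Return the candidate maximizing the log-linear interpolation
    lam * log P_candidates(s) + (1 - lam) * log P_lm(s),
    by brute-force enumeration of the explicit candidate set."""
    def score(item):
        sentence, prob = item
        return lam * math.log(prob) + (1 - lam) * lm_logprob(sentence)
    return max(candidates.items(), key=score)[0]


best = realize(candidates)
```

With `lam=1.0` the language model is ignored and the candidate with the highest table probability is chosen; lowering `lam` lets the language model favor the more fluent word order. Brute-force enumeration is only feasible for tiny candidate sets, which is why a compact representation with dedicated search algorithms matters in practice.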
