Meaning Representations in Statistical Word Alignment

As a testbed for statistical word alignment, we implemented a prototype statistical word aligner based on graphical models [2, 10]. The advantage of the graphical-model approach lies in its extensibility compared to the traditional approaches to statistical word alignment [3, 22, 14]. Although semi-supervised word aligners exist [6], we discuss only unsupervised word aligners [3, 22, 14]. The capabilities of this word aligner include: 1) it supports the IBM and HMM models as well as tree-based models [13]; 2) it extends easily to MAP-assignment-based decoders (Viterbi decoding [21] as well as posterior decoding [10]) in these models [16, 15]; 3) its output can serve as input to lattice-based decoding [1, 4], reinterpreted as partial model selection; 4) it supports flexibly switching random variables on and off, which is advantageous for lemma-based alignment [5] and morpheme-avoided alignment; and 5) it can be used for forced alignment [18]. These capabilities follow from the fact that the inference algorithms, such as the sum-product and max-product algorithms, are not affected by the form of the network structure. Note that traditional statistical word alignment is built purely by counting word frequencies; syntax and semantics are not considered [11].
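As an illustration of the max-product inference mentioned above, the following is a minimal sketch of Viterbi (MAP-assignment) decoding for an HMM-based word aligner. The function name `viterbi_align` and the toy probability tables are hypothetical, and the model is deliberately simplified (no NULL word, transitions conditioned only on the previous alignment position); it is not the aligner described in the text, only a sketch of the max-product algorithm it relies on.

```python
import numpy as np

def viterbi_align(emission, transition, initial):
    """Max-product (Viterbi) decoding for a simplified HMM word aligner.

    emission[j, i]   : p(target word j | source position i)
    transition[i, k] : p(align to position k | previous position i)
    initial[i]       : p(first target word aligns to position i)
    Returns the most probable alignment: one source index per target word.
    """
    J, I = emission.shape
    # Work in log space to avoid underflow on long sentences.
    log_delta = np.log(initial) + np.log(emission[0])
    backptr = np.zeros((J, I), dtype=int)
    for j in range(1, J):
        # scores[prev, cur]: best score ending at prev, then moving to cur.
        scores = log_delta[:, None] + np.log(transition)
        backptr[j] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(emission[j])
    # Backtrace the highest-scoring path.
    path = [int(log_delta.argmax())]
    for j in range(J - 1, 0, -1):
        path.append(int(backptr[j, path[-1]]))
    return path[::-1]

# Toy example: 3 target words, 2 source positions.
emission = np.array([[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]])
transition = np.array([[0.8, 0.2], [0.2, 0.8]])
initial = np.array([0.9, 0.1])
print(viterbi_align(emission, transition, initial))  # → [0, 0, 0]
```

Replacing the `max`/`argmax` pair with a log-sum-exp and running a backward pass yields the sum-product (posterior) decoder instead, which is the sense in which the network structure leaves the inference algorithm unchanged.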

[1] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[2] Richard Zens, et al. Speech Translation by Confusion Network Decoding, 2007, IEEE ICASSP '07.

[3] Christopher M. Bishop. Pattern Recognition and Machine Learning, 2006, Springer.

[4] Masao Utiyama, et al. Overview of the Patent Translation Task at the NTCIR-7 Workshop, 2008, NTCIR.

[5] Andy Way, et al. Multi-Word Expression-Sensitive Word Alignment, 2010.

[6] Taku Kudo, et al. Clustering graphs by weighted substructure mining, 2006, ICML.

[7] Yu Zhang, et al. Statistical Machine Translation based on LDA, 2010, 4th International Universal Communication Symposium.

[8] Hermann Ney, et al. HMM-Based Word Alignment in Statistical Translation, 1996, COLING.

[9] Chris Callison-Burch, et al. Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Lattice Decoding, 2006.

[10] Hermann Ney, et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.

[11] Sadao Kurohashi, et al. Statistical Phrase Alignment Model Using Dependency Relation Probability, 2010.

[12] Yanjun Ma, et al. Tuning Syntactically Enhanced Word Alignment for Statistical Machine Translation, 2009, EAMT.

[13] Robert L. Mercer, et al. The Mathematics of Statistical Machine Translation: Parameter Estimation, 1993, CL.

[14] Ondrej Bojar, et al. Automatic Translation Error Analysis, 2011, TSD.

[15] Andy Way, et al. Gap Between Theory and Practice: Noise Sensitive Word Alignment in Machine Translation, 2010, WAPA.

[16] Smaranda Muresan, et al. Generalizing Word Lattice Translation, 2008, ACL.

[17] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.

[18] Lane Schwartz, et al. Multi-Source Translation Methods, 2008, AMTA.

[19] Andrew J. Viterbi. Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm, 1967, IEEE Transactions on Information Theory.

[20] Alexander M. Fraser, et al. Getting the Structure Right for Word Alignment: LEAF, 2007, EMNLP-CoNLL.

[21] Nir Friedman, et al. Probabilistic Graphical Models - Principles and Techniques, 2009.