Paper information - Target-language-driven agglomerative part-of-speech tag clustering for machine translation

This paper presents a method for reducing the set of different tags to be considered by a part-of-speech tagger. The method is based on a clustering algorithm performed over the states of a hidden Markov model, which is initially trained by considering information not only from the source language, but also from the target language, using a new unsupervised technique which has been recently proposed to obtain taggers involved in machine translation systems. Then, a bottom-up agglomerative clustering algorithm groups the states of the hidden Markov model according to a similarity measure based on their transition probabilities; this reduces the complexity by grouping the initial finer tags into coarser ones. The experiments show that part-of-speech taggers using the coarser tags have smaller error rates than those using the initial finest tags; moreover, considering unsupervised information from the target language results in better clusters compared to those unsupervisedly built from source language information only.
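The bottom-up grouping of states described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the stopping rule, so the sketch assumes a symmetrised Kullback-Leibler divergence between outgoing transition rows, average linkage between clusters, and a fixed distance threshold; the function names `sym_kl` and `agglomerate` are hypothetical. It only groups tags and does not re-estimate the merged hidden Markov model.

```python
import math


def sym_kl(p, q, eps=1e-12):
    """Symmetrised KL divergence KL(p||q) + KL(q||p) of two discrete
    distributions (assumed measure; eps guards against log(0))."""
    return sum((pi - qi) * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q))


def agglomerate(transitions, threshold):
    """Bottom-up clustering of HMM states (fine-grained tags).

    transitions[i] is the outgoing transition distribution of state i.
    The two closest clusters (average linkage) are merged repeatedly
    until the smallest distance exceeds `threshold`.
    Returns a list of clusters, each a sorted list of state indices.
    """
    n = len(transitions)
    # Pairwise distances between the original states, computed once.
    dist = [[sym_kl(transitions[i], transitions[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]  # start with one cluster per state
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j]
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:  # closest pair is too dissimilar: stop merging
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters
```

With three states whose first two transition rows are nearly identical, for example `[0.7, 0.2, 0.1]`, `[0.68, 0.22, 0.1]` and `[0.1, 0.1, 0.8]`, a small threshold groups the first two fine tags into one coarse tag and leaves the third alone. A threshold is only one possible stopping criterion; a target number of clusters would work equally well in this sketch.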
