Paper information - Learning-to-Translate Based on the S-SSTC Annotation Schema

Learning-to-Translate Based on the S-SSTC Annotation Schema

We present the S-SSTC framework for machine translation (MT), introduced in 2002 and developed since as a set of working MT systems (SiSTeC-ebmt). Our approach is example-based, but differs from other EBMT approaches in that it uses alignments of string-tree alignments, and in that supervised learning is an integral part of the approach. Our model directly deals with three main difficulties in the traditional treatment of MT that stem from its separation from the "translation task" (the "world"). First, by allowing the system to learn from real translation examples directly, we avoid the need to indefinitely pursue the elusive goal of writing grammars to exactly describe intermediate syntactico-semantic monolingual representations and their correspondences. Second, we make explicit the dependence of the MT system's performance on the input from the environment. That is possible only because the learning process uses feedback from the real translation knowledge when constructing its knowledge representation. Third, such MT systems using an inductively learned knowledge base yield a desirable non-regressive behavior by using translation mistakes to improve their knowledge base.

Enya Kong Tang | Christian Boitet | Yusoff Zaharin
