Numerous methods have been developed for building a machine translation (MT) bilingual dictionary from a parallel text corpus. Such methods extract bilingual collocations, that is, lexically corresponding pairs of parts, from pairs of source-language and target-language sentences, and then register those collocations in the MT bilingual dictionary. This paper describes a new method for automatically extracting bilingual collocations from a parallel text corpus without using any linguistic knowledge. The method uses a learning algorithm, Recursive Chain-link-type Learning (RCL), to extract the collocations. It offers two main advantages. First, the RCL system requires no linguistic knowledge. Second, it can extract many bilingual collocations even when they appear very infrequently in the corpus. Experimental results confirm that the system extracts bilingual collocations efficiently: the extraction rate was 74.9% for all bilingual collocations that corresponded to nouns in the parallel corpus.