From Parallel to Comparable Text Corpora

We present a bilingual corpus management system under development in Pisa. The first component of this system was a set of procedures to create and query parallel text archives; we are now studying the implementation of a second set of procedures to interrogate comparable archives. The approach differs substantially from that used for parallel data and is considerably more complex; the results are also very different. In this paper, we describe the strategy we are adopting to retrieve significant data from comparable corpora and discuss the preliminary results.

Carol Peters | Eugenio Picchi
