The multilingualization of legal documents is desirable for promoting the internationalization of society. Because translating legal documents, which contain technical terms and distinctive expression patterns, depends on choosing the proper terms, bilingual dictionaries should be compiled for each legal domain. Compiling even basic bilingual dictionaries is difficult, however, because legal documents cover such a wide range. We describe a method for automatically extracting translation patterns for legal document translation from legal documents and their translations. The method takes legal sentences and translated sentences that are properly aligned with each other and extracts translation patterns in units at the level of the Japanese bunsetsu. It relies on three indexes for pattern extraction: bilingual dictionaries, statistical co-occurrence information from the parallel corpus, and syntactic information based on dependency grammar. We extracted translation patterns from the Japanese Civil Code and its translation. The extraction achieved 80.5% precision and 49.1% recall, and the extracted patterns should be useful for translating legal documents and for constructing a Japanese-English legal dictionary.