In this paper, we first argue that the human translation references used to calculate MT evaluation scores such as BLEU need to be revised. Because this revision is time- and resource-consuming, we propose instead an inexpensive MT evaluation method that detects and counts examples of characteristic MT output, referred to herein as instances of machine-translationness, by performing Internet searches. The goal is to obtain a sketch of the quality of the output, which, on occasion, is sufficient for the purpose of the evaluation. Moreover, this evaluation method can be adapted to detect drawbacks of the system in order to develop a new version, and it can also be helpful for post-editing machine-translated documents.
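Although the abstract gives no code, the core procedure lends itself to a short sketch. The Python fragment below is a minimal illustration of the idea, not the authors' implementation: it splits a machine-translated sentence into n-grams, asks a web search backend for the frequency of each exact phrase, and flags low-frequency n-grams as candidate instances of machine-translationness. The `web_hit_count` parameter is a hypothetical placeholder for whatever search engine or corpus-frequency service is available.

```python
# Sketch of hit-count-based detection of "machine-translationness".
# web_hit_count() is a placeholder: any search engine or corpus-frequency
# service could stand in for it; it is NOT part of the paper's tooling.

from typing import Callable, List, Tuple


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def flag_machine_translationness(
    sentence: str,
    web_hit_count: Callable[[str], int],
    n: int = 3,
    threshold: int = 5,
) -> Tuple[int, List[str]]:
    """Count n-grams whose exact-phrase web frequency falls below a threshold.

    Intuition: word sequences that (almost) never occur on the web are more
    likely to be machine-translation artifacts than natural language.
    Returns the number of suspicious n-grams and the n-grams themselves.
    """
    suspicious = [
        gram
        for gram in ngrams(sentence.split(), n)
        if web_hit_count(f'"{gram}"') < threshold  # exact-phrase query
    ]
    return len(suspicious), suspicious


if __name__ == "__main__":
    # Toy stand-in for a real search backend, for demonstration only.
    fake_counts = {'"the dog barks loudly"': 1200}
    demo_backend = lambda query: fake_counts.get(query, 0)

    # All three trigrams of this sentence are unattested in the toy backend,
    # so all three are flagged.
    score, grams = flag_machine_translationness("he has ten years old", demo_backend)
    print(score, grams)
```

The per-sentence count of flagged n-grams can then be aggregated over a document to obtain the rough quality sketch the abstract describes.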
[1] Antoni Oliver et al. A Grammar and Style Checker Based on Internet Searches. LREC, 2004.
[2] Michael Gamon et al. A Machine Learning Approach to the Automatic Evaluation of Machine Translation. ACL, 2001.
[3] Salim Roukos et al. Bleu: a Method for Automatic Evaluation of Machine Translation. ACL, 2002.
[4] Lluís Padró et al. FreeLing 1.3: Syntactic and Semantic Services in an Open-Source NLP Library. LREC, 2006.
[6] Alex Kulesza et al. A Learning Approach to Improving Sentence-Level MT Evaluation. 2004.