Evaluating Commercial MT Systems
Vendors of commercial machine translation systems often claim that their system can increase translator productivity x-fold. In order to verify such claims, we need to answer two questions: First, how is translator productivity generally measured? And second, precisely how does one go about comparing human translator (HT) productivity with MT productivity?

The answer to the first question is relatively straightforward, at least for translators who are part of a translation service: productivity is generally measured in terms of the number of words translated per unit of time. In fact, translators frequently have to meet production quotas – 1300 words per day, for example – and their promotion may be contingent upon producing a certain number of words per year.

The answer to the second question is slightly more complicated and involves, I would suggest, the comparison of two production chains: one in which the human translator works in tandem with the MT system, and another in which he works alone, without the aid of the system. Now there are many ways for a human translator to actually produce his texts: he can write them out, type them, dictate them, or use a word processor. Most commercial MT systems, on the other hand, come bundled with (or at least interface with) a word processor. My intuition six years ago, when I was asked to participate in a trial of the Weidner MicroCat system at the Canadian government’s Translation Bureau, was that the purported productivity gains reported by the vendor were at least partly attributable to the introduction of a word processor in place of more traditional modes of production. Be that as it may, it is surely important, when designing an MT trial, to attempt to isolate the contribution of the machine translation module to overall productivity, since it is this module – not the word processor – that costs the most to develop and that justifies the hefty price tag.
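The arithmetic behind this methodological point can be sketched as follows. All throughput figures below are invented for illustration (the article gives only the 1300 words/day quota as a real example); the point is that a vendor-style comparison against a traditional baseline conflates the word-processor effect with the MT effect, while a fairer comparison uses the word-processor chain as the baseline.

```python
def words_per_day(words_translated: int, days: float) -> float:
    """Translator productivity as commonly measured: words per unit of time."""
    return words_translated / days

# Hypothetical trial measurements over a 10-day period (invented numbers).
ht_traditional = words_per_day(13000, 10)  # HT alone: dictation or typing
ht_word_proc = words_per_day(15000, 10)    # HT with a word processor only
ht_mt_system = words_per_day(18000, 10)    # HT post-editing MT output in the same WP

# Vendor-style claim: MT chain versus the traditional chain.
# This conflates the word processor's contribution with the MT module's.
naive_gain = ht_mt_system / ht_traditional

# Isolating the MT module: use the word-processor chain as baseline,
# since the commercial MT system ships with a word processor anyway.
mt_module_gain = ht_mt_system / ht_word_proc

print(f"naive gain:     {naive_gain:.2f}x")
print(f"MT-module gain: {mt_module_gain:.2f}x")
```

With these invented figures, the naive comparison reports roughly a 1.38x gain, but only 1.20x of it is attributable to the MT module itself; the remainder comes from replacing traditional production modes with a word processor.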