Constructing a Corpus that Indicates Patterns of Modification between Draft and Final Translations by Human Translators

In human translation, translators first make draft translations and then modify and edit them. In the case of experienced translators, this process involves the use of wide-ranging expert knowledge, which has mostly remained implicit so far. Describing the difference between draft and final translations, therefore, should contribute to making this knowledge explicit. If we could clarify the expert knowledge of translators, hopefully in a computationally tractable way, we would be able to contribute to the automatic notification of awkward translations to assist inexperienced translators, improving the quality of MT output, etc. Against this backdrop, we have started constructing a corpus that indicates patterns of modification between draft and final translations made by human translators. This paper reports on our progress to date.