Detecting Inappropriate Use of Free Online Machine Translation by Language Students. A Special Case of Plagiarism Detection

The ready availability of free online machine translation (MT) systems has created a problem for language teaching: students – especially weaker ones – use free online MT to do their translation homework. Apart from the pedagogic implications, one question of interest is whether techniques can be devised to detect such use automatically. This paper reports an experiment that addresses this problem, using methods from the broader fields of computational stylometry, plagiarism detection, text reuse, and MT evaluation. A pilot experiment comparing ‘honest’ and ‘derived’ translations produced by 25 intermediate learners of Spanish, Italian and
