UFRGS@PAN2010: Detecting External Plagiarism - Lab Report for PAN at CLEF 2010

This paper presents our approach to detecting plagiarism in the PAN'10 competition. We applied a method aimed at detecting external plagiarism cases. The method is specially designed to detect cross-language plagiarism and is composed of five phases: language normalization, retrieval of candidate documents, classifier training, plagiarism analysis, and post-processing. Our group placed seventh in the competition with an overall score of 0.5175. It is important to note that the final score was affected by our low recall (0.4036), a consequence of not detecting the intrinsic plagiarism cases that were also present in the competition corpus.