The rapid growth of the Internet in recent years has increased interest in plagiarism detection. Large numbers of documents are published online every day, and anyone can access and plagiarize them. Checking every potential case of plagiarism manually is infeasible, so systems that detect plagiarism automatically are needed. In this paper, we introduce a new hybrid system for plagiarism detection that combines the advantages of the two main plagiarism detection techniques. The system consists of two analysis phases: the first uses an intrinsic detection technique to dismiss much of the text, and the second applies an external detection technique to identify the plagiarized sections in the text that remains. This combination yields a detection system that obtains accurate results and is also faster thanks to the prefiltering of the text.
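The two-phase idea can be illustrated with a minimal sketch. The abstract does not say which features or similarity measures the system uses, so everything concrete here is an assumption made for illustration: the intrinsic phase flags sentences whose average word length deviates from the rest of the document, and the external phase checks only those flagged sentences against a reference collection using word-trigram containment. All function names, thresholds, and the sample data are hypothetical.

```python
import re
from statistics import mean, pstdev


def split_sentences(text):
    """Naive splitter on ., ! and ? (an assumption, not the paper's method)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(text):
    """Lowercased word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def avg_word_length(sentence):
    """Stand-in style feature; the real intrinsic features are not specified."""
    ws = words(sentence)
    return mean(len(w) for w in ws) if ws else 0.0


def intrinsic_filter(sentences, z_threshold=1.0):
    """Phase 1: return indices of sentences whose style deviates from the document."""
    feats = [avg_word_length(s) for s in sentences]
    mu, sigma = mean(feats), pstdev(feats)
    if sigma == 0:
        return []  # uniform style: nothing stands out
    return [i for i, f in enumerate(feats) if abs(f - mu) / sigma > z_threshold]


def ngrams(tokens, n=3):
    """Set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(sentence, source, n=3):
    """Fraction of the sentence's word n-grams that also occur in the source."""
    grams = ngrams(words(sentence), n)
    if not grams:
        return 0.0
    return len(grams & ngrams(words(source), n)) / len(grams)


def hybrid_detect(text, sources, z_threshold=1.0, min_overlap=0.5):
    """Phase 2 runs only on the sentences that phase 1 did not dismiss."""
    sentences = split_sentences(text)
    hits = []
    for i in intrinsic_filter(sentences, z_threshold):
        for name, src in sources.items():
            if containment(sentences[i], src) >= min_overlap:
                hits.append((i, name))
                break
    return hits


# Hypothetical sample data.
text = (
    "The cat sat on the mat. The dog ran to the red van. "
    "We had tea and a bun at ten. "
    "Computational stylometry quantifies idiosyncratic authorial characteristics. "
    "The sun set and we went to bed."
)
sources = {
    "src_a": "Earlier work notes that computational stylometry quantifies "
             "idiosyncratic authorial characteristics in long texts.",
    "src_b": "The cat sat on the mat.",
}
print(hybrid_detect(text, sources))
```

In this toy run only the stylistically unusual fourth sentence reaches the external comparison, which is where the speed-up from prefiltering would come from. It also shows the trade-off of this particular sketch: the first sentence matches `src_b` verbatim but is dismissed in phase 1 because its style is unremarkable.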