Introduction

Plagiarism is now acknowledged to pose a significant threat to academic integrity. There is a growing array of software packages to help address the problem. Most of these offer a string-of-text comparison. Newly emerging are software packages and services that 'generate' assignments. Naturally there will be a cat-and-mouse game for a while, and in the meantime academics need to be alert to the possibilities of academic malpractice via plagiarism and adopt appropriate and promising counter-measures, including the newly emerging algorithms that perform fast conceptual analysis.

One such algorithm is the Normalised Word Vector (NWV) algorithm (Williams, 2006), which was originally developed for use in the Automated Essay Grading (AEG) domain. AEG is a relatively new technology which aims to score or grade essays at the level of expert humans. This is achieved by creating a mathematical representation of the semantic information, in addition to checking spelling, grammar, and the other more usual parameters associated with essay assessment. The mathematical representation is computed for each student essay and compared with a mathematical representation computed for the model answer. If we can represent the semantic content of an essay, we are able to compare it to some standard model and hence determine a grade or assign an authenticity parameter relative to any given corpus, and we can create a persistent digital representation of the essay.

AEG technology can be used for plagiarism detection because it processes the semantic information of student essays and creates a semantic footprint. Once a mathematical representation for all or parts of an essay is created, it can be efficiently compared to other similarly constructed representations, which facilitates plagiarism checking through semantic footprint comparison.

The Plagiarism Problem

The extent of plagiarism is indeed significant. Maurer et al. (2006) provide a thorough analysis of the plagiarism problem and possible solutions as they pertain to academia. They divide the solution strategies into three main categories. The most common method is based on document comparison, in which a word-for-word check is made against each target document in a selected set, any of which could be the source of the copied material. Clearly this is language independent, as one is essentially comparing character strings; it will also match misspellings. The selected set of documents is usually all documents comprising assignment or paper submissions for a specific purpose. A second category is an expansion of the document check in which the set of target documents is 'everything' that is reachable on the internet, and the candidate text to be checked is a characteristic paragraph or sentence rather than the entire document. The emergence of tools such as Google has made this type of check feasible. The third category mentioned by Maurer et al. is the use of stylometry, in which a language analysis algorithm compares the style of successive paragraphs and reports if a style change has occurred. This can be extended to analyzing prior documents by the same author and comparing the stylistic parameters of a succession of documents.

However, the issue of plagiarism is not merely a matter for academics. Austrian journalist Josef Karner (2001) writes "Das Abschreiben ist der eigentliche Beruf des Dichters" ("Transcription is the virtual vocation of the poet"). Is then the poet essentially a professional plagiarist, taking others' ideas and presenting them in verse as his own and without attribution? This may be a rather extreme position to hold, but its consideration does point up interesting possibilities which the etymology of plagiarism may illuminate.

As yet there is a paucity of statistics available to help us understand the extent of plagiarism. However, a recent Canadian study (Kloda & Nicholson, 2005) has reported that one in three students admits to turning to plagiarism prior to graduation, which is serious enough, one may think. …
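
The footprint comparison described in the Introduction can be illustrated with a minimal sketch. This is not the NWV algorithm itself, whose internals are not described here; it substitutes plain word-count vectors scaled to unit length, and compares them by cosine similarity. The function names `footprint` and `similarity` and the sample sentences are hypothetical, chosen only for illustration. Because it counts surface words, the sketch captures lexical rather than semantic overlap.

```python
import math
import re
from collections import Counter


def footprint(text):
    """Return a unit-length word-count vector as a dict (word -> weight).

    Assumption: a plain bag-of-words stand-in, not the NWV representation.
    """
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {word: c / norm for word, c in counts.items()}


def similarity(a, b):
    """Cosine similarity of two unit-length footprints (their dot product)."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


# Hypothetical sample texts.
model = footprint("Plagiarism threatens academic integrity")
essay_same = footprint("academic integrity plagiarism threatens")
essay_other = footprint("The weather was sunny today")

print(similarity(model, essay_same))   # same words, different order
print(similarity(model, essay_other))  # no words in common
```

Unlike a word-for-word string check, a vector comparison of this kind is unaffected by reordering the words, and a stored footprint can be compared against many others without reprocessing the original text.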
[1] Hermann A. Maurer, et al. Plagiarism - A Survey. J. Univers. Comput. Sci., 2006.
[2] Jeremy B. Williams, et al. The plagiarism problem: are students entirely to blame? ASCILITE, 2002.
[3] Robert Williams. The power of normalised word vectors for automatically grading essays. 2006.
[4] J. Naisbitt. Megatrends: Ten New Directions Transforming Our Lives. 1982.
[5] Whatsisname. Devil's Dictionary. 1958.
[6] Lorie A. Kloda, et al. Plagiarism detection software and academic integrity: the Canadian perspective. 2005.