COUNTER: corpus of Urdu news text reuse
Paul Rayson | Rao Muhammad Adeel Nawab | Muhammad Sharjeel
[1] Justin Zobel,et al. Methods for Identifying Versioned and Plagiarized Documents , 2003, J. Assoc. Inf. Sci. Technol..
[2] W. Bruce Croft,et al. Finding text reuse on the web , 2009, WSDM '09.
[3] Tony McEnery,et al. Corpus Resources and Minority Language Engineering , 2000, LREC.
[4] James A. Malcolm,et al. Detecting Short Passages of Similar Text in Large Document Collections , 2001, EMNLP.
[5] G. Yule. On Sentence-Length as a Statistical Characteristic of Style in Prose: With Application to Two Cases of Disputed Authorship , 1939 .
[6] Dawn Archer,et al. Extracting Multiword Expressions with A Semantic Tagger , 2003, ACL 2003.
[7] อนิรุธ สืบสิงห์,et al. Data Mining Practical Machine Learning Tools and Techniques , 2014 .
[8] D. Thenmozhi,et al. Paraphrase Identification by Using Clause-Based Similarity Features and Machine Translation Metrics , 2016, Comput. J..
[9] Alberto Barrón-Cedeño,et al. Reducing the Plagiarism Detection Search Space on the Basis of the Kullback-Leibler Distance , 2009, CICLing.
[10] Naomie Salim,et al. Survey of Text Plagiarism Detection , 2012 .
[11] Nicholas Tran,et al. Sim: a utility for detecting similarity in computer programs , 1999, SIGCSE '99.
[12] Gerard Salton,et al. A vector space model for automatic indexing , 1975, CACM.
[13] Benno Stein,et al. An Evaluation Framework for Plagiarism Detection , 2010, COLING.
[14] W. Bruce Croft,et al. Local text reuse detection , 2008, SIGIR '08.
[15] Carlo Strapparava,et al. Corpus-based and Knowledge-based Measures of Text Semantic Similarity , 2006, AAAI.
[16] Ralf Steinmetz,et al. Automatic Detection of Local Reuse , 2010, EC-TEL.
[17] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .
[18] Yiming Yang,et al. RCV1: A New Benchmark Collection for Text Categorization Research , 2004, J. Mach. Learn. Res..
[19] Sergey Butakov,et al. The toolbox for local and global plagiarism detection , 2009, Comput. Educ..
[20] Hermann A. Maurer,et al. Plagiarism - A Survey , 2006, J. Univers. Comput. Sci..
[21] Alberto Barrón-Cedeño,et al. Plagiarism Meets Paraphrasing: Insights for the Next Generation in Automatic Plagiarism Detection , 2013, CL.
[22] Jacob Cohen. A Coefficient of Agreement for Nominal Scales , 1960 .
[23] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SKDD.
[24] Iryna Gurevych,et al. Text Reuse Detection using a Composition of Text Similarity Measures , 2012, COLING.
[25] Grigori Sidorov,et al. A Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014 , 2014, CLEF.
[26] Ehsan Ullah Munir,et al. Cross-Language Urdu-English (CLUE) Text Alignment Corpus: Notebook for PAN at CLEF 2015 , 2015, CLEF.
[27] Rui Sousa-Silva,et al. Detecting translingual plagiarism and the backlash against translation plagiarists , 2014 .
[28] Kashif Riaz,et al. A Study in Urdu Corpus Construction , 2002, ALR@COLING.
[29] Paolo Rosso,et al. Determining and characterizing the reused text for plagiarism detection , 2013, Expert Syst. Appl..
[30] Michael J. Wise. Detection of similarities in student programs: YAP'ing may be preferable to plague'ing , 1992, SIGCSE '92.
[31] W. Bruce Croft,et al. Evaluating text reuse discovery on the web , 2010, IIiX.
[32] Hector Garcia-Molina,et al. SCAM: A Copy Detection Mechanism for Digital Documents , 1995, DL.
[33] C. Lyon,et al. Demonstration of the Ferret Plagiarism Detector , 2006 .
[34] Kathleen McKeown,et al. The decomposition of human-written summary sentences , 1999, SIGIR '99.
[35] A. Bell. The language of news media , 1991 .
[36] Ophir Frieder,et al. Collection statistics for fast duplicate document detection , 2002, TOIS.
[37] Hector Garcia-Molina,et al. Copy detection mechanisms for digital documents , 1995, SIGMOD '95.
[38] Iraklis Varlamis,et al. Text Relatedness Based on a Word Thesaurus , 2010, J. Artif. Intell. Res..
[39] Tony McEnery,et al. A corpus-based approach to text reuse in the newsbooks of the Commonwealth , 2010 .
[40] W. Bruce Croft,et al. Similarity measures for tracking information flow , 2005, CIKM '05.
[41] Horacio Rodríguez,et al. Is This a Paraphrase? What Kind? Paraphrase Boundaries and Typology , 2014 .
[42] Jacob Cohen,et al. Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit. , 1968 .
[43] Mark Stevenson,et al. Developing a corpus of plagiarised short answers , 2011, Lang. Resour. Evaluation.
[44] Efstathios Stamatatos,et al. Plagiarism detection using stopword n-grams , 2011, J. Assoc. Inf. Sci. Technol..
[45] Waqas Anwar,et al. A Survey of Automatic Urdu Language Processing , 2006, 2006 International Conference on Machine Learning and Cybernetics.
[46] Per Runeson,et al. Detection of Duplicate Defect Reports Using Natural Language Processing , 2007, 29th International Conference on Software Engineering (ICSE'07).
[47] Yorick Wilks,et al. The METER corpus : a corpus for analysing journalistic text reuse , 2001 .
[48] Tanya Aplin. Reflections on Measuring Text Re-Use from a Copyright Law Perspective , 2010 .
[49] Benno Stein,et al. PAN Plagiarism Corpus PAN-PC-09 , 2009 .
[50] Yorick Wilks,et al. Measuring Text Reuse , 2002, ACL.
[51] Andrei Z. Broder,et al. On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).
[52] James H. Martin,et al. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition , 2000 .
[53] Matthias Hagen,et al. Overview of the 1st international competition on plagiarism detection , 2009 .