Paper Information - Source Code Authorship Attribution

[1] Ian D. Watson,et al. An Introduction to Case-Based Reasoning , 1995, UK Workshop on Case-Based Reasoning.

[2] Jack Grieve,et al. Quantitative Authorship Attribution: An Evaluation of Techniques , 2007, Lit. Linguistic Comput..

[3] Dale Schuurmans,et al. Language independent authorship attribution using character level language models , 2003, Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - EACL '03.

[4] S. K. Robinson,et al. An empirical approach for detecting program similarity and plagiarism within a university programming environment , 1987 .

[5] Naeem Seliya,et al. Detecting outsourced student programming assignments , 2008 .

[6] Chris F. Kemerer,et al. An empirical validation of software cost estimation models , 1987, CACM.

[7] Elliot Soloway,et al. Learning to program = learning to construct mechanisms and explanations , 1986, CACM.

[8] Paul Clough,et al. Creating A Corpus of Plagiarised Academic Texts , 2009 .

[9] Jörg Kindermann,et al. Authorship Attribution with Support Vector Machines , 2003, Applied Intelligence.

[10] Mansur H. Samadzadeh,et al. Extraction of Java program fingerprints for software authorship identification , 2004, J. Syst. Softw..

[11] Stephen G. MacDonell,et al. Software forensics for discriminating between program authors using case-based reasoning, feedforward neural networks and multiple discriminant analysis , 1999, ICONIP'99. ANZIIS'99 & ANNES'99 & ACNN'99. 6th International Conference on Neural Information Processing. Proceedings (Cat. No.99EX378).

[12] Benno Stein,et al. Intrinsic Plagiarism Analysis with Meta Learning , 2007, PAN.

[13] Sergey Brin,et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[14] K.W. Bowyer,et al. Experience using "MOSS" to detect cheating on programming assignments , 1999, FIE'99 Frontiers in Education. 29th Annual Frontiers in Education Conference. Designing the Future of Science and Engineering Education. Conference Proceedings (IEEE Cat. No.99CH37011).

[15] Sviatoslav Voloshynovskiy,et al. Multiclass classification based on binary classifiers: On coding matrix design, reliability and maximum number of classes , 2009 .

[16] 横山俊伸,et al. 海外出張報告 McMaster University , 2005 .

[17] Justin Zobel,et al. Entropy-Based Authorship Search in Large Document Collections , 2007, ECIR.

[18] S. B. Needleman,et al. A general method applicable to the search for similarities in the amino acid sequence of two proteins. , 1970, Journal of molecular biology.

[19] Patrick Juola,et al. Proving and Improving Authorship Attribution Technologies , 2004 .

[20] Patrick Brennan,et al. A Prototype for Authorship Attribution Studies , 2006, Lit. Linguistic Comput..

[21] Justin Zobel,et al. Passage retrieval revisited , 1997, SIGIR '97.

[22] Benno Stein,et al. Plagiarism analysis, authorship identification, and near-duplicate detection PAN'07 , 2007, SIGF.

[23] Stefanos Gritzalis,et al. Identifying Authorship by Byte-Level N-Grams: The Source Code Author Profile (SCAP) Method , 2007, Int. J. Digit. Evid..

[24] Stephen G. MacDonell,et al. A Fuzzy Logic Approach to Computer Software Source Code Authorship Analysis , 1997, ICONIP.

[25] N. Littlestone. Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm , 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[26] Spiros Mancoridis,et al. A genetic algorithm for solving the binning problem in networked applications detection , 2007, 2007 IEEE Congress on Evolutionary Computation.

[27] Moshe Koppel,et al. Measuring Differentiability: Unmasking Pseudonymous Authors , 2007, J. Mach. Learn. Res..

[28] Justin Zobel,et al. Using Relative Entropy for Authorship Attribution , 2006, AIRS.

[29] Margaret Hamilton,et al. Software development marketplaces: implications for plagiarism , 2007 .

[30] Ian H. Witten,et al. Managing Gigabytes: Compressing and Indexing Documents and Images , 1999 .

[31] Gabriella Kazai. INitiative for the Evaluation of XML Retrieval , 2009, Encyclopedia of Database Systems.

[32] Robert J. Gaizauskas,et al. Building and annotating a corpus for the study of journalistic text reuse , 2002, LREC.

[33] Robert Bosch,et al. Separating Hyperplanes and the Authorship of the Disputed Federalist Papers , 1998 .

[34] Boumediene Belkhouche,et al. Plagiarism detection in software designs , 2004, ACM-SE 42.

[35] Spiros Mancoridis,et al. Using code metric histograms and genetic algorithms to perform author identification for software forensics , 2007, GECCO '07.

[36] J. Pennebaker,et al. Words of Wisdom: Language Use Over the Life Span , 2003 .

[37] M. H. Halstead,et al. Natural laws controlling algorithm structure? , 1972, SIGP.

[38] Greg J. Michaelson,et al. Automatic analysis of functional program style , 1996, Proceedings of 1996 Australian Software Engineering Conference.

[39] Justin Zobel,et al. Efficient plagiarism detection for large code repositories , 2007 .

[40] Michael J. Wise,et al. YAP3: improved detection of similarities in computer program and other texts , 1996, SIGCSE '96.

[41] K. J. Ottenstein. An algorithmic approach to the detection and prevention of plagiarism , 1976, SGCS.

[42] Robert L. Glass. Special Feature: Software Theft , 1985, IEEE Software.

[43] Curtis R. Cook,et al. A taxonomy for programming style , 1990, CSC '90.

[44] F. Mosteller,et al. Inference in an Authorship Problem , 1963 .

[45] Martin D. S. Braine,et al. The Ontogeny of English Phrase Structure: The First Phase , 1963 .

[46] Eugene H. Spafford,et al. Authorship analysis: identifying the author of a program , 1997, Comput. Secur..

[47] Alan M. Frieze,et al. Min-Wise Independent Permutations , 2000, J. Comput. Syst. Sci..

[48] Vlado Keselj,et al. Detection of New Malicious Code Using N-grams Signatures , 2004, PST.

[49] Pavel Paclík,et al. Does SVM Really Scale Up to Large Bag of Words Feature Spaces? , 2007, IDA.

[50] Judithe Sheard,et al. Addressing student cheating: definitions and solutions , 2003, ACM SIGCSE Bull..

[51] Stephen G. MacDonell,et al. Software Metrics Data Analysis—Exploring the Relative Performance of Some Commonly Used Modeling Techniques , 1999, Empirical Software Engineering.

[52] Andrew Turpin,et al. Temporally Robust Software Features for Authorship Attribution , 2009, 2009 33rd Annual IEEE International Computer Software and Applications Conference.

[53] Harris Drucker,et al. Support vector machines for spam categorization , 1999, IEEE Trans. Neural Networks.

[54] Andrian Marcus,et al. An information retrieval approach to concept location in source code , 2004, 11th Working Conference on Reverse Engineering.

[55] Ian H. Witten,et al. Data mining: practical machine learning tools and techniques, 3rd Edition , 1999 .

[56] Efstathios Stamatatos,et al. Source Code Authorship Analysis For Supporting the Cybercrime Investigation Process , 2010, Handbook of Research on Computational Forensics, Digital Crime, and Investigation.

[57] Andrei Z. Broder,et al. On the resemblance and containment of documents , 1997, Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No.97TB100171).

[58] Andrew Turpin,et al. Application of Information Retrieval Techniques for Source Code Authorship Attribution , 2009, DASFAA.

[59] Peter Vamplew,et al. An Anti-Plagiarism Editor for Software Development Courses , 2005, ACE.

[60] Efstathios Stamatatos. A survey of modern authorship attribution methods , 2009 .

[61] Chris F. Kemerer,et al. An Empirical Approach to Studying Software Evolution , 1999, IEEE Trans. Software Eng..

[62] Efstathios Stamatatos. Author Identification Using Imbalanced and Limited Training Texts , 2007 .

[63] Fred G. Harold. Experimental evaluation of program quality using external metrics , 1986 .

[64] Jaana Kekäläinen,et al. Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[65] Efstathios Stamatatos,et al. Webpage Genre Identification Using Variable-Length Character n-Grams , 2007 .

[66] Ian H. Witten,et al. WEKA: a machine learning workbench , 1994, Proceedings of ANZIIS '94 - Australian New Zealand Intelligent Information Systems Conference.

[67] Andrei Z. Broder,et al. Identifying and Filtering Near-Duplicate Documents , 2000, CPM.

[68] James T. Neill,et al. Who cheats at university? A self-report study of dishonest academic behaviours in a sample of Australian university students , 2005 .

[69] Efstathios Stamatatos,et al. Computer-Based Authorship Attribution Without Lexical Measures , 2001, Comput. Humanit..

[70] John R. Anderson,et al. Learning to Program in LISP , 1984, Cogn. Sci..

[71] Hector Garcia-Molina,et al. SCAM: A Copy Detection Mechanism for Digital Documents , 1995, DL.

[72] Vlado Keselj,et al. N-gram-based detection of new malicious code , 2004, Proceedings of the 28th Annual International Computer Software and Applications Conference, 2004. COMPSAC 2004..

[73] Michael Philippsen,et al. Finding Plagiarisms among a Set of Programs with JPlag , 2002, J. Univers. Comput. Sci..

[74] Eero Hyvönen,et al. CEUR Workshop Proceedings , 2008 .

[75] Christian S. Collberg,et al. Self-plagiarism in computer science , 2005, CACM.

[76] H. Altay Güvenir,et al. Classification by Voting Feature Intervals , 1997, ECML.

[77] Samuel L. Grier,et al. A tool that detects plagiarism in Pascal programs , 1981, SIGCSE '81.

[78] Alistair Moffat,et al. Rank-biased precision for measurement of retrieval effectiveness , 2008, TOIS.

[79] Fazli Can,et al. Change of Writing Style with Time , 2004, Comput. Humanit..

[80] Efstathios Stamatatos,et al. Automatic Authorship Attribution , 1999, EACL.

[81] Rong Zheng,et al. Authorship Analysis in Cybercrime Investigation , 2003, ISI.

[82] Fabrizio Sebastiani,et al. Machine learning in automated text categorisation: a survey , 1999 .

[83] Shlomo Argamon,et al. Style mining of electronic messages for multiple authorship discrimination: first results , 2003, KDD '03.

[84] Michelle Craig,et al. Plagiarism detection using feature-based neural networks , 2007, SIGCSE.

[85] Justin Zobel. Uni Cheats Racket: A Case Study in Plagiarism Investigation , 2004, ACE.

[86] Hinrich Schütze,et al. Introduction to information retrieval , 2008 .

[87] Bin Ma,et al. Chain letters & evolutionary histories. , 2003, Scientific American.

[88] Ying Zhao,et al. Authorship Attribution Via Combination of Evidence , 2007, ECIR.

[89] Erkki Sutinen,et al. Fast Plagiarism Detection System , 2005, SPIRE.

[90] Clark S. Lindsey,et al. JavaTech, an Introduction to Scientific and Technical Computing with Java , 2005 .

[91] Ahmad-Reza Sadeghi,et al. Advanced techniques for dispute resolving and authorship proofs on digital works , 2003, IS&T/SPIE Electronic Imaging.

[92] Stefanos Gritzalis,et al. Supporting the cybercrime investigation process: Effective discrimination of source code authors based on byte-level information , 2005, ICETE.

[93] Luis Gravano,et al. dSCAM: finding document copies across multiple databases , 1996, Fourth International Conference on Parallel and Distributed Information Systems.

[94] Sally S. Robinson,et al. An instructional aid for student programs , 1980, SIGCSE '80.

[95] Shlomo Argamon,et al. Automatically Categorizing Written Texts by Author Gender , 2002, Lit. Linguistic Comput..

[96] G. G. Stokes. "J." , 1890, The New Yale Book of Quotations.

[97] E. P. Schan,et al. Recommended C Style and Coding Standards , 1997 .

[98] Eugene H. Spafford,et al. The internet worm program: an analysis , 1989, CCRV.

[99] Bernard De Baets,et al. A Connectionist Fuzzy Case-Based Reasoning Model , 2006, MICAI.

[100] Eric Bauer,et al. An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants , 1999, Machine Learning.

[101] Fuchun Peng,et al. N-Gram-Based Author Profiles for Authorship Attribution , 2003 .

[102] Cristian Grozea,et al. ENCOPLOT: Pairwise Sequence Matching in Linear Time Applied to Plagiarism Detection , 2009 .

[103] Anat Rachel Shimoni,et al. Gender, genre, and writing style in formal written texts , 2003 .

[104] J. Howard Johnson,et al. Identifying redundancy in source code using fingerprints , 1993, CASCON.

[105] Thomas Lavergne. Unnatural language detection , 2006, CORIA.

[106] Benno Stein,et al. Intrinsic Plagiarism Detection , 2006, ECIR.

[107] Sriram Raghavan,et al. Searching the Web , 2001, ACM Trans. Internet Techn..

[108] John D. Burger,et al. An Exploration of Observable Features Related to Blogger Age , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[109] Spiros Mancoridis,et al. On the Use of Discretized Source Code Metrics for Author Identification , 2009, 2009 1st International Symposium on Search Based Software Engineering.

[110] Gilad Mishne,et al. Source Code Retrieval using Conceptual Similarity , 2004, RIAO.

[111] Patrick Juola,et al. Authorship Attribution , 2008, Found. Trends Inf. Retr..

[112] Roland H. Untch,et al. A small and secure submission system for UNIX systems , 2005, ACM-SE 43.

[113] Robert Parry,et al. Third degree. , 1997, Nursing standard (Royal College of Nursing (Great Britain) : 1987).

[114] Gregory A. Hall,et al. Toward Defining the Intersection of Forensics and Information Technology , 2005, Int. J. Digit. Evid..

[115] Diana Inkpen,et al. Using the Complexity of the Distribution of Lexical Elements as a Feature in Authorship Attribution , 2008, LREC.

[116] Glenn Gamst,et al. Applied Multivariate Research: Design and Interpretation , 2005 .

[117] Alan Nash,et al. The Elements of C Programming Style , 1992 .

[118] Arthur M. Lesk,et al. Introduction to bioinformatics , 2002 .

[119] Alberto Barrón-Cedeño,et al. Reducing the Plagiarism Detection Search Space on the Basis of the Kullback-Leibler Distance , 2009, CICLing.

[120] Maxim Mozgovoy. Enhancing Computer-Aided Plagiarism Detection , 2008 .

[121] M S Waterman,et al. Identification of common molecular subsequences. , 1981, Journal of molecular biology.

[122] M Hamilton,et al. Educating students about plagiarism avoidance - A computer science perspective , 2004 .

[123] Stefanos Gritzalis,et al. Effective identification of source code authors using byte-level information , 2006, ICSE.

[124] Ann-Marie Lancaster,et al. A plagiarism detection system , 1981, SIGCSE '81.

[125] H. E. Dunsmore,et al. Software engineering metrics and models , 1986 .

[126] Kenneth J. Stevens,et al. The Introduction and Assessment of Three Teaching Tools (WebCT, Mindtrail, EVE) into a Post Graduate Course , 2002, J. Inf. Technol. Educ..

[127] Seyed M. M. Tahaghoghi,et al. Plagiarism detection across programming languages , 2006, ACSC.

[128] George M. Mohay,et al. Mining e-mail content for author identification forensics , 2001, SGMD.

[129] Alistair Moffat,et al. Exploring the similarity space , 1998, SIGF.

[130] Eugene H. Spafford,et al. The internet worm: crisis and aftermath , 1989 .

[131] Patrick Juola,et al. A Controlled-corpus Experiment in Authorship Identification by Cross-entropy , 2003 .

[132] Marcus A. Maloof,et al. Learning to detect malicious executables in the wild , 2004, KDD.

[133] Edsger W. Dijkstra,et al. Go to Statement Considered Harmful (Reprint) , 2002, Software Pioneers.

[134] Justin Zobel,et al. Effective and Scalable Authorship Attribution Using Function Words , 2005, AIRS.

[135] Stephen G. MacDonell,et al. IDENTIFIED (Integrated Dictionary-based Extraction of Non-language-dependent Token Information for Forensic Identification, Examination, and Discrimination): a dictionary-based system for extracting source code metrics for software forensics , 1998, Proceedings. 1998 International Conference Software Engineering: Education and Practice (Cat. No.98EX220).

[136] Lloyd A. Smith,et al. Practical feature subset selection for machine learning , 1998 .

[137] Steven Garcia,et al. RMIT University at TREC 2005: Terabyte and Robust Track , 2005, TREC.

[138] Justin Zobel,et al. Methods for Identifying Versioned and Plagiarized Documents , 2003, J. Assoc. Inf. Sci. Technol..

[139] Stephen G. MacDonell,et al. Applications of fuzzy logic to software metric models for development effort estimation , 1997, 1997 Annual Meeting of the North American Fuzzy Information Processing Society - NAFIPS (Cat. No.97TH8297).

[140] Sally Jo Cunningham,et al. Applications of machine learning in information retrieval , 1999 .

[141] Carl Eklund,et al. National Institute for Standards and Technology , 2009, Encyclopedia of Biometrics.

[142] Andrew Walenstein,et al. Malware phylogeny generation using permutations of code , 2005, Journal in Computer Virology.

[143] Elad Yom-Tov,et al. Serial Sharers: Detecting Split Identities of Web Authors , 2007, PAN.

[144] E. Eugene Schultz,et al. Beyond preliminary analysis of the WANK and OILZ worms: a case study of malicious code , 1993, Comput. Secur..

[145] Spiros Mancoridis,et al. A Probabilistic Approach to Source Code Authorship Identification , 2007, Fourth International Conference on Information Technology (ITNG'07).

[146] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.

[147] Chunju Tseng,et al. The Arizona IDMatcher: developing an identity matching tool for law enforcement , 2007, DG.O.

[148] Shlomo Argamon,et al. Computational methods in authorship attribution , 2009 .

[149] Alistair Moffat,et al. Inverted Index Compression Using Word-Aligned Binary Codes , 2004, Information Retrieval.

[150] Deborah G. Johnson,et al. Australian Computer Society Code of Ethics Project (Part 1) , 2004 .

[151] Ward E. Y. Elliott,et al. And then there were none: Winnowing the Shakespeare claimants , 1996, Comput. Humanit..

[152] Mehmet M. Dalkilic,et al. Using Compression to Identify Classes of Inauthentic Texts , 2006, SDM.

[153] Edward L. Jones. Metrics Based Plagiarism Monitoring , 2001 .

[154] Thomas P. Way,et al. SNITCH: a software tool for detecting cut and paste plagiarism , 2006, SIGCSE '06.

[155] Anas N. Al-Rabadi,et al. A comparison of modified reconstructability analysis and Ashenhurst‐Curtis decomposition of Boolean functions , 2004 .

[156] Jean-Marc Jézéquel,et al. Design by Contract: The Lessons of Ariane , 1997, Computer.

[157] Justin Zobel,et al. Music Ranking Techniques Evaluated , 2000, ISMIR.

[158] Daniel S. Hirschberg,et al. A linear space algorithm for computing maximal common subsequences , 1975, Commun. ACM.

[159] Stephen E. Robertson,et al. A probabilistic model of information retrieval: development and comparative experiments - Part 1 , 2000, Inf. Process. Manag..

[160] Stephen G. MacDonell,et al. IDENTIFIED: software authorship analysis with case-based reasoning , 1998 .

[161] Charlie Daly,et al. A Technique for Detecting Plagiarism in Computer Code , 2005, Comput. J..

[162] Stefanos Gritzalis,et al. Examining the significance of high-level programming features in source code author classification , 2008, J. Syst. Softw..

[163] Michael Gamon,et al. Obfuscating Document Stylometry to Preserve Author Anonymity , 2006, ACL.

[164] ChengXiang Zhai,et al. A study of smoothing methods for language models applied to information retrieval , 2004, TOIS.

[165] Stephen G. MacDonell,et al. Software Forensics: Extending Authorship Analysis Techniques to Computer Programs , 2002 .

[166] Cynthia A. Phillips,et al. Constructing Computer Virus Phylogenies , 1996, J. Algorithms.

[167] Curtis R. Cook,et al. A paradigm for programming style research , 1988, SIGP.

[168] Efstathios Stamatatos,et al. Automatic Text Categorization In Terms Of Genre and Author , 2000, CL.

[169] Daniel Shawcross Wilkerson,et al. Winnowing: local algorithms for document fingerprinting , 2003, SIGMOD '03.

[170] George Fernandez,et al. Weblearn: a common gateway interface (CGI)-based environment for interactive learning , 2001 .