Classification with noisy labels: "Multiple Account" cheating detection in Open Online Courses

Massive Open Online Courses (MOOCs) have the potential to enhance socioeconomic mobility through education. Yet the viability of this outcome largely depends on the reputation of MOOC certificates as a credible academic credential. I describe a cheating strategy that threatens this reputation and could render the MOOC certificate valueless. The strategy, Copying Answers using Multiple Existences Online (CAMEO), involves a user who gathers solutions to assessment questions using one or more harvester accounts and then submits correct answers using one or more separate master accounts. To estimate a lower bound for CAMEO prevalence among 1.9 million course participants in 115 HarvardX and MITx courses, I introduce a filter-based CAMEO detection algorithm and use a small-scale experiment to verify CAMEO use with certainty. I identify preventive strategies that can decrease CAMEO rates and show evidence of their effectiveness in science courses. Because the CAMEO algorithm is designed to produce a lower-bound estimate, it fails to detect many CAMEO cheaters. As a novelty of this thesis, instead of addressing the shortcomings of the CAMEO algorithm directly, I treat the CAMEO algorithm as a method for producing noisy predicted cheating labels. A solution to the more general problem of binary classification with noisy labels ($\tilde{P}\tilde{N}$ learning) is then also a solution to CAMEO cheating detection. $\tilde{P}\tilde{N}$ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate $\rho_1$ for positive examples and $\rho_0$ for negative examples. I propose Rank Pruning to solve $\tilde{P}\tilde{N}$ learning and the open problem of estimating the noise rates. Unlike prior solutions, Rank Pruning is efficient and general, requiring $O(T)$ time for any unrestricted choice of probabilistic classifier with fitting time $T$.
I prove that Rank Pruning achieves consistent noise estimation and the same expected risk as learning with uncorrupted labels in ideal conditions, and I derive closed-form solutions when conditions are non-ideal. Rank Pruning achieves state-of-the-art noise rate estimation, F1, error, and AUC-PR on the MNIST and CIFAR datasets, regardless of noise rates. Notably, Rank Pruning with a CNN classifier can predict whether an MNIST digit is a one or not with only 0.25% error, and with 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples. Rank Pruning achieves similarly strong results when as much as 50% of the training examples are just noise drawn from a third distribution. Together, the CAMEO and Rank Pruning algorithms allow for a robust, general, and time-efficient solution to the CAMEO cheating detection problem. By ensuring the validity of MOOC credentials, we enable MOOCs to achieve both openness and value, and thus take one step closer to the greater goal of democratizing education.
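
The label-noise setting described above (positive and negative examples flipped independently at two different rates) can be illustrated with a small simulation. This is only a sketch of the noise process, not of Rank Pruning itself; the function names `flip_labels` and `empirical_noise_rates`, the parameter names `rho1` and `rho0`, and the class sizes are assumptions made for illustration and do not come from the thesis.

```python
import random


def flip_labels(y, rho1, rho0, seed=0):
    """Return noisy labels: each positive (1) is flipped to 0 with
    probability rho1, each negative (0) is flipped to 1 with probability rho0.
    Names and signature are hypothetical, chosen for this sketch."""
    rng = random.Random(seed)
    noisy = []
    for label in y:
        flip_prob = rho1 if label == 1 else rho0
        flipped = rng.random() < flip_prob
        noisy.append(1 - label if flipped else label)
    return noisy


def empirical_noise_rates(y, s):
    """Measure the fraction of flipped positives and flipped negatives.
    Assumes y contains at least one positive and one negative example."""
    n_pos = sum(1 for t in y if t == 1)
    n_neg = len(y) - n_pos
    pos_flipped = sum(1 for t, o in zip(y, s) if t == 1 and o == 0)
    neg_flipped = sum(1 for t, o in zip(y, s) if t == 0 and o == 1)
    return pos_flipped / n_pos, neg_flipped / n_neg


# Illustrative data: 20,000 true positives and 80,000 true negatives.
y = [1] * 20000 + [0] * 80000
s = flip_labels(y, rho1=0.5, rho0=0.1)
rho1_hat, rho0_hat = empirical_noise_rates(y, s)
print(f"flipped positives: {rho1_hat:.3f}, flipped negatives: {rho0_hat:.3f}")
```

In practice the true labels `y` are not available, which is why estimating the noise rates from the noisy labels alone is the hard part; the simulation only makes the definition of the two rates concrete.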

[1]  Emiliano Miluzzo,et al.  A survey of mobile phone sensing , 2010, IEEE Communications Magazine.

[2]  Rong Jin,et al.  Multiple Kernel Learning from Noisy Labels by Stochastic Programming , 2012, ICML.

[3]  Mark Goadrich,et al.  The relationship between Precision-Recall and ROC curves , 2006, ICML.

[4]  Thomas G. Dietterich Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.

[5]  Nathan Intrator,et al.  Bootstrapping with Noise: An Effective Regularization Technique , 1996, Connect. Sci..

[6]  John R. Anderson,et al.  Cognitive Tutors: Lessons Learned , 1995 .

[7]  Alexander G. Hauptmann,et al.  Massive Open Online Proctor: Protecting the Credibility of MOOCs certificates , 2015, CSCW.

[8]  S. G. Ponnambalam,et al.  Trends in Intelligent Robotics, Automation, and Manufacturing , 2012, Communications in Computer and Information Science.

[9]  Albert Fornells,et al.  A study of the effect of different types of noise on the precision of supervised learning techniques , 2010, Artificial Intelligence Review.

[10]  Joseph Jay Williams,et al.  HarvardX and MITx: Two Years of Open Online Courses Fall 2012-Summer 2014 , 2015 .

[11]  Diane J. Prince,et al.  Comparisons of Proctored versus Non-Proctored Testing Strategies in Graduate Distance Education Curriculum. , 2011 .

[12]  Olivier Chapelle,et al.  Model Selection for Support Vector Machines , 1999, NIPS.

[13]  George Siemens,et al.  The MOOC model for digital practice , 2010 .

[14]  Donald L. Mccabe,et al.  Cheating: Why Students Do It and How We Can Help Them Stop. , 2001 .

[15]  Ian H. Witten,et al.  One-Class Classification by Combining Density and Class Probability Estimation , 2008, ECML/PKDD.

[16]  Gayle S. Christensen,et al.  The MOOC Phenomenon: Who Takes Massive Open Online Courses and Why? , 2013 .

[17]  Rasil Warnakulasooriya,et al.  Patterns, correlates, and reduction of homework copying , 2010 .

[18]  Richard G. Baraniuk,et al.  Bayesian pairwise collaboration detection in educational datasets , 2013, 2013 IEEE Global Conference on Signal and Information Processing.

[19]  Gilles Blanchard,et al.  Classification with Asymmetric Label Noise: Consistency and Maximal Denoising , 2013, COLT.

[20]  Alex B. Van Zant,et al.  “I can't lie to your face”: Minimal face-to-face interaction promotes honesty , 2014 .

[21]  David G. Rand,et al.  The online laboratory: conducting experiments in a real labor market , 2010, ArXiv.

[22]  E. Xing,et al.  Towards an Integration of Text and Graph Clustering Methods as a Lens for Studying Social Interaction in MOOCs , 2014 .

[23]  E. L. Lehmann,et al.  Theory of point estimation , 1950 .

[24]  G. O. Wesolowsky,et al.  Detecting excessive similarity in answers on multiple choice exams , 2000 .

[25]  Giora Alexandron,et al.  Evidence of MOOC Students Using Multiple Accounts to Harvest Correct Answers , 2015 .

[26]  Justin Reich,et al.  HarvardX and MITx: The First Year of Open Online Courses, Fall 2012-Summer 2013 , 2014 .

[27]  Edward Cutrell,et al.  Measuring and Maximizing the Effectiveness of Honor Codes in Online Courses , 2015, L@S.

[28]  J. R. Quinlan Induction of decision trees , 2004, Machine Learning.

[29]  James G. Mazoué,et al.  The MOOC Model: Challenging Traditional Education , 2014 .

[30]  Pietro Perona,et al.  Pruning training sets for learning of object categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[31]  Xiaogang Wang,et al.  Learning from massive noisy labeled data for image classification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  K. Koedinger,et al.  Exploring the Assistance Dilemma in Experiments with Cognitive Tutors , 2007 .

[33]  Isaac L. Chuang,et al.  Detecting and preventing "multiple-account" cheating in massive open online courses , 2015, Comput. Educ..

[34]  Xiaoqian Jiang,et al.  Predicting accurate probabilities with a ranking loss , 2012, ICML.

[35]  Jukka Mäkelä,et al.  From learning to e-learning to m-learning to c-learning to …? , 2014, 2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO).

[36]  Jean-Philippe Vert,et al.  A bagging SVM to learn from positive and unlabeled examples , 2010, Pattern Recognit. Lett..

[37]  Dana Angluin,et al.  Learning from noisy examples , 1988, Machine Learning.

[38]  Linda Klebe Trevino,et al.  Academic Integrity in Honor Code and Non-Honor Code Environments: A Qualitative Investigation , 1999 .

[39]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[40]  Robert J. Kauffman,et al.  The effects of shilling on final bid prices in online auctions , 2005, Electron. Commer. Res. Appl..

[41]  Edward Cutrell,et al.  Deterring Cheating in Online Environments , 2015, TCHI.

[42]  Hangjung Zo,et al.  Understanding the MOOCs continuance: The role of openness and reputation , 2015, Comput. Educ..

[43]  James A. Wollack,et al.  A Nominal Response Model Approach for Detecting Answer Copying , 1997 .

[44]  Rob Phillips,et al.  Something for everyone: MOOC Design for informing dementia education and research , 2013 .

[45]  Dumitru Erhan,et al.  Training Deep Neural Networks on Noisy Labels with Bootstrapping , 2014, ICLR.

[46]  Damminda Alahakoon,et al.  Minority report in fraud detection: classification of skewed data , 2004, SKDD.

[47]  Alexander Gammerman,et al.  Learning by Transduction , 1998, UAI.

[48]  L. Treviño,et al.  Academic Dishonesty: Honor Codes and Other Contextual Influences , 1993 .

[49]  Justin Reich,et al.  Socioeconomic status and MOOC enrollment: enriching demographic information with external datasets , 2015, LAK.

[50]  Nagarajan Natarajan,et al.  Learning with Noisy Labels , 2013, NIPS.

[51]  Steve Kolowich Behind the Webcam's Watchful Eye, Online Proctoring Takes Hold. , 2013 .

[52]  Niels Provos,et al.  A Virtual Honeypot Framework , 2004, USENIX Security Symposium.

[53]  Matthew Zook Your Urgent Assistance is Requested: The Intersection of 419 Spam and New Networks of Imagination , 2007 .

[54]  Andrew D. Ho,et al.  Changing “Course” , 2014 .

[55]  Dacheng Tao,et al.  Classification with Noisy Labels by Importance Reweighting , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[56]  K. Worsley An improved Bonferroni inequality and applications , 1982 .

[57]  David F. Mastin,et al.  Online Academic Integrity , 2009 .

[58]  Richard G. Baraniuk,et al.  Tag-Aware Ordinal Sparse Factor Analysis for Learning and Content Analytics , 2014, EDM.

[59]  W. Gilks Markov Chain Monte Carlo , 2005 .

[60]  Michele Colajanni,et al.  HoneySpam: Honeypots Fighting Spam at the Source , 2005, SRUTI.

[61]  Clayton Scott,et al.  A Rate of Convergence for Mixture Proportion Estimation, with Application to Learning from Noisy Labels , 2015, AISTATS.

[62]  Malik Yousef,et al.  One-Class SVMs for Document Classification , 2002, J. Mach. Learn. Res..

[63]  Aditya Simha Cheating in College—Why Students Do It and What Educators Can Do About It , 2014 .

[64]  Gregory B. Northcraft,et al.  To be or not to be trusted: The influence of media richness on defection and deception , 2008 .

[65]  Felix C. Freiling,et al.  The Nepenthes Platform: An Efficient Approach to Collect Malware , 2006, RAID.

[66]  Deborah A. Raines,et al.  Cheating in Online Courses: The Student Definition. , 2011 .

[67]  M. M. Moya,et al.  One-class classifier networks for target recognition applications , 1993 .

[68]  Johan A. K. Suykens,et al.  A robust ensemble approach to learn from positive and unlabeled data using SVM base models , 2014, Neurocomputing.

[69]  Manuel Blum,et al.  Time Bounds for Selection , 1973, J. Comput. Syst. Sci..

[70]  James A. Wollack,et al.  DETECTING ANSWER COPYING ON HIGH-STAKES TESTS , 2004 .

[71]  D. Nicol,et al.  Formative assessment and self‐regulated learning: a model and seven principles of good feedback practice , 2006 .

[72]  Frank M. LoSchiavo,et al.  The Impact of an Honor Code on Cheating in Online Courses , 2011 .

[73]  Richard G. Baraniuk,et al.  Collaboration-Type Identification in Educational Datasets. , 2014, EDM 2014.

[74]  Ryan Shaun Joazeiro de Baker,et al.  Detecting Student Misuse of Intelligent Tutoring Systems , 2004, Intelligent Tutoring Systems.

[75]  Marion Waite,et al.  Learning in a Small, Task-Oriented, Connectivist MOOC: Pedagogical Issues and Implications for Higher Education. , 2013 .

[76]  V. Shute Focus on Formative Feedback , 2008 .

[77]  Bing Liu,et al.  Learning with Positive and Unlabeled Examples Using Weighted Logistic Regression , 2003, ICML.

[78]  C. Hoxby,et al.  The Economics of Online Postsecondary Education: Moocs, Nonselective Education, and Highly Selective Education , 2014 .

[79]  Gilles Blanchard,et al.  Semi-Supervised Novelty Detection , 2010, J. Mach. Learn. Res..

[80]  Carolyn Penstein Rosé,et al.  Tutorial Dialogue as Adaptive Collaborative Learning Support , 2007, AIED.

[81]  Anna N. Rafferty,et al.  Computer-Guided Inquiry to Improve Science Learning , 2014, Science.

[82]  Philip S. Yu,et al.  Building text classifiers using positive and unlabeled examples , 2003, Third IEEE International Conference on Data Mining.

[83]  Andy Liaw,et al.  Classification and Regression by randomForest , 2007 .

[84]  Kamal Nigam,et al.  Understanding the Behavior of Co-training , 2000, KDD 2000.

[85]  Panagiotis G. Ipeirotis,et al.  Running Experiments on Amazon Mechanical Turk , 2010, Judgment and Decision Making.

[86]  Michael F. Young,et al.  Digital plagiarism: An experimental study of the effect of instructional goals and copy-and-paste affordance , 2015, Comput. Educ..

[87]  Sarah Kellogg,et al.  Online learning: How to make a MOOC , 2013, Nature.

[88]  David W. Aha,et al.  Instance-Based Learning Algorithms , 1991, Machine Learning.

[89]  Georgia Kosmopoulou,et al.  Auctions with shill bidding , 2004 .

[90]  Ryan Shaun Joazeiro de Baker,et al.  Off-task behavior in the cognitive tutor classroom: when students "game the system" , 2004, CHI.

[91]  Charles Elkan,et al.  Learning classifiers from only positive and unlabeled data , 2008, KDD.

[92]  James J. Jiang A Literature Survey on Domain Adaptation of Statistical Classifiers , 2007 .

[93]  Deborah A. Fields,et al.  Cheating in virtual worlds: transgressive designs for learning , 2009 .

[94]  J. Reich,et al.  Democratizing education? Examining access and usage patterns in massive open online courses , 2015, Science.