Paper Information - Classification with noisy labels: "Multiple Account" cheating detection in Open Online Courses

Massive Open Online Courses (MOOCs) have the potential to enhance socioeconomic mobility through education. Yet the viability of this outcome largely depends on the reputation of MOOC certificates as a credible academic credential. I describe a cheating strategy that threatens this reputation and could render the MOOC certificate valueless. The strategy, Copying Answers using Multiple Existences Online (CAMEO), involves a user who gathers solutions to assessment questions using one or more harvester accounts and then submits correct answers using one or more separate master accounts. To estimate a lower bound for CAMEO prevalence among 1.9 million course participants in 115 HarvardX and MITx courses, I introduce a filter-based CAMEO detection algorithm and use a small-scale experiment to verify CAMEO use with certainty. I identify preventive strategies that can decrease CAMEO rates and show evidence of their effectiveness in science courses. Because the CAMEO algorithm functions as a lower bound estimate, it fails to detect many CAMEO cheaters. As a novelty of this thesis, instead of addressing the shortcomings of the CAMEO algorithm directly, I recognize that the CAMEO algorithm can be viewed as a method for producing noisy predicted cheating labels. A solution to the more general problem of binary classification with noisy labels ($\tilde{P}\tilde{N}$ learning) is then also a solution to CAMEO cheating detection. $\tilde{P}\tilde{N}$ learning is the problem of binary classification when training examples may be mislabeled (flipped) uniformly with noise rate $\rho_1$ for positive examples and $\rho_0$ for negative examples. I propose Rank Pruning to solve $\tilde{P}\tilde{N}$ learning and the open problem of estimating the noise rates. Unlike prior solutions, Rank Pruning is efficient and general, requiring $\mathcal{O}(T)$ time for any choice of probabilistic classifier with fitting time $T$.
I prove that Rank Pruning achieves consistent noise estimation and the same expected risk as learning with uncorrupted labels under ideal conditions, and I derive closed-form solutions for non-ideal conditions. Rank Pruning achieves state-of-the-art noise rate estimation, F1, error, and AUC-PR on the MNIST and CIFAR datasets, regardless of noise rates. To highlight, Rank Pruning with a CNN classifier can predict whether an MNIST digit is a one or not a one with only 0.25% error, and achieves 0.46% error across all digits, even when 50% of positive examples are mislabeled and 50% of observed positive labels are mislabeled negative examples. Rank Pruning achieves similarly strong results when as much as 50% of training examples are actually just noise drawn from a third distribution. Together, the CAMEO and Rank Pruning algorithms allow for a robust, general, and time-efficient solution to the CAMEO cheating detection problem. By ensuring the validity of MOOC credentials, we enable MOOCs to achieve both openness and value, and thus take one step closer to the greater goal of democratization of education.
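The noise model described in the abstract can be illustrated with a short simulation. The following is a minimal sketch, not the thesis implementation: the function names `flip_labels` and `empirical_noise_rates`, the sample sizes, and the chosen rates are assumptions made for illustration. It flips each true positive label with one probability and each true negative label with another, then measures the realized flip rates.

```python
import random


def flip_labels(labels, rho1, rho0, seed=0):
    """Return a noisy copy of binary labels (hypothetical helper).

    rho1: probability that a true positive (1) is observed as 0.
    rho0: probability that a true negative (0) is observed as 1.
    """
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        flip_prob = rho1 if y == 1 else rho0
        noisy.append(1 - y if rng.random() < flip_prob else y)
    return noisy


def empirical_noise_rates(labels, noisy):
    """Measure realized flip rates; requires the true labels (hypothetical helper)."""
    observed_for_pos = [s for y, s in zip(labels, noisy) if y == 1]
    observed_for_neg = [s for y, s in zip(labels, noisy) if y == 0]
    rho1_hat = observed_for_pos.count(0) / len(observed_for_pos)
    rho0_hat = observed_for_neg.count(1) / len(observed_for_neg)
    return rho1_hat, rho0_hat


# Illustrative data: 5000 true positives and 5000 true negatives.
true_labels = [1] * 5000 + [0] * 5000
observed = flip_labels(true_labels, rho1=0.5, rho0=0.2)
rho1_hat, rho0_hat = empirical_noise_rates(true_labels, observed)
print(f"rho1 estimate: {rho1_hat:.3f}, rho0 estimate: {rho0_hat:.3f}")
```

The measurement step here relies on knowing the true labels, which is only possible in a simulation. In the setting the abstract describes, only the observed labels are available, so the noise rates have to be estimated from the data, which is the open problem Rank Pruning is proposed to address.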

Curtis G. Northcutt
