Unsupervised Supervised Learning I: Estimating Classification and Regression Errors without Labels

Estimating the error rates of classifiers or regression models is a fundamental task in machine learning that has thus far been studied exclusively using supervised learning techniques. We propose a novel unsupervised framework for estimating these error rates using only unlabeled data and mild assumptions. We prove consistency results for the framework and demonstrate its practical applicability on both synthetic and real-world data.
