Smoothness, Disagreement Coefficient, and the Label Complexity of Agnostic Active Learning

We study pool-based active learning in the presence of noise, that is, in the agnostic setting. It is known that the effectiveness of agnostic active learning depends on the learning problem and the hypothesis space. Although there are many cases in which active learning is very useful, it is also easy to construct examples in which no active learning algorithm can have an advantage over passive learning. Previous works have shown that the label complexity of active learning depends on the disagreement coefficient, which often characterizes the intrinsic difficulty of the learning problem. In this paper, we study the disagreement coefficient of classification problems for which the classification boundary is smooth and the data distribution has a density that can be bounded by a smooth function. We prove upper and lower bounds on the disagreement coefficients of both finitely and infinitely smooth problems. Combined with existing results, our bounds show that active learning is superior to passive supervised learning for smooth problems.
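For context, the disagreement coefficient referred to above is usually taken in the sense of Hanneke. A standard formulation (the exact normalization here is the common one, assumed rather than quoted from this paper): for a hypothesis space H with best-in-class hypothesis h*, let B(h*, r) be the set of hypotheses within disagreement mass r of h*, and let DIS(V) be the region where some pair in V disagrees. Then

\[
\theta(\varepsilon) \;=\; \sup_{r > \varepsilon} \frac{P\big(\mathrm{DIS}(B(h^*, r))\big)}{r},
\qquad
\mathrm{DIS}(V) = \{x : \exists\, h, h' \in V \text{ with } h(x) \neq h'(x)\}.
\]

A bounded theta means that the region of uncertainty shrinks in proportion to the radius of the version space, which is what lets disagreement-based active learners save labels.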
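As a toy illustration (not from the paper; the threshold class, uniform marginal, and all names below are hypothetical choices for the sketch), one can estimate theta by Monte Carlo for one-dimensional threshold classifiers, where the disagreement region of B(h*, r) is an interval of mass roughly 2r, so the coefficient is about 2:

```python
import numpy as np

rng = np.random.default_rng(0)
t_star = 0.5            # true threshold (toy setup)
n = 200_000
x = rng.uniform(0.0, 1.0, size=n)   # uniform marginal on [0, 1]

def dis_mass(r):
    # Probability mass of the disagreement region of the radius-r ball
    # around t_star, for threshold classifiers h_t(x) = 1[x >= t]:
    # thresholds within disagreement r of t* disagree exactly on (t*-r, t*+r).
    lo, hi = max(t_star - r, 0.0), min(t_star + r, 1.0)
    return np.mean((x > lo) & (x < hi))

radii = np.logspace(-3, -0.5, 20)
theta_hat = max(dis_mass(r) / r for r in radii)   # sup over sampled radii
print(f"estimated disagreement coefficient: {theta_hat:.2f}")  # close to 2
```

Smooth-boundary problems in higher dimensions behave less simply, which is exactly where the paper's upper and lower bounds apply.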
