Teaching Dimension and the Complexity of Active Learning

We study the label complexity of pool-based active learning in the PAC model with noise. Taking inspiration from the existing literature on Exact learning with membership queries, we derive upper and lower bounds on the label complexity in terms of generalizations of the extended teaching dimension. Among the contributions of this work is the first nontrivial general upper bound on label complexity in the presence of persistent classification noise.
