On-line learning with malicious noise and the closure algorithm

We investigate a variant of the on-line learning model for classes of {0,1}-valued functions (concepts) in which the labels of a certain number of the input instances are corrupted by adversarial noise. We propose an extension of a general learning strategy, known as the “Closure Algorithm”, to this noise model, and show a worst-case mistake bound of m + (d+1)K for learning an arbitrary intersection-closed concept class C, where K is the number of noisy labels, d is a combinatorial parameter measuring C's complexity, and m is the worst-case mistake bound of the Closure Algorithm for learning C in the noise-free model. For several concept classes our extended Closure Algorithm is efficient and can tolerate a noise rate up to the information-theoretic upper bound. Finally, we show how to efficiently turn any algorithm for the on-line noise model into a learning algorithm for the PAC model with malicious noise.
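As a point of reference, the following is a minimal sketch of the standard noise-free Closure Algorithm, not the noise-tolerant extension proposed here. It assumes, purely for illustration, that the intersection-closed class is the set of axis-parallel boxes over integer points; the names `ClosureLearner` and `run_online` are hypothetical. The hypothesis is always the smallest concept containing every positive example seen so far, so in the noise-free setting the learner errs only on positive instances.

```python
from typing import Iterable, List, Optional, Tuple

Point = Tuple[int, ...]


class ClosureLearner:
    """Noise-free Closure Algorithm, illustrated on axis-parallel boxes.

    Assumption: the concept class is the set of axis-parallel boxes,
    which is closed under intersection. The hypothesis is the bounding
    box (closure) of all positive examples observed so far.
    """

    def __init__(self) -> None:
        # None encodes the empty hypothesis (no positive example seen yet).
        self.lo: Optional[List[int]] = None
        self.hi: Optional[List[int]] = None

    def predict(self, x: Point) -> int:
        # Predict 1 only if x lies inside the current closure.
        if self.lo is None or self.hi is None:
            return 0
        return int(all(l <= xi <= h for l, xi, h in zip(self.lo, x, self.hi)))

    def update(self, x: Point, label: int) -> None:
        # Negative examples never change the closure.
        if label != 1:
            return
        if self.lo is None or self.hi is None:
            self.lo, self.hi = list(x), list(x)
        else:
            # Grow the box just enough to contain the new positive point.
            self.lo = [min(l, xi) for l, xi in zip(self.lo, x)]
            self.hi = [max(h, xi) for h, xi in zip(self.hi, x)]


def run_online(stream: Iterable[Tuple[Point, int]]) -> int:
    """Run the on-line protocol (predict, then see the label); count mistakes."""
    learner = ClosureLearner()
    mistakes = 0
    for x, y in stream:
        if learner.predict(x) != y:
            mistakes += 1
        learner.update(x, y)
    return mistakes


# Hypothetical target concept: the box [0, 2] x [0, 2], labels uncorrupted.
stream = [
    ((1, 1), 1),
    ((5, 5), 0),
    ((0, 2), 1),
    ((1, 2), 1),
    ((2, 0), 1),
    ((2, 2), 1),
    ((3, 1), 0),
]
mistakes = run_online(stream)
```

With a corrupted label, this plain version can absorb a false positive into its closure and never shrink back, which is the failure mode a noise-tolerant extension has to address.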
