Learning in the Limit with Adversarial Disturbances

We study distribution-dependent, data-dependent learning in the limit with adversarial disturbances. We consider an optimization-based approach to learning binary classifiers from data under worst-case assumptions on the disturbance. The learning process is modeled as a decision-maker who seeks to minimize generalization error, given access only to possibly maliciously corrupted data. Two models for the nature of the disturbance are considered: disturbance in the labels of a certain fraction of the data, and disturbance that also affects the positions of the data points. We provide distribution-dependent bounds on the amount of error as a function of the noise level for the two models, and describe the optimal strategy of the decision-maker as well as the worst-case disturbance.
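To make the label-disturbance model concrete, here is a minimal, self-contained sketch (not the paper's algorithm): an adversary flips the labels of an eta-fraction of a linearly separable sample, greedily choosing the small-margin points where flips are most damaging, and the learner minimizes empirical 0-1 error over randomly drawn linear classifiers. The function names (adversarial_label_flips, minimize_empirical_error), the greedy flipping heuristic, and the random-search learner are all illustrative assumptions, not constructions from the paper.

```python
import numpy as np

def adversarial_label_flips(X, y, w_target, eta):
    """Flip the labels of an eta-fraction of points, picking those whose
    margins under the target rule sign(X @ w_target) are smallest.
    A greedy heuristic illustrating the label-disturbance model only."""
    n = len(y)
    k = int(eta * n)
    margins = y * (X @ w_target)          # small margin = cheap to corrupt
    flip_idx = np.argsort(margins)[:k]
    y_corrupt = y.copy()
    y_corrupt[flip_idx] *= -1
    return y_corrupt

def minimize_empirical_error(X, y, n_candidates=2000, seed=0):
    """Among random linear classifiers, return the one with the lowest
    empirical 0-1 error on the (possibly corrupted) sample."""
    rng = np.random.default_rng(seed)
    best_w, best_err = None, np.inf
    for _ in range(n_candidates):
        w = rng.normal(size=X.shape[1])
        err = np.mean(np.sign(X @ w) != y)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d, eta = 500, 2, 0.1               # sample size, dimension, noise level
    w_true = np.array([1.0, -1.0])        # hypothetical target separator
    X = rng.normal(size=(n, d))
    y = np.sign(X @ w_true)
    y_noisy = adversarial_label_flips(X, y, w_true, eta)
    w_hat, emp_err = minimize_empirical_error(X, y_noisy)
    clean_err = np.mean(np.sign(X @ w_hat) != y)   # error against clean labels
    print(f"empirical error on corrupted sample: {emp_err:.3f}")
    print(f"error against uncorrupted labels:    {clean_err:.3f}")
```

Random search over candidate directions stands in for the paper's optimization-based decision-maker only to keep the sketch dependency-free; any empirical-risk minimizer for linear classifiers would fill the same role.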
