Static prediction games for adversarial learning problems

The standard assumption of identically distributed training and test data is violated when the test data are generated in response to the presence of a predictive model. Email spam filtering is a typical example: email service providers employ spam filters, and spam senders engineer campaign templates to achieve a high rate of successful deliveries despite the filters. We model the interaction between the learner and the data generator as a static game in which the cost functions of the learner and the data generator are not necessarily antagonistic. We identify conditions under which this prediction game has a unique Nash equilibrium and derive algorithms that find the equilibrium prediction model. We derive two instances, the Nash logistic regression and the Nash support vector machine, and empirically explore their properties in a case study on email spam filtering.
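
To make the notion of a static game with non-antagonistic costs concrete, the following is a minimal toy sketch, not the algorithm derived in the paper. It assumes a scalar learner parameter `w`, a scalar data shift `d`, and two hypothetical quadratic cost functions chosen only so that each player's best response has a closed form and the game has a single Nash equilibrium. The function names and the simultaneous best-response iteration are illustrative assumptions.

```python
# Toy two-player static game (illustrative assumption, not the paper's model).
# Learner cost:   0.5*w**2 - w*(1 + 0.5*d)   -> strictly convex in w
# Generator cost: 0.5*d**2 - 0.5*d*w         -> strictly convex in d
# The costs do not sum to zero, so the game is not antagonistic (not zero-sum).

def learner_best_response(d):
    # Setting the derivative w - (1 + 0.5*d) to zero gives the minimizer.
    return 1.0 + 0.5 * d

def generator_best_response(w):
    # Setting the derivative d - 0.5*w to zero gives the minimizer.
    return 0.5 * w

def find_equilibrium(tol=1e-12, max_iter=1000):
    # Simultaneous best-response iteration; it converges here because the
    # combined update is a contraction for these particular costs.
    w, d = 0.0, 0.0
    for _ in range(max_iter):
        w_new = learner_best_response(d)
        d_new = generator_best_response(w)
        if abs(w_new - w) < tol and abs(d_new - d) < tol:
            return w_new, d_new
        w, d = w_new, d_new
    return w, d

w_star, d_star = find_equilibrium()
print(w_star, d_star)
```

At the returned point neither player can lower its own cost by changing only its own variable, which is the defining property of a Nash equilibrium. For these toy costs the fixed point can also be solved by hand: w = 1 + 0.5 d and d = 0.5 w give w = 4/3 and d = 2/3.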
