Robust Inference for Multiclass Classification

We consider the problem of robust inference, in which inputs may be maliciously corrupted by a powerful adversary, and the learner's goal is to accurately predict the original, uncorrupted input's true label given only the adversarially corrupted version of the input. We focus on the multiclass version of this problem, in which more than two labels are possible. We substantially extend and generalize previous work, which considered only the binary case, uncovering stark differences between the two settings. We show how robust inference can be modeled as a zero-sum game between a learner, who maximizes expected accuracy, and an adversary; the value of this game is the best accuracy rate attainable by any algorithm. We then show that the optimal policies of both the learner and the adversary can be exactly characterized in terms of a particular hypergraph: as the hypergraph's maximum fractional independent set and minimum fractional set cover, respectively. This characterization yields algorithms that are efficient in the size of the domain (the number of possible inputs). Since in typical settings the domain is huge, we also design efficient local computation algorithms for approximating the maximum fractional independent set in hypergraphs. This leads to a near-optimal algorithm for the learner whose complexity is independent of the domain size, depending instead only on the rank and maximum degree of the underlying hypergraph and on the desired approximation ratio.
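The pairing of maximum fractional independent set with minimum fractional set cover is an instance of linear-programming duality, and it can be checked concretely on a toy hypergraph. Below is a minimal sketch (not the paper's implementation) using SciPy's LP solver; the specific hypergraph and the per-hyperedge "sum at most 1" packing constraint are assumptions chosen purely for illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Toy hypergraph: 4 vertices, 3 hyperedges (illustrative assumption).
n_vertices = 4
edges = [{0, 1, 2}, {2, 3}, {1, 3}]

# Incidence matrix A: A[e][v] = 1 iff vertex v lies in hyperedge e.
A = np.zeros((len(edges), n_vertices))
for e, verts in enumerate(edges):
    for v in verts:
        A[e, v] = 1.0

# Primal (maximum fractional independent set):
#   maximize sum_v x_v  s.t.  sum_{v in e} x_v <= 1 for each edge e,
#   0 <= x_v <= 1.  linprog minimizes, so we negate the objective.
primal = linprog(c=-np.ones(n_vertices), A_ub=A, b_ub=np.ones(len(edges)),
                 bounds=[(0, 1)] * n_vertices, method="highs")

# Dual (minimum fractional set cover by hyperedges):
#   minimize sum_e y_e  s.t.  sum_{e : v in e} y_e >= 1 for each vertex v,
#   y_e >= 0.  Rewrite ">= 1" as "-A^T y <= -1" for linprog.
dual = linprog(c=np.ones(len(edges)), A_ub=-A.T, b_ub=-np.ones(n_vertices),
               bounds=[(0, None)] * len(edges), method="highs")

max_frac_independent_set = -primal.fun
min_frac_set_cover = dual.fun
print(max_frac_independent_set, min_frac_set_cover)  # both 2.0 on this instance
```

By strong LP duality the two optima coincide, which is exactly why the learner's best accuracy (independent-set side) equals the adversary's best corruption guarantee (cover side) in the game-theoretic characterization.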
