On Correcting Inputs: Inverse Optimization for Online Structured Prediction

Algorithm designers typically assume that the input data is correct and then proceed to find "optimal" or "sub-optimal" solutions for that input. However, this assumption does not always hold in practice, especially in online learning systems, where the objective is to learn appropriate feature weights from training samples. Such scenarios motivate the study of inverse optimization problems: given an input instance and a desired output, the task is to adjust the input data so that the desired output is indeed optimal. Motivated by learning structured prediction models, in this paper we consider inverse optimization with a margin, i.e., we require the given output to be better than all other feasible outputs by a desired margin. We consider such inverse optimization problems for maximum weight matroid basis, matroid intersection, perfect matchings, minimum cost maximum flows, and shortest paths, and derive the first known results for these problems with a non-zero margin. We also discuss the effectiveness of these algorithmic approaches for online learning in structured prediction.
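To make the margin requirement concrete, the following is a minimal brute-force sketch and not the paper's algorithm. It assumes a tiny maximum weight basis instance on a uniform matroid of rank 2 over four elements, so every 2-element subset is a basis. The function names (`weight`, `margin_slack`, `is_optimal_with_margin`) and all weights are hypothetical and chosen only for illustration. The sketch checks whether a desired basis beats every other basis by a given margin, before and after an assumed adjustment of the weights.

```python
from itertools import combinations


def weight(weights, subset):
    """Total weight of a set of elements."""
    return sum(weights[e] for e in subset)


def margin_slack(weights, target, feasible):
    """Smallest gap w(target) - w(S) over all feasible S other than target."""
    target = frozenset(target)
    rivals = [frozenset(s) for s in feasible if frozenset(s) != target]
    return min(weight(weights, target) - weight(weights, s) for s in rivals)


def is_optimal_with_margin(weights, target, feasible, margin):
    """True if target beats every other feasible set by at least `margin`."""
    return margin_slack(weights, target, feasible) >= margin


# Hypothetical instance: uniform matroid of rank 2 on {a, b, c, d},
# so the feasible outputs (bases) are all 2-element subsets.
elements = ["a", "b", "c", "d"]
bases = list(combinations(elements, 2))
target = {"a", "b"}  # the desired output

# Illustrative input weights: {a, c} outweighs the target.
original = {"a": 3, "b": 2, "c": 3, "d": 1}
# Illustrative adjusted weights: the target now wins by a margin of 1.
adjusted = {"a": 4, "b": 4, "c": 3, "d": 1}

print(margin_slack(original, target, bases))
print(margin_slack(adjusted, target, bases))
print(is_optimal_with_margin(adjusted, target, bases, margin=1))
```

Enumerating all feasible outputs is only workable for toy instances like this one; the point of the sketch is the condition being checked, not an efficient way to find the adjusted weights.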
