Large margin methods for structured classification: Exponentiated gradient algorithms and PAC-Bayesian generalization bounds

We consider the problem of structured classification, where the task is to predict a label y from an input x, and y has meaningful internal structure. Our framework includes supervised training of both Markov random fields and weighted context-free grammars as special cases. We describe an algorithm that solves the large-margin optimization problem defined in [12], using an exponential-family (Gibbs distribution) representation of structured objects. The algorithm is efficient – even in cases where the number of labels y is exponential in size – provided that certain expectations under Gibbs distributions can be calculated efficiently. The optimization method we use for structured labels relies on a more general result, specifically the application of exponentiated gradient (EG) updates [4, 5] to quadratic programs (QPs). We describe a new method for solving QPs based on these techniques, and give bounds on its rate of convergence. In addition to their application to the structured-labels task, the EG updates lead to simple algorithms for optimizing "conventional" binary or multiclass SVM problems. Finally, we give a new generalization bound for structured classification, using PAC-Bayesian methods for the analysis of large margin classifiers.
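To make the core optimization idea concrete, the following is a minimal sketch of exponentiated gradient (EG) updates applied to a toy quadratic program over the probability simplex, in the spirit of the multiplicative updates the abstract refers to. The matrix A, vector b, learning rate eta, and iteration count are illustrative assumptions, not values or the exact formulation from the paper.

```python
import numpy as np

def eg_qp(A, b, eta=0.1, iters=500):
    """Minimize 0.5 * a^T A a + b^T a over the simplex via EG updates.

    Toy sketch: EG takes a multiplicative step in the gradient and
    renormalizes, so the iterate stays a probability distribution.
    """
    n = len(b)
    alpha = np.full(n, 1.0 / n)                # start at the simplex centre
    for _ in range(iters):
        grad = A @ alpha + b                   # gradient of the quadratic objective
        alpha = alpha * np.exp(-eta * grad)    # multiplicative (EG) step
        alpha /= alpha.sum()                   # project back onto the simplex
    return alpha

# Illustrative positive-definite QP (assumed data, not from the paper)
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([-1.0, 0.0])
alpha = eg_qp(A, b)
```

Because each update multiplies by a positive factor and renormalizes, the iterates remain nonnegative and sum to one automatically, which is what makes EG attractive for the dual QPs arising in large-margin training.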