Structured Prediction by Least Squares Estimated Conditional Risk Minimization

We propose a general approach to supervised learning with structured output spaces, such as combinatorial and polyhedral sets, based on minimizing estimated conditional risk functions. Given a loss function defined over pairs of output labels, we first estimate the conditional risk function by solving a (possibly infinite) collection of regularized least squares problems. A prediction is then made by solving an auxiliary optimization problem that minimizes the estimated conditional risk over the output space. We apply this method to a class of problems with discrete combinatorial outputs and additive pairwise losses, and show that the auxiliary problem can be solved efficiently by exact linear programming relaxations in several important cases, including variants of hierarchical multilabel classification and multilabel ranking. We also show how the approach extends to vector regression problems with convex constraints and losses. Evaluations on hierarchical multilabel classification show that the method compares favorably with several existing methods in predictive accuracy and offers computational advantages on large hierarchies.
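To make the two-stage procedure concrete, here is a minimal Python sketch for the special case of a small, enumerable label set: one ridge regression per candidate output estimates the conditional risk, and prediction minimizes the estimated risk, with brute-force enumeration standing in for the exact LP relaxations the paper uses on combinatorial output spaces. The function names and toy data below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fit_risk_estimators(X, Y_idx, loss, lam=1.0):
    """Fit one ridge regression per candidate label y, regressing the
    observed losses L(y_i, y) onto the features x_i.

    X:     (n, d) feature matrix
    Y_idx: (n,) integer-encoded observed labels
    loss:  (k, k) matrix with loss[a, b] = L(a, b)
    Returns W of shape (d, k), one weight column per candidate label.
    """
    n, d = X.shape
    T = loss[Y_idx]                      # targets: T[i, y] = L(y_i, y), shape (n, k)
    A = X.T @ X + lam * np.eye(d)        # regularized normal equations
    return np.linalg.solve(A, X.T @ T)   # ridge solutions for all labels at once

def predict(X, W):
    """Minimize the estimated conditional risk over the output space.
    Here the space is small enough to enumerate; for combinatorial
    outputs this step would be an auxiliary (LP-relaxed) optimization."""
    R_hat = X @ W                        # estimated risks, shape (n, k)
    return R_hat.argmin(axis=1)

# Toy usage with three labels and 0/1 loss (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Y = rng.integers(0, 3, size=200)
loss = 1.0 - np.eye(3)                   # 0/1 loss over label pairs
W = fit_risk_estimators(X, Y, loss)
y_pred = predict(X, W)
```

Because the targets for all candidate labels share the same design matrix, the (k) regularized least squares problems reduce to a single linear solve with a multi-column right-hand side; the structure-specific work is confined entirely to the risk-minimization step.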