Parameter Learning for Loopy Markov Random Fields with Structural Support Vector Machines

Discriminative learners for functions with structured outputs have been successfully applied to sequence prediction, parsing, sequence alignment, and other problems where exact inference is tractable. However, these learners face challenges when inference procedures must use approximations, e.g., for multi-label classification, clustering, image segmentation, or loopy graphical models. In this paper, we explore methods for training structural SVMs on problems where exact inference is intractable. In particular, we consider pairwise fully connected Markov random fields, using multi-label classification as an example application. We show how to adapt loopy belief propagation, greedy search, and linear programming relaxations as (approximate) separation oracles in the structural SVM cutting-plane training algorithm. In addition to theoretical characterizations, we empirically evaluate and analyze the resulting algorithms on six multi-label classification datasets.
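
To make the role of an approximate separation oracle concrete, the following is a minimal sketch of greedy loss-augmented inference for a fully connected pairwise model over binary labels. It is not the paper's implementation: the function names (`objective`, `greedy_oracle`), the array layout of `unary` and `pairwise`, the use of Hamming loss, and the best-single-flip search strategy are all assumptions made for illustration. The oracle looks for a labeling that (approximately) maximizes loss plus model score, which is the most violated constraint the cutting-plane algorithm asks for.

```python
import numpy as np

def objective(y, y_true, unary, pairwise):
    """Loss-augmented score: Hamming loss plus unary and pairwise potentials.

    Assumed layout (hypothetical): unary[i, a] scores label i taking value a;
    pairwise[i, j, a, b] scores labels i < j taking values a and b.
    """
    n = len(y)
    loss = sum(int(a != b) for a, b in zip(y_true, y))
    score = sum(unary[i, y[i]] for i in range(n))
    score += sum(pairwise[i, j, y[i], y[j]]
                 for i in range(n) for j in range(i + 1, n))
    return loss + score

def greedy_oracle(y_true, unary, pairwise):
    """Greedy approximate separation oracle (illustrative sketch).

    Starts with every label off and repeatedly flips the single label that
    increases the loss-augmented objective the most, stopping when no flip
    helps. The result is a local optimum, not necessarily the true maximizer.
    """
    n = len(y_true)
    y = [0] * n
    best = objective(y, y_true, unary, pairwise)
    while True:
        best_gain, best_i = 0.0, None
        for i in range(n):
            y[i] ^= 1  # tentatively flip label i
            gain = objective(y, y_true, unary, pairwise) - best
            y[i] ^= 1  # undo the tentative flip
            if gain > best_gain + 1e-12:
                best_gain, best_i = gain, i
        if best_i is None:
            return y, best
        y[best_i] ^= 1  # commit the best flip
        best = objective(y, y_true, unary, pairwise)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    unary = rng.normal(size=(n, 2))
    pairwise = rng.normal(size=(n, n, 2, 2))
    y_hat, value = greedy_oracle([1, 0, 1, 0, 1], unary, pairwise)
    print(y_hat, value)
```

Because the search only reaches a local optimum, the constraint it returns may be less violated than the true worst case, which is the kind of undergenerating approximation whose effect on training the paper analyzes.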
