Guest editorial: special issue on structured prediction

Structured prediction, or structured classification (Bakir et al. 2007), is the task of predicting a collection of related variables given some input. The relationship between the variables to be predicted is often complex. Machine translation is an example of such complex dependencies: the input is a sequence of words in the source natural language and the output is a sequence of words in the target natural language. Here, each word in the target language relates not only to the words in the source language, but also to the other words in the target sequence, which may be arbitrarily far away. As a field, structured prediction has some unique challenges, several of which are addressed by the papers in this issue. One of the most obvious of these is that the output spaces in question are often exponential in size, so the complexity of both these spaces and the learned models can result in computationally infeasible learning and inference algorithms. In many cases, therefore, efficient optimization remains an open problem in structured prediction. Two papers in this special issue (Hsu et al. 2009; Sutton and McCallum 2009) address this problem. An important source of complexity in structured prediction algorithms is the iterative nature of the training step: often, training is done EM-style, where model parameters are estimated at each step, and then inference is performed based on these parameters. In models with complex structure, this inference step

[1] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[2] Daniel Marcu, et al. Learning as search optimization: approximate large margin methods for structured prediction, 2005, ICML.

[3] Andrew McCallum, et al. Piecewise training for structured prediction, 2009, Machine Learning.

[4] Csaba Szepesvári, et al. Training parsers by inverse reinforcement learning, 2009, Machine Learning.

[5] Gökhan Bakır, et al. Predicting Structured Data, 2008.

[6] Thomas Hofmann, et al. Large Margin Methods for Structured and Interdependent Output Variables, 2005, J. Mach. Learn. Res.

[7] Thomas Hofmann, et al. Predicting Structured Data (Neural Information Processing), 2007.

[8] Yi Mao, et al. Generalized isotonic conditional random fields, 2009, Machine Learning.

[9] John Langford, et al. Search-based structured prediction, 2009, Machine Learning.

[10] Gérard Bloch, et al. Incorporating prior knowledge in support vector machines for classification: A review, 2008, Neurocomputing.

[11] Ben Taskar, et al. Max-Margin Markov Networks, 2003, NIPS.

[12] Ludovic Denoyer, et al. Structured prediction with reinforcement learning, 2009, Machine Learning.

[13] Christoph H. Lampert, et al. Structured prediction by joint kernel support estimation, 2009, Machine Learning.

[14] Yuh-Jye Lee, et al. Periodic step-size adaptation in second-order gradient descent for single-pass on-line structured learning, 2009, Machine Learning.