Structured output-associative regression

Structured outputs such as multidimensional vectors or graphs are frequently encountered in real-world pattern recognition applications such as computer vision, natural language processing, and computational biology. This motivates learning functional dependencies between spaces with complex, interdependent inputs and outputs, such as those arising from images and their corresponding 3D scene representations. In this spirit, we propose a new structured learning method, Structured Output-Associative Regression (SOAR), that models not only the input dependency but also the self-dependency of outputs, providing an output re-correlation mechanism that complements the more standard input-based regressive prediction. The model is simple but powerful and is, in principle, applicable in conjunction with any existing regression algorithm. SOAR can be kernelized to handle non-linear problems, and learning is efficient via primal/dual formulations similar to those used for kernel ridge regression or support vector regression. We demonstrate that the method outperforms weighted nearest neighbor and regression methods for the reconstruction of images of handwritten digits and for 3D human pose estimation from video in the HumanEva benchmark.
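
The idea of combining input-based prediction with output self-dependency can be illustrated with a small two-stage sketch. This is not the SOAR formulation itself, whose objective and kernelized primal/dual solvers are not given in this abstract. It is a toy linear analogue under stated assumptions: a plain ridge regressor serves as the base predictor, each output dimension is then re-predicted from the input together with the other output dimensions, and at prediction time the base predictions stand in for those other outputs. The function names (ridge_fit, fit_two_stage, predict_two_stage), the regularization value, and the synthetic data are all hypothetical.

```python
import numpy as np

def ridge_fit(A, B, lam=1e-2):
    """Closed-form ridge regression: solve (A^T A + lam I) W = A^T B."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ B)

def fit_two_stage(X, Y, lam=1e-2):
    """Fit a base regressor x -> y, then one regressor per output
    dimension j that maps [x, y_without_j] -> y_j (assumed setup)."""
    W_base = ridge_fit(X, Y, lam)
    W_assoc = []
    for j in range(Y.shape[1]):
        others = np.delete(Y, j, axis=1)      # true values of the other outputs
        A = np.hstack([X, others])            # input plus output context
        W_assoc.append(ridge_fit(A, Y[:, j], lam))
    return W_base, W_assoc

def predict_two_stage(X, W_base, W_assoc):
    """Predict with the base model, then re-correlate each output
    using the base predictions of the other outputs as context."""
    Y_base = X @ W_base
    cols = []
    for j, w in enumerate(W_assoc):
        others = np.delete(Y_base, j, axis=1)
        cols.append(np.hstack([X, others]) @ w)
    return np.column_stack(cols)

# Synthetic data with correlated outputs: the third output is roughly
# the sum of the first two, plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
latent = X @ rng.normal(size=(5, 2))
Y = np.hstack([latent, latent.sum(axis=1, keepdims=True)])
Y = Y + 0.1 * rng.normal(size=Y.shape)

W_base, W_assoc = fit_two_stage(X, Y)
Y_pred = predict_two_stage(X, W_base, W_assoc)
print(Y_pred.shape)
```

Because both stages here are linear, the combined predictor is still linear in the input, so this toy only shows the data flow of an output-associative stage. The improvements reported above come from the authors' own formulation.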
