SpeedMachines: Anytime Structured Prediction

Structured prediction plays a central role in machine learning applications from computational biology to computer vision. These models require significantly more computation than unstructured models, and, in many applications, algorithms may need to make predictions within a computational budget or in an anytime fashion. In this work we propose an anytime technique for learning structured prediction that, at training time, incorporates both structural elements and feature computation trade-offs that affect test-time inference. We apply our technique to the challenging problem of scene understanding in computer vision and demonstrate efficient and anytime predictions that gradually improve towards state-of-the-art classification performance as the allotted time increases.
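To make the notion of an anytime prediction under a computational budget concrete, the following is a minimal illustrative sketch, not the method proposed in this work. It assumes a predictor built as an ordered sequence of stages, each with a known cost and an additive per-class score update; the names `anytime_predict`, `stages`, and `budget` are hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage: (cost in arbitrary time units,
#                      function mapping an input to a per-class score update).
Stage = Tuple[float, Callable[[Sequence[float]], List[float]]]


def anytime_predict(
    x: Sequence[float],
    stages: List[Stage],
    num_classes: int,
    budget: float,
) -> Tuple[int, float]:
    """Apply stages in order until the next one would exceed the budget.

    Returns the current best label and the cost actually spent, so a
    valid prediction is available no matter where the loop stops.
    """
    scores = [0.0] * num_classes
    spent = 0.0
    for cost, update in stages:
        if spent + cost > budget:
            break  # out of budget: fall back to the scores accumulated so far
        delta = update(x)
        scores = [s + d for s, d in zip(scores, delta)]
        spent += cost
    # Ties resolve to the lowest class index.
    label = max(range(num_classes), key=lambda k: scores[k])
    return label, spent


# Toy stages (assumed): a cheap one favouring class 0, a costlier one favouring class 1.
toy_stages: List[Stage] = [
    (1.0, lambda x: [1.0, 0.0]),
    (2.0, lambda x: [0.0, 3.0]),
]
```

With a larger budget the second toy stage is reached and the predicted label changes, which mirrors the idea of predictions that are refined as the allotted time increases.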
