Déjà Vu: - Motion Prediction in Static Images

This paper proposes motion prediction in single still images by learning it from a set of videos. The building assumption is that similar motion is characterized by similar appearance. The proposed method learns local motion patterns given a specific appearance and adds the predicted motion in a number of applications. This work (i) introduces a novel method to predict motion from appearance in a single static image, (ii) to that end, extends of the Structured Random Forest with regression derived from first principles, and (iii) shows the value of adding motion predictions in different tasks such as: weak frame-proposals containing unexpected events, action recognition, motion saliency. Illustrative results indicate that motion prediction is not only feasible, but also provides valuable information for a number of applications.

[1]  Antonio Criminisi,et al.  Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning , 2012, Found. Trends Comput. Graph. Vis..

[2]  Koen E. A. van de Sande,et al.  Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Peter Kontschieder,et al.  Structured class-labels in random forests for semantic image labelling , 2011, 2011 International Conference on Computer Vision.

[4]  Paul Over,et al.  Evaluation campaigns and TRECVid , 2006, MIR '06.

[5]  Nazli Ikizler-Cinbis,et al.  Learning actions from the Web , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[6]  Jan C. van Gemert,et al.  Exploiting photographic style for category-level image classification by generalizing the spatial pyramid , 2011, ICMR.

[7]  Cordelia Schmid,et al.  Learning object class detectors from weakly annotated video , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Antonio Criminisi,et al.  Filter Forests for Learning Data-Dependent Convolutional Kernels , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Cordelia Schmid,et al.  Human Detection Using Oriented Histograms of Flow and Appearance , 2006, ECCV.

[10]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[11]  Nelson L. Max,et al.  Visualizing 3D velocity fields near contour surfaces , 1994, Proceedings Visualization '94.

[12]  Ying Wu,et al.  Motion from blur , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Ivan Laptev,et al.  Recognizing human actions in still images: a study of bag-of-features and part-based representations , 2010, BMVC.

[14]  Martial Hebert,et al.  Activity Forecasting , 2012, ECCV.

[15]  C. Lawrence Zitnick,et al.  Structured Forests for Fast Edge Detection , 2013, 2013 IEEE International Conference on Computer Vision.

[16]  Leo Breiman,et al.  Random Forests , 2001, Machine Learning.

[17]  Cordelia Schmid,et al.  Evaluation of Local Spatio-temporal Features for Action Recognition , 2009, BMVC.

[18]  Cordelia Schmid,et al.  A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.

[19]  Sebastian Nowozin,et al.  Decision Jungles: Compact and Rich Models for Classification , 2013, NIPS.

[20]  Frédéric Jurie,et al.  Motion Models that Only Work Sometimes , 2012, BMVC.

[21]  Brian Cabral,et al.  Imaging vector fields using line integral convolution , 1993, SIGGRAPH.

[22]  Edward H. Adelson,et al.  Motion without movement , 1991, SIGGRAPH.

[23]  Cordelia Schmid,et al.  Learning realistic human actions from movies , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Theo Gevers,et al.  Evaluation of Color STIPs for Human Action Recognition , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Barbara Caputo,et al.  Recognizing human actions: a local SVM approach , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[26]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[27]  Michael J. Black,et al.  A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[28]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[29]  Antonio Torralba,et al.  SIFT Flow: Dense Correspondence across Different Scenes , 2008, ECCV.

[30]  Cor J. Veenman,et al.  Episode-Constrained Cross-Validation in Video Concept Retrieval , 2009, IEEE Transactions on Multimedia.

[31]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.