An Elastic Deformation Field Model for Object Detection and Tracking

Deformable Part Models (DPMs) are the current state of the art for object detection. Nevertheless, they appear sub-optimal in how they represent deformations: object deformations are often continuous and not confined to a few large parts. We therefore propose replacing the DPM star model, which is based on large parts, with a deformation field: a grid of small parts connected by pairwise constraints, which can better handle continuous deformations. A naive application of this model to object detection would use a bounded sliding-window approach: for each possible image location, the best part configuration within a limited bound around that location is found. This is computationally very expensive. Instead, we propose a different inference procedure in which an iterative image-level search finds the best object hypothesis. We show that this approach is faster than bounded sliding windows while producing comparable accuracy. Experiments further show that the deformation field better approximates real object deformations and therefore, for certain classes, yields better detection accuracy than state-of-the-art DPMs. Finally, we adapt the same approach to model-free tracking, where it also improves accuracy.
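
As a rough illustration of what "a grid of small parts connected with pairwise constraints" could look like, the sketch below scores one part configuration as the sum of per-part appearance scores minus a penalty on displacement differences between 4-connected neighbours. The function name `field_score`, the dictionary-based `unary` layout, the squared-difference penalty, and the `weight` parameter are all assumptions made for illustration; the abstract does not specify the actual energy or the inference procedure.

```python
from typing import Dict, List, Tuple

Displacement = Tuple[int, int]  # (dy, dx) offset of a part from its grid anchor


def field_score(
    unary: List[List[Dict[Displacement, float]]],
    disp: List[List[Displacement]],
    weight: float = 1.0,
) -> float:
    """Score a deformation-field configuration (hypothetical formulation).

    unary[r][c] maps each allowed displacement of part (r, c) to an
    appearance score; disp[r][c] is the displacement chosen for that part.
    """
    rows, cols = len(disp), len(disp[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            # Appearance term: how well the part matches at its displaced position.
            total += unary[r][c][disp[r][c]]
            # Pairwise term: penalise neighbours (right and below) that move apart.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    dy = disp[r][c][0] - disp[nr][nc][0]
                    dx = disp[r][c][1] - disp[nr][nc][1]
                    total -= weight * (dx * dx + dy * dy)
    return total


# Toy 1x2 grid: two parts, each allowed to stay put or shift one pixel right.
toy_unary = [[
    {(0, 0): 1.0, (0, 1): 2.0},
    {(0, 0): 1.5, (0, 1): 0.5},
]]
rigid = field_score(toy_unary, [[(0, 0), (0, 0)]], weight=0.5)
deformed = field_score(toy_unary, [[(0, 1), (0, 0)]], weight=0.5)
```

In this toy case the deformed configuration gains more in appearance score than it loses to the pairwise penalty, which is the kind of trade-off a deformation field is meant to capture. A bounded sliding-window search would maximise such a score independently around every image location, which is what makes the naive approach expensive.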
