TMAGIC: A Model-Free 3D Tracker

Significant effort has been devoted within the visual tracking community to rapid learning of object properties on the fly. However, state-of-the-art approaches still often fail in cases such as rapid out-of-plane rotation, when the appearance changes suddenly. One of the major contributions of this paper is a radical rethinking of the traditional wisdom of modeling 3D motion as appearance changes during tracking. Instead, 3D motion is modeled as 3D motion. This intuitive but previously unexplored approach provides new possibilities in visual tracking research. First, 3D tracking is more general: large out-of-plane motion is often fatal for 2D trackers, but helps 3D trackers build better models. Second, the tracker’s internal model of the object can be used in many different applications, and the model itself could even become the main motivation, with tracking supporting reconstruction rather than vice versa. This effectively bridges the gap between visual tracking and structure from motion. A new benchmark dataset of sequences with extreme out-of-plane rotation is presented, and an online leaderboard is offered to stimulate new research in the relatively underdeveloped area of 3D tracking. The proposed method, provided as a baseline, successfully tracks these sequences, all of which pose a considerable challenge to 2D trackers (error reduced by 46%).
