Compact Object Representation of a Non-Rigid Object for Real-Time Tracking in AR Systems

Detecting moving objects in the real world reliably, robustly, and efficiently is an essential but difficult task in AR applications, particularly for interactions between virtual agents and real pedestrians, motorcycles, and other non-rigid objects, whose spatial occupancy must be perceived. In this paper, we propose a novel object tracking method that uses visual cues with pre-training to track dynamic objects in 2D online videos robustly and reliably. With a few simple, well-defined constraints and priors, an object's area in the image can be mapped to a 3D spatial region in the physical world, so that spatial collisions between virtual agents and real pedestrians can be avoided in AR environments. To achieve robust tracking in a markerless AR environment, we first construct a novel representation of a non-rigid object: the manifold of normalized sub-images spanning all possible appearances of the target. These sub-images, captured from multiple views and under varying lighting conditions, are free of occlusion and can be obtained both from video sequences and from synthetic image generation. From the instance pool formed by these sub-images, our proposed iterative method based on sparse dictionary learning then selects a compact set of templates that represents the manifold well, and an SVM-based sparsity detection step verifies that this template set is complete. The resulting compact, complete template set is used to track the target's trajectory online in video and augmented reality (AR) systems. Experiments demonstrate the robustness and efficiency of our method.
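The template-selection and scoring pipeline described above can be sketched in plain NumPy. This is a minimal illustration, not the authors' exact algorithm: a greedy worst-reconstructed-first selection and orthogonal matching pursuit (OMP) stand in for the paper's iterative sparse dictionary learning and SVM-based completeness check, and all function names, the pool size, and the reconstruction-error score are illustrative assumptions.

```python
import numpy as np

def omp(D, x, k):
    """Orthogonal matching pursuit: sparse-code vector x over the columns of D
    using at most k atoms; returns the coefficient vector."""
    residual = x.copy()
    idx = []
    coef = np.zeros(D.shape[1])
    for _ in range(min(k, D.shape[1])):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in idx:
            idx.append(j)
        sub = D[:, idx]
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        c, *_ = np.linalg.lstsq(sub, x, rcond=None)
        residual = x - sub @ c
    coef[idx] = c
    return coef

def greedy_template_selection(pool, n_templates, n_nonzero=3):
    """Select a compact template set from an instance pool of flattened,
    non-zero sub-images (one row per instance). A greedy stand-in for
    iterative sparse dictionary learning: repeatedly add the pool instance
    that the current templates reconstruct worst."""
    D = pool / np.linalg.norm(pool, axis=1, keepdims=True)  # L2-normalize rows
    chosen = [0]  # seed with the first instance (arbitrary)
    for _ in range(n_templates - 1):
        T = D[chosen]  # current template set, shape (m, d)
        errors = [np.linalg.norm(x - T.T @ omp(T.T, x, n_nonzero)) for x in D]
        chosen.append(int(np.argmax(errors)))  # worst-covered instance
    return D[chosen]

def tracking_score(templates, patch, n_nonzero=3):
    """Score a candidate patch by negative sparse reconstruction error;
    higher (closer to 0) means a better match to the learned appearance."""
    x = patch / np.linalg.norm(patch)
    coef = omp(templates.T, x, n_nonzero)
    return -np.linalg.norm(x - templates.T @ coef)
```

At tracking time, each candidate window in the frame would be normalized, scored with `tracking_score`, and the best-scoring window taken as the target location; because members of the instance pool reconstruct almost exactly from the templates, their score approaches zero, while unrelated patches incur a large reconstruction error.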
