Benchmarking template-based tracking algorithms

For natural interaction with augmented reality (AR) applications, robust tracking technology is key. But unlike dense stereo, optical flow, or multi-view stereo, template-based tracking, which is the approach most commonly used in AR applications, lacks benchmark datasets that allow a fair comparison between state-of-the-art algorithms. Until now, the performance and robustness of template-based tracking algorithms have mainly been evaluated objectively and quantitatively on synthetically generated image sequences, so the evaluation is often intrinsically biased. In this paper, we describe the process we carried out to acquire real-scene image sequences with highly precise and accurate ground-truth poses, using an industrial camera rigidly mounted on the end effector of a high-precision robotic measurement arm. During the acquisition, we considered most of the critical parameters that influence tracking results, such as the texture richness and texture repeatability of the objects to be tracked, the camera motion and speed, changes of the object scale in the images, and variations of the lighting conditions over time. We designed an evaluation scheme for object detection and interframe tracking algorithms suited for AR and other computer vision applications, and applied this scheme to several state-of-the-art algorithms using the image sequences. The image sequences are freely available for testing, submitting, and evaluating new template-based tracking algorithms, i.e. algorithms that detect or track a planar object in an image sequence given only a single image of the object (the template).
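As a rough illustration of how such an evaluation scheme can be realized, the Python sketch below compares a tracker's per-frame estimated homography against the ground-truth homography by the RMS distance of the projected template corners, and reports the fraction of frames below an error threshold. The corner-based metric, the threshold value, and all function names are assumptions chosen for illustration; they are not necessarily the exact protocol defined by the authors.

```python
import numpy as np

def project_corners(H, corners):
    """Apply a 3x3 homography to Nx2 points (the template corners)."""
    pts = np.hstack([corners, np.ones((len(corners), 1))])   # to homogeneous coords
    proj = (H @ pts.T).T
    return proj[:, :2] / proj[:, 2:3]                         # back to Euclidean coords

def corner_rms_error(H_est, H_gt, template_size):
    """RMS distance (pixels) between the four template corners mapped by the
    estimated and the ground-truth homography."""
    w, h = template_size
    corners = np.array([[0, 0], [w, 0], [w, h], [0, h]], dtype=float)
    diff = project_corners(H_est, corners) - project_corners(H_gt, corners)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))

def evaluate_sequence(estimated, ground_truth, template_size, threshold_px=10.0):
    """Per-sequence summary: `estimated[i]` is the tracker's homography for
    frame i, or None if the tracker reported a failure for that frame.
    The 10-pixel threshold is an illustrative assumption."""
    errors = []
    for H_est, H_gt in zip(estimated, ground_truth):
        if H_est is None:
            errors.append(np.inf)                             # lost track counts as failure
        else:
            errors.append(corner_rms_error(H_est, H_gt, template_size))
    errors = np.array(errors)
    tracked = np.isfinite(errors)
    return {
        "success_rate": float(np.mean(errors < threshold_px)),
        "mean_error_tracked": float(np.mean(errors[tracked])) if np.any(tracked) else float("nan"),
    }
```

Under these assumptions, a candidate tracker would be run on each sequence, its per-frame homographies collected, and the success rate reported per sequence; the actual submission and comparison procedure for the dataset is the one defined by the authors.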
