Real-Time Camera Tracking: When is High Frame-Rate Best?

Higher frame-rates promise better tracking of rapid motion, yet advanced real-time vision systems rarely exceed the standard 10–60 Hz range, on the grounds that the computation required would be too great. In fact, the extra computational burden of a higher frame-rate is mitigated by a reduced cost per frame in trackers that take advantage of prediction: with less inter-frame motion, each frame needs less work. Additionally, when we consider the physics of image formation, a high frame-rate lowers the upper bound on shutter time, leading to less motion blur but more noise. Putting these factors together, how should the application-dependent performance requirements of accuracy, robustness and computational cost be optimised as frame-rate varies? Using 3D camera tracking as our test problem, and analysing a fundamental dense whole-image alignment approach, we open up a route to a systematic investigation via the careful synthesis of photorealistic video, combining ray-tracing of a detailed 3D scene, experimentally obtained photometric response and noise models, and rapid camera motions. Our multi-frame-rate, multi-resolution, multi-light-level dataset is based on tens of thousands of hours of CPU rendering time. Our experiments lead to quantitative conclusions about frame-rate selection, and highlight the crucial role of a full consideration of physical image formation in pushing tracking performance.
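To make the image-formation argument concrete, the following is a minimal Python sketch (our own illustration, not code from the paper) of how the exposure bound, motion blur extent and shot-noise-limited SNR scale with frame-rate. All numeric parameters (focal_px, omega, flux, read_noise) are hypothetical values chosen only for illustration.

```python
import numpy as np

# Sketch: at frame-rate f, exposure time is bounded by 1/f.
# Motion blur extent grows with exposure; photon count (and hence
# shot-noise-limited SNR) shrinks with it.

focal_px   = 600.0   # focal length in pixels (hypothetical)
omega      = 2.0     # camera angular velocity, rad/s (rapid motion)
flux       = 2.0e5   # photo-electrons per pixel per second (hypothetical light level)
read_noise = 5.0     # sensor read noise, electrons RMS (hypothetical)

for fps in [15, 30, 60, 120, 240]:
    shutter = 1.0 / fps                         # upper bound on shutter time
    blur_px = focal_px * omega * shutter        # approx. blur extent for pure rotation
    signal  = flux * shutter                    # collected photo-electrons
    noise   = np.sqrt(signal + read_noise**2)   # shot noise plus read noise
    print(f"{fps:4d} Hz: shutter {1e3*shutter:6.2f} ms, "
          f"blur {blur_px:6.1f} px, SNR {signal/noise:6.1f}")
```

The test algorithm in the paper is dense whole-image alignment over full 3D camera motion; as a reduced illustration of the same principle, the sketch below estimates a 2D image translation by Gauss-Newton minimisation of the whole-image photometric error. This 2D reduction is our own simplification, not the paper's tracker.

```python
import numpy as np

def dense_align_translation(I_ref, I_cur, iters=20):
    """Estimate a 2D translation p aligning I_cur to I_ref by minimising
    the whole-image photometric error sum_x (I_cur(x + p) - I_ref(x))^2
    with Gauss-Newton. Inputs are float grayscale arrays of equal shape."""
    p = np.zeros(2)  # (dy, dx)
    ys, xs = np.mgrid[0:I_ref.shape[0], 0:I_ref.shape[1]]
    for _ in range(iters):
        # Warp current image by the current estimate (nearest-neighbour
        # sampling keeps the sketch dependency-free).
        yw = np.clip(np.round(ys + p[0]).astype(int), 0, I_ref.shape[0] - 1)
        xw = np.clip(np.round(xs + p[1]).astype(int), 0, I_ref.shape[1] - 1)
        Iw = I_cur[yw, xw]
        r = (Iw - I_ref).ravel()                         # photometric residual
        gy, gx = np.gradient(Iw)                         # image gradients
        J = np.stack([gy.ravel(), gx.ravel()], axis=1)   # Jacobian w.r.t. p
        # Damped Gauss-Newton step (small damping guards against
        # a near-singular normal matrix in low-texture images).
        dp = np.linalg.solve(J.T @ J + 1e-6 * np.eye(2), -J.T @ r)
        p += dp
        if np.linalg.norm(dp) < 1e-3:
            break
    return p
```

The link back to frame-rate: at higher frame-rates the inter-frame displacement shrinks, so a predicted starting point is closer to the solution and fewer Gauss-Newton iterations are needed per frame, which is the prediction-driven reduction in per-frame cost the abstract refers to.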
