HyperSight: boosting distant 3D vision on a single dual-camera smartphone

Smartphones with dual cameras are increasingly popular because of the need to support 3D vision, for which depth information is critical. However, the two cameras on a smartphone are too close together to estimate depth accurately, especially for objects beyond two meters. In this paper, we propose an innovative system, called HyperSight, that estimates the depth of objects using a dual-camera smartphone. HyperSight realizes a virtual long-baseline stereo vision rig by having the user move the phone through the air. The phone's movement is continuously tracked and estimated using the short-baseline dual camera, which observes nearby objects. We implement HyperSight as software on a Commercial-Off-The-Shelf (COTS) smartphone and conduct real-world experiments. The results show that when measuring feature-rich objects at a distance of five meters, HyperSight achieves a mean depth error of 6 cm, which is up to a 10× and 18× improvement in accuracy compared with the stereo vision system using the native dual cameras and the Measure app based on ARKit on mobile devices, respectively.
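The baseline limitation described above follows from the standard pinhole stereo relation Z = f·B/d, whose first-order depth error grows as Z²/(f·B) per pixel of disparity error. The sketch below illustrates only this textbook relation; it is not HyperSight's algorithm, and the focal length, baselines, and disparity error are hypothetical values chosen for illustration rather than figures from the paper.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point from its stereo disparity: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_err_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


if __name__ == "__main__":
    # Hypothetical values for illustration only (not taken from the paper).
    focal_px = 1500.0          # assumed focal length in pixels
    disparity_err_px = 0.5     # assumed disparity matching error in pixels
    depth_m = 5.0              # target distance in meters

    # Compare an assumed short native baseline with an assumed longer
    # virtual baseline obtained by moving the phone.
    for baseline_m in (0.01, 0.30):
        err = depth_error(focal_px, baseline_m, depth_m, disparity_err_px)
        print(f"baseline {baseline_m:.2f} m -> depth error ~ {err:.3f} m")
```

Because the error scales inversely with the baseline, widening the baseline by some factor reduces the expected depth error by the same factor under this simplified model, which is the motivation for a virtual long-baseline rig.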
