PROMAR: Practical Reference Object-based Multi-user Augmented Reality

Augmented reality (AR) is an emerging technology that weaves virtual objects into physical environments and enables users to interact with them through viewing devices. This paper focuses on multi-user AR applications, where virtual objects (VOs) placed by one user can be viewed by other users. We develop a practical framework that supports the basic multi-user AR functions of placing and viewing VOs, and our system can be deployed on off-the-shelf smartphones without special hardware. The main technical challenge we address is that, even when facing the same scene, the user who places the VO and the user who views it may do so from different viewing angles and distances. This setting is realistic, yet traditional solutions achieve poor accuracy in it. In this work, we develop a suite of algorithms that help viewers accurately identify the same scene and restore the VO under a moderate range of viewing-angle differences. We have prototyped our system, and experimental results show significant performance improvements. Our source code and demos are available at https://github.com/PROMAR2019.
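To make the viewing-side challenge concrete, the sketch below illustrates one generic way a viewer's frame could be matched against the placer's reference frame and the VO anchor re-projected under a viewpoint change: ORB features, ratio-test matching, and RANSAC homography estimation with OpenCV. This is an illustrative baseline only, not PROMAR's actual algorithm; the image paths and anchor coordinates are hypothetical placeholders.

```python
# Illustrative baseline (not PROMAR's algorithm): re-localize a virtual-object anchor
# placed in a reference frame inside a viewer's frame taken from a different viewpoint,
# using ORB features + Lowe's ratio test + RANSAC homography (OpenCV).
import cv2
import numpy as np

# Hypothetical inputs: the placer's frame, the viewer's frame, and the pixel
# location where the virtual object was anchored in the placer's frame.
ref_img = cv2.imread("placer_frame.jpg", cv2.IMREAD_GRAYSCALE)
view_img = cv2.imread("viewer_frame.jpg", cv2.IMREAD_GRAYSCALE)
vo_anchor_ref = np.float32([[320.0, 240.0]]).reshape(-1, 1, 2)

# Detect and describe keypoints in both frames.
orb = cv2.ORB_create(nfeatures=2000)
kp_ref, des_ref = orb.detectAndCompute(ref_img, None)
kp_view, des_view = orb.detectAndCompute(view_img, None)

# Match descriptors and keep only matches that pass the ratio test.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
raw_matches = matcher.knnMatch(des_ref, des_view, k=2)
good = [p[0] for p in raw_matches
        if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]

if len(good) >= 10:
    src = np.float32([kp_ref[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_view[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # Robustly estimate the planar mapping between the two views.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    if H is not None:
        # Re-project the anchor into the viewer's frame so the VO can be rendered there.
        vo_anchor_view = cv2.perspectiveTransform(vo_anchor_ref, H)
        print("VO anchor in viewer frame:", vo_anchor_view.ravel())
    else:
        print("Scene matched, but homography estimation failed.")
else:
    print("Not enough reliable matches; scene differs or viewpoint gap is too large.")
```

A homography is only appropriate when the matched region is roughly planar, and binary features degrade quickly as the viewpoint gap grows; these are precisely the regimes where such a baseline loses accuracy and where the paper's algorithms aim to improve.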
