Paper Information - Enhancing the AR Experience with Machine Learning Services

Enhancing the AR Experience with Machine Learning Services

In this paper, we present and evaluate a web service that offers cloud-based machine learning services to improve Augmented Reality applications on mobile and web clients, with special regard to tracking quality and the registration of complex scenes that require an application-specific coordinate frame. Specifically, our service aims to reduce the camera drift that still occurs in modern AR frameworks and to assist with the initial camera alignment in a known scene by estimating the absolute camera pose using a configurable context-based image segmentation in combination with an adaptive image classification. We demonstrate real-world applications that use our web service and evaluate the performance and accuracy of the underlying image segmentation and camera pose estimation. We also discuss the initial configuration, the semi-automatic process of generating training data, and the training of the machine learning models for the corresponding tasks.
