Head and Body Motion Prediction to Enable Mobile VR Experiences with Low Latency

As virtual reality (VR) applications grow in popularity, the desire to enable high-quality, lightweight, and mobile VR has led to a variety of edge/cloud-based techniques. This paper introduces a predictive pre-rendering approach to address the ultra-low-latency challenge in edge/cloud-based six-Degrees-of-Freedom (6DoF) VR. Compared to 360-degree videos and 3DoF (head motion only) VR, 6DoF VR supports both head and body motion, so not only the viewing direction but also the viewing position can change. In our approach, the predicted view is rendered in advance based on the predicted viewing direction and position, reducing the latency perceived by the user. The key to this predictive pre-rendering approach is predicting head and body motion accurately from past head and body motion traces. We develop a deep learning-based model and validate its prediction accuracy using a dataset of over 840,000 head and body motion samples.
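The paper's predictor is a deep learning model trained on motion traces; as a minimal, hypothetical illustration of the prediction interface it implies (predict a future 6DoF pose from a window of past poses), a simple linear-extrapolation baseline can be sketched as follows. The function name, pose layout, and fixed sampling interval are assumptions for illustration, not the paper's method.

```python
import numpy as np

def predict_pose(history, horizon):
    """Predict a future 6DoF pose (x, y, z, yaw, pitch, roll) by fitting
    a linear trend to each channel of the recent motion trace.

    history: (T, 6) array of past poses sampled at a fixed interval.
    horizon: number of sample intervals to predict ahead.
    Note: a stand-in baseline; the paper uses a learned model instead.
    """
    T = history.shape[0]
    t = np.arange(T)
    # Fit slope and intercept per channel, then extrapolate ahead.
    coeffs = np.polyfit(t, history, deg=1)  # shape (2, 6)
    t_future = T - 1 + horizon
    return coeffs[0] * t_future + coeffs[1]

# Example: constant-velocity motion in x and yaw is predicted exactly.
history = np.arange(5, dtype=float)[:, None] * np.array(
    [0.1, 0.0, 0.0, 2.0, 0.0, 0.0])
pred = predict_pose(history, horizon=3)
```

A learned predictor (e.g. a recurrent network over the same windows) would replace the linear fit while keeping this input/output shape, letting the renderer request the view for the pose expected one network round-trip ahead.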
