BodySLAM: Opportunistic User Digitization in Multi-User AR/VR Experiences

Today’s augmented and virtual reality (AR/VR) systems do not provide body, hand or mouth tracking without special worn sensors or external infrastructure. Simultaneously, AR/VR systems are increasingly being used in co-located, multi-user experiences, opening the possibility for opportunistic capture of other users. This is the core idea behind BodySLAM, which uses disparate camera views from users to digitize the body, hands and mouth of other people, and then relay that information back to the respective users. If a user is seen by two or more people, 3D pose can be estimated via stereo reconstruction. Our system also maps the arrangement of users in real world coordinates. Our approach requires no additional hardware or sensors beyond what is already found in commercial AR/VR devices, such as Microsoft HoloLens or Oculus Quest.
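The abstract states that 3D pose can be estimated by stereo reconstruction when a user is seen from two or more viewpoints. The following is a minimal sketch of that idea for a single body keypoint. It is not the authors' implementation. It assumes each headset already knows its camera center in shared world coordinates and can convert a detected 2D keypoint into a viewing ray in that frame. The function name `triangulate_midpoint` and its parameters are hypothetical, and midpoint triangulation is swapped in here as one simple way to intersect two noisy rays.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(
    c1: Sequence[float],
    d1: Sequence[float],
    c2: Sequence[float],
    d2: Sequence[float],
    eps: float = 1e-9,
) -> Vec3:
    """Estimate a 3D point from two viewing rays (hypothetical helper).

    c1, c2: camera centers in a shared world frame (assumed known).
    d1, d2: ray directions toward the same keypoint, need not be unit length.
    Returns the midpoint of the shortest segment connecting the two rays.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)

    # denom is zero when the rays are parallel: no unique closest point.
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")

    # Parameters of the closest points along each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(p1, p2))


# Example: two observers one meter apart, both looking at the same keypoint.
keypoint = triangulate_midpoint(
    c1=(0.0, 0.0, 0.0), d1=(0.5, 0.0, 2.0),
    c2=(1.0, 0.0, 0.0), d2=(-0.5, 0.0, 2.0),
)
```

With noisy detections the two rays rarely intersect exactly, which is why the sketch returns the midpoint of their closest approach rather than an exact intersection. Repeating this per keypoint would yield a 3D skeleton under the same assumptions.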
