Depth Aware Finger Tapping on Virtual Displays

For AR/VR systems, tapping in the air is a user-friendly way to interact. Most prior in-air tapping schemes rely on customized depth cameras and therefore suffer from low accuracy and high latency. In this paper, we propose a fine-grained depth-aware tapping scheme that provides high-accuracy tapping detection. Our basic idea is to combine light-weight ultrasound-based sensing with one COTS mono-camera to enable 3D tracking of the user's fingers. The mono-camera tracks the user's fingers in 2D space, and ultrasound-based sensing supplies the depth information needed to locate them in 3D space. Using the speakers and microphones that already exist on most AR/VR devices, we emit ultrasound, which is inaudible to humans, and capture the signal reflected by the finger with the microphone. From the phase changes of the ultrasound signal, we accurately measure small finger movements in the depth direction. With fast and light-weight ultrasound signal processing algorithms, our scheme can accurately track finger movements and measure the bending angle of the finger between two video frames. In our experiments on eight users, our scheme achieves a finger tapping detection accuracy of 98.4%, with a false positive rate (FPR) of 1.6% and a false negative rate (FNR) of 1.4%, and a detection latency of 17.69 ms, which is 57.7 ms less than video-only schemes. The power consumption overhead of our scheme is 48.4% relative to video-only schemes.
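
To illustrate how phase changes of a reflected ultrasound tone can map to movement in the depth direction, the following is a minimal sketch and not the signal processing pipeline of this paper. It assumes a single continuous tone, a co-located speaker and microphone (so a finger displacement d changes the round-trip path by 2d), a sound speed of 343 m/s, and a received phase that decreases as the path grows. The function names `unwrap_phase` and `depth_displacement_m` and the 20 kHz tone in the usage note are hypothetical choices made for the example.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 degrees C


def unwrap_phase(wrapped):
    """Remove 2*pi jumps from a sequence of phases given in radians."""
    unwrapped = [wrapped[0]]
    for phase in wrapped[1:]:
        step = phase - unwrapped[-1]
        # Map the step into [-pi, pi) so consecutive samples stay continuous.
        step = (step + math.pi) % (2.0 * math.pi) - math.pi
        unwrapped.append(unwrapped[-1] + step)
    return unwrapped


def depth_displacement_m(wrapped_phase, freq_hz, speed=SPEED_OF_SOUND_M_S):
    """Estimate finger displacement (metres) relative to the first sample.

    Assumes phase = -2*pi * path_length / wavelength and that the
    round-trip path changes by twice the finger displacement.
    Positive values mean the finger moved away from the device.
    """
    wavelength = speed / freq_hz
    phase = unwrap_phase(wrapped_phase)
    displacement = []
    for p in phase:
        path_change = -(p - phase[0]) / (2.0 * math.pi) * wavelength
        displacement.append(path_change / 2.0)  # round trip to one way
    return displacement
```

With an assumed 20 kHz tone, the wavelength is about 17 mm, so a full 2π phase rotation corresponds to roughly 8.6 mm of finger movement under these assumptions. The sketch only recovers the true movement when the phase changes by less than π between consecutive samples.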
