3D Gestural Interaction: The State of the Field

3D gestural interaction provides a powerful and natural way to interact with computers using the hands and body in applications such as video games, training and simulation, and medicine. However, recognizing 3D gestures accurately enough to be used reliably in these applications poses many research challenges. In this paper, we examine the state of the field of 3D gestural interfaces by presenting the latest strategies for collecting raw 3D gesture data from the user and for analyzing this data to correctly recognize the 3D gestures that users perform. In addition, we examine the latest 3D gesture recognition performance in terms of accuracy and gesture set size, and discuss how different applications are making use of 3D gestural interaction. Finally, we present ideas for future research in this thriving research area.
