Design and Evaluation of a Presentation Maestro: Controlling Electronic Presentations Through Gesture

Gesture-based interaction has long been seen as a natural means of input for electronic presentation systems; however, gesture-based presentation systems have not been evaluated in real-world contexts, and the implications of this interaction modality are not known. This thesis describes the design and evaluation of Maestro, a gesture-based presentation system developed to explore these issues. This work is presented in two parts. The first part describes Maestro’s design, which was informed by a small observational study of people giving talks; and Maestro’s evaluation, which involved a two-week field study in which Maestro was used for lecturing to a class of approximately 100 students. The observational study revealed that presenters regularly gesture towards the content of their slides. As such, Maestro supports several gestures that operate directly on slide content (e.g., pointing to a bullet causes it to be highlighted). The field study confirmed that audience members value these content-centric gestures. By contrast, the use of gestures for navigating slides was perceived to be less efficient than the use of a remote. Additionally, gestural input was found to result in a number of unexpected side effects that may hamper the presenter’s ability to fully engage the audience. The second part of the thesis presents a gesture recognizer based on discrete hidden Markov models (DHMMs). Here, the contributions lie in presenting a feature set and a factorization of the standard DHMM observation distribution, which together allow a wide range of gestures to be modeled (e.g., both one-handed and bimanual gestures) with few modeling parameters. To establish the overall robustness and accuracy of the recognition system, five new users and one expert were asked to perform ten instances of each gesture. The system accurately recognized 85% of gestures for new users, increasing to 96% for the expert user. In both cases, false positives accounted for fewer than 4% of all detections. These error rates compare favourably to those of similar systems.
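The abstract does not spell out how the DHMM observation distribution is factorized. As a minimal sketch of the general idea only, the Python below assumes that each observation is a tuple of discrete features that are conditionally independent given the hidden state, so that one small table per feature replaces a single joint table. All names (`factored_log_emission`, `log_likelihood`, `table_sizes`) and the toy model values are hypothetical and are not taken from the thesis.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def factored_log_emission(emission_tables, state, obs):
    """log P(obs | state), ASSUMING features are independent given the state.

    emission_tables[k][state][symbol] is the probability of feature k
    taking value `symbol` in `state`; obs is one symbol index per feature.
    """
    return sum(math.log(table[state][symbol])
               for table, symbol in zip(emission_tables, obs))


def log_likelihood(initial, transition, emission_tables, sequence):
    """Forward algorithm in log space for a discrete HMM with factored emissions."""
    n_states = len(initial)
    log_alpha = [math.log(initial[s])
                 + factored_log_emission(emission_tables, s, sequence[0])
                 for s in range(n_states)]
    for obs in sequence[1:]:
        log_alpha = [
            log_sum_exp([log_alpha[p] + math.log(transition[p][s])
                         for p in range(n_states)])
            + factored_log_emission(emission_tables, s, obs)
            for s in range(n_states)
        ]
    return log_sum_exp(log_alpha)


def table_sizes(n_states, symbols_per_feature):
    """Emission table entries needed: (joint table, factored tables)."""
    joint = n_states * math.prod(symbols_per_feature)
    factored = n_states * sum(symbols_per_feature)
    return joint, factored


# Toy model (illustrative values only): 2 states, two features with 2 and 3 symbols.
INITIAL = [0.6, 0.4]
TRANSITION = [[0.7, 0.3],
              [0.2, 0.8]]
EMISSIONS = [
    [[0.9, 0.1], [0.3, 0.7]],            # feature 0: per-state distribution over 2 symbols
    [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]],  # feature 1: per-state distribution over 3 symbols
]
```

Under this assumption, a model with 3 states and two features of 8 and 4 symbols needs 36 emission entries instead of the 96 required by a joint table, which illustrates why a factored observation distribution can keep the parameter count small. In a recognizer of this kind, one such model would typically be trained per gesture and `log_likelihood` compared across models.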
