A vision based motion interface for mobile phones

In this paper we present a motion-based interface for controlling mobile devices that relies only on existing camera technology. The user controls the phone's functions by moving the phone in a series of motions, and each command is defined by a unique series of these motions. The phone's camera is used to produce a sequence of motion features that characterise the translational motion of the phone. These sequences of motion features are classified using Hidden Markov Models (HMMs). To improve the robustness of the system, the classification results are then filtered using a likelihood ratio and the entropy of the sequence, so that possibly incorrect sequences are rejected. When tested on 570 previously unseen motion sequences, the system incorrectly classified only 5 sequences.
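The classify-then-reject step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes discrete motion symbols (0 = left, 1 = right), two hand-written two-state HMMs, a rejection test on the log-likelihood ratio between the best and second-best model, and a rejection test for low symbol entropy. The function names, model parameters, thresholds, and the direction of the entropy test are all hypothetical, since the abstract does not specify them.

```python
import math
from collections import Counter


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def sequence_entropy(obs):
    """Shannon entropy (bits) of the symbol distribution in the sequence."""
    counts = Counter(obs)
    total = len(obs)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def classify(obs, models, min_log_ratio=2.0, min_entropy=0.5):
    """Return the best model's label, or None if the sequence is rejected.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    scores = sorted(
        ((log_likelihood(obs, *params), name) for name, params in models.items()),
        reverse=True,
    )
    (best, label), (second, _) = scores[0], scores[1]
    # Reject when the winner does not clearly beat the runner-up.
    if best == float("-inf") or best - second < min_log_ratio:
        return None
    # Reject sequences with too little variation (assumed direction of test).
    if sequence_entropy(obs) < min_entropy:
        return None
    return label


# Hypothetical left-to-right models: (start, transition, emission).
trans = [[0.7, 0.3], [0.0, 1.0]]
models = {
    "left_then_right": ([1.0, 0.0], trans, [[0.9, 0.1], [0.1, 0.9]]),
    "right_then_left": ([1.0, 0.0], trans, [[0.1, 0.9], [0.9, 0.1]]),
}

print(classify([0, 0, 0, 1, 1, 1], models))
print(classify([0, 0, 0, 0, 0, 0], models))
```

In this sketch the first sequence matches one command clearly and is accepted, while the second carries no variation and is rejected rather than forced into a class. A real system would train the HMM parameters from recorded motion sequences and tune both thresholds on held-out data.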
