Toward a tongue-based task triggering interface for computer interaction

This paper proposes a system that detects the presence of the tongue and locates its relative position within the mouth using video obtained from a web camera. The system includes an offline phase, prior to operation by the end user, in which a 3-layer cascade of SVM classifiers is trained on a database of 'tongue vs. not-tongue' images: segmented images of the region of interest, the mouth, with the tongue in three possible positions (center, left, or right). The first stage discerns whether the tongue is present; its output feeds the second stage, which evaluates whether the tongue is in the center of the mouth; the last stage then decides between the left and right positions. Owing to the novelty of the proposed system, a database had to be created from images gathered from people of distinct ethnic backgrounds. While the system has yet to be tested in an online stage, results from the offline phase show that real-time performance is feasible in the near future. Finally, diverse applications of this prototype system are introduced, demonstrating that the tongue can serve as an effective alternative input device for a broad range of users, including people with physical disabilities.
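The 3-layer cascade described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name `TongueCascade`, the string labels, and the use of raw feature vectors with RBF-kernel SVMs are all assumptions; the paper's actual feature extraction and training details are not specified here.

```python
# Hypothetical sketch of the paper's 3-stage SVM cascade for tongue detection.
# Labels, features, and kernel choice are assumptions for illustration only.
import numpy as np
from sklearn.svm import SVC

class TongueCascade:
    def __init__(self):
        self.presence = SVC(kernel="rbf")  # stage 1: tongue present vs. absent
        self.center = SVC(kernel="rbf")    # stage 2: centered vs. off-center
        self.side = SVC(kernel="rbf")      # stage 3: left vs. right

    def fit(self, X, labels):
        # labels: one of "none", "center", "left", "right" per sample
        X, labels = np.asarray(X), np.asarray(labels)
        has_tongue = labels != "none"
        self.presence.fit(X, has_tongue)
        # Stages 2 and 3 train only on samples passed down by the prior stage.
        Xt, lt = X[has_tongue], labels[has_tongue]
        self.center.fit(Xt, lt == "center")
        off_center = lt != "center"
        self.side.fit(Xt[off_center], lt[off_center])

    def predict(self, x):
        x = np.asarray(x).reshape(1, -1)
        if not self.presence.predict(x)[0]:
            return "none"
        if self.center.predict(x)[0]:
            return "center"
        return str(self.side.predict(x)[0])
```

Each stage only sees samples accepted by the previous one, mirroring the cascade structure: presence first, then center vs. off-center, then left vs. right.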
