Vowel Recognition from RGB-D Facial Information

Population ageing is one of the main concerns in developed countries. Older people are susceptible to conditions that reduce quality of life, such as apraxia of speech, a burden that requires prolonged therapy. This work is intended as a first step towards automated solutions that assist speech therapy by detecting mouth poses. It proposes a system for vowel pose recognition from an RGB-D camera, which provides both 2D and 3D information. The 2D data is fed into a face recognition approach that accurately locates and characterizes the mouth in image space. The approach also uses real-world 3D measurements obtained by pairing the 2D detection with the 3D information. Both information sources are processed by a set of classifiers to determine the best option for vowel recognition.
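The pairing of a 2D mouth detection with 3D information can be illustrated with a minimal sketch. It assumes a standard pinhole camera model and four hypothetical mouth landmarks (left corner, right corner, top lip, bottom lip), each given as a pixel position plus a depth reading in metres. The intrinsics (fx, fy, cx, cy), the landmark names, and the two measures computed here are illustrative assumptions, not the features used in this work.

```python
import math


def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to camera-frame 3D
    coordinates, assuming a pinhole model with the given intrinsics."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def mouth_measures(landmarks, fx, fy, cx, cy):
    """Compute mouth width and opening in metres.

    `landmarks` maps the hypothetical names 'left', 'right', 'top' and
    'bottom' to (u, v, depth_m) tuples taken from the 2D detection and
    the registered depth image.
    """
    points = {
        name: deproject(u, v, d, fx, fy, cx, cy)
        for name, (u, v, d) in landmarks.items()
    }
    width = math.dist(points["left"], points["right"])
    opening = math.dist(points["top"], points["bottom"])
    return {"width_m": width, "opening_m": opening}


# Example with made-up intrinsics and landmark readings.
example_landmarks = {
    "left": (270, 240, 0.5),
    "right": (370, 240, 0.5),
    "top": (320, 230, 0.5),
    "bottom": (320, 260, 0.5),
}
measures = mouth_measures(example_landmarks, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

Metric measures of this kind do not change with the subject's distance from the camera the way pixel distances do, which is one plausible reason to combine them with the 2D features before classification.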
