Automatic recognition of French Cued Speech gestures (Reconnaissance automatique des gestes de la langue française parlée complétée)

French Cued Speech (Langue française Parlée Complétée, LPC) is a complement to lip-reading that facilitates communication for hearing-impaired people. In principle, the coder makes gestures with one hand placed beside the face to disambiguate lip movements, which taken in isolation are not sufficient for the message to be fully understood. The RNTS TELMA project aims to build a telephone terminal that lets hearing-impaired people communicate by relying on LPC. Among the many functionalities this requires, the system must recognize the manual LPC gesture and associate a meaning with it. This work addresses the video segmentation, analysis, and recognition of the gestures of an LPC coder in a communication situation. It draws on techniques from image segmentation, classification, gesture interpretation, and data fusion. To solve this gesture recognition problem, we proposed several original algorithms, including (1) an algorithm based on retinal persistence that categorizes images into target-gesture images and transition-gesture images, (2) an improvement of multi-class classification methods using SVMs or unary classifiers via evidence theory, together with a method for converting subjective probabilities into belief functions, and (3) a partial decision method based on a generalization of the Pignistic Transform, which allows uncertainty to remain in the interpretation of ambiguous gestures.
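
For readers unfamiliar with the Pignistic Transform mentioned in item (3), the following minimal sketch may help. It shows only the classical transform from the Transferable Belief Model, not the generalization proposed in this work: the mass assigned to each set of hypotheses is shared equally among its elements. The function name `pignistic` and the hand-shape labels are hypothetical and chosen purely for illustration.

```python
from typing import Dict, FrozenSet


def pignistic(mass: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """Classical pignistic transform of a mass function.

    Each focal set's mass is split equally among its elements, after
    renormalizing by the mass not assigned to the empty set.
    """
    empty_mass = mass.get(frozenset(), 0.0)
    if empty_mass >= 1.0:
        raise ValueError("all mass is on the empty set (total conflict)")
    norm = 1.0 - empty_mass

    betp: Dict[str, float] = {}
    for focal, m in mass.items():
        if not focal:
            continue  # the empty set carries conflict, not belief
        share = m / (len(focal) * norm)
        for hypothesis in focal:
            betp[hypothesis] = betp.get(hypothesis, 0.0) + share
    return betp


# Hypothetical belief about an ambiguous hand shape (labels are illustrative).
mass = {
    frozenset({"shape_1"}): 0.5,
    frozenset({"shape_1", "shape_2"}): 0.3,
    frozenset({"shape_1", "shape_2", "shape_3"}): 0.2,
}
betp = pignistic(mass)
for label in sorted(betp):
    print(label, round(betp[label], 4))
```

The classical transform always ends in a probability over single hypotheses, so a decision based on it picks exactly one class. A partial decision method, as described in the abstract, would instead be able to keep a set of hypotheses when the gesture is ambiguous.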
