Lip reading from scale-space measurements

Systems that attempt to recover the spoken word from image sequences usually require complicated models of the mouth and its motions. Here we describe a new approach based on a fast mathematical morphology transform called the sieve. We form statistics of scale measurements in one and two dimensions and use them as feature vectors for standard hidden Markov models (HMMs).
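
As a rough illustration of what a scale measurement could look like in one dimension, the sketch below builds a simple sieve from a cascade of flat morphological openings of increasing width and counts the features ("granules") removed at each scale. The abstract does not say which sieve variant or which statistics are used, so the choice of openings, the granule count as the statistic, and the names `opening_1d` and `scale_histogram` are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def opening_1d(x, k):
    """Flat morphological opening of a 1D signal with a window of k samples."""
    # Erosion: minimum over every window of length k that fits inside the signal.
    mins = sliding_window_view(x, k).min(axis=1)
    # Dilation: each sample takes the largest window minimum that covers it.
    padded = np.pad(mins, k - 1, constant_values=-np.inf)
    return sliding_window_view(padded, k).max(axis=1)


def scale_histogram(signal, max_scale):
    """Count the granules removed at each scale 1..max_scale (assumed statistic)."""
    x = np.asarray(signal, dtype=float)
    counts = np.zeros(max_scale, dtype=int)
    for s in range(1, max_scale + 1):
        # Stage s removes local maxima that are at most s samples wide.
        y = opening_1d(x, s + 1)
        removed = (x - y) != 0
        # Each run of consecutive changed samples counts as one granule.
        starts = np.diff(np.concatenate(([0], removed.astype(int)))) == 1
        counts[s - 1] = np.count_nonzero(starts)
        x = y  # the next stage works on the already simplified signal
    return counts


# Hypothetical intensity profile with peaks of width 1, 2 and 3 samples.
row = [0, 3, 0, 0, 5, 5, 0, 0, 2, 2, 2, 0]
print(scale_histogram(row, 4))  # [1 1 1 0]
```

The sketch requires `max_scale + 1` to be no larger than the signal length. Under these assumptions, one such histogram per image column or per frame would give a fixed-length vector that could be passed to an HMM as an observation.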
