Speech recognition for image animation and coding

We discuss some issues related to acoustic assisted image coding and animation. An approach of talker independent acoustic assisted image coding and animation scheme is studied. A perceptually based sliding window encoder is proposed. It utilizes the high rate (or oversampled) viseme sequence from the audio domain for image domain viseme interpolation and smoothing. The image domain visemes in our approach are dynamically constructed from a set of basic visemes. The look-ahead and look-back moving interpolations in the proposed approach provide an effective way to compensate the mismatch between auditory and visual perceptions.

[1]  Aaron E. Rosenberg,et al.  Improved acoustic modeling for large vocabulary continuous speech recognition , 1992 .

[2]  Alexander H. Waibel,et al.  Improving connected letter recognition by lipreading , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[3]  Aaron E. Rosenberg,et al.  Improved acoustic modeling for speaker independent large vocabulary continuous speech recognition , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[4]  S. Nishida Speech recognition enhancement by lip information , 1986, CHI '86.

[5]  Robert Forchheimer,et al.  Image coding-from waveforms in animation , 1989, IEEE Trans. Acoust. Speech Signal Process..

[6]  Kiyoharu Aizawa,et al.  An intelligent facial image coding driven by speech and phoneme , 1989, International Conference on Acoustics, Speech, and Signal Processing,.