Paper information - Speech recognition for acoustic-assisted video coding and animation

Speech recognition for acoustic-assisted video coding and animation

In this paper, we discuss issues related to the analysis and synthesis of facial images using speech information. An approach to speaker-independent acoustic-assisted image coding and animation is studied. A perceptually based sliding window encoder is proposed. It utilizes the high-rate (or oversampled) acoustic viseme sequence from the audio domain for image-domain viseme interpolation and smoothing. The image-domain visemes in our approach are dynamically constructed from a set of basic visemes. The look-ahead and look-back moving interpolations in the proposed approach provide an effective way to compensate for the mismatch between auditory and visual perceptions.

Tsuhan Chen | Homer H. Chen | Barry G. Haskell | W. Chou

[1] Alan Jeffrey Goldschen, et al. Continuous automatic speech recognition by lipreading, 1993.

[2] S. Nishida. Speech recognition enhancement by lip information, 1986, CHI '86.

[3] Kiyoharu Aizawa, et al. Model-based image coding: advanced video coding techniques for very low bit-rate applications, 1995, Proc. IEEE.

[4] Eric D. Petajan. Automatic lipreading to enhance speech recognition, 1984.

[5] Alexander H. Waibel, et al. Improving connected letter recognition by lipreading, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.