Time-delay neural networks for estimating lip movements from speech analysis: a useful tool in audio-video synchronization
[1] Hisashi Wakita. Direct Estimation of the Vocal Tract Shape by Inverse Filtering of Acoustic Speech Waveforms , 1973 .
[2] N. P. Erber,et al. Auditory, visual, and auditory-visual recognition of consonants by children with normal and impaired hearing. , 1972, Journal of speech and hearing research.
[3] Tsuhan Chen,et al. A new frame interpolation scheme for talking head sequences , 1995, Proceedings., International Conference on Image Processing.
[4] Osamu Fujimura. Elementary Gestures and Temporal Organization — What Does an Articulatory Constraint Mean? , 1981 .
[5] Q Summerfield,et al. Use of Visual Information for Phonetic Perception , 1979, Phonetica.
[6] Fabio Lavagetto,et al. Object-oriented scene modeling for interpersonal video communication at very low bit-rate , 1994, Signal Process. Image Commun..
[7] F. Lavagetto,et al. Converting speech into lip movements: a multimedia telephone for hard of hearing people , 1995 .
[8] Fabio Lavagetto,et al. MPEG-4: Audio/video and synthetic graphics/audio for mixed media , 1997, Signal Process. Image Commun..
[9] B.P. Yuhas,et al. Integration of acoustic and visual speech signals using neural networks , 1989, IEEE Communications Magazine.
[10] Eric A. Wan,et al. Temporal backpropagation for FIR neural networks , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[11] Hiroshi Harashima,et al. A Media Conversion from Speech to Facial Image for Intelligent Man-Machine Interface , 1991, IEEE J. Sel. Areas Commun..
[12] H.P. Graf,et al. Lip synchronization using speech-assisted video processing , 1995, IEEE Signal Processing Letters.
[13] Eric D. Petajan,et al. MPEG-4: Audio/Video & Synthetic Graphics/Audio for Mixed Media , 1998 .
[14] Yao Wang,et al. Speech-assisted lip synchronization in audio-visual communications , 1995, Proceedings., International Conference on Image Processing.
[15] Carol A. Fowler,et al. Coarticulation and theories of extrinsic timing , 1980 .
[16] Geoffrey E. Hinton,et al. A time-delay neural network architecture for isolated word recognition , 1990, Neural Networks.
[17] F. Lavagetto,et al. Time Delay Neural Networks for Articulatory Estimation from Speech: Suitable Subjective Evaluation Protocols , 1996 .
[18] Raymond D. Kent,et al. Coarticulation in recent speech production models , 1977 .
[19] E. Owens,et al. Visemes observed by hearing-impaired and normal-hearing adult viewers. , 1985, Journal of speech and hearing research.
[20] M. Pichora-Fuller,et al. Coarticulation effects in lipreading. , 1982, Journal of speech and hearing research.
[21] Tsuhan Chen,et al. Audio visual interaction in multimedia , 1995 .
[22] Fabio Lavagetto,et al. Synthetic and hybrid imaging in the HUMANOID and VIDAS projects , 1996, Proceedings of 3rd IEEE International Conference on Image Processing.
[23] A. D. Brink,et al. Minimum cross-entropy threshold selection , 1996, Pattern Recognit..
[24] Oscar N. Garcia,et al. Rationale for Phoneme-Viseme Mapping and Feature Selection in Visual Speech Recognition , 1996 .
[25] Geoffrey E. Hinton,et al. Phoneme recognition using time-delay neural networks , 1989, IEEE Trans. Acoust. Speech Signal Process..
[26] Tsuhan Chen,et al. Speech-assisted video processing: interpolation and low-bitrate coding , 1994, Proceedings of 1994 28th Asilomar Conference on Signals, Systems and Computers.
[27] Geoffrey E. Hinton. Connectionist Learning Procedures , 1989, Artif. Intell..
[28] Guy Mercier,et al. Neural-fuzzy networks and phonetic feature recognition as a help for speechreading , 1996 .
[29] David G. Stork,et al. Speechreading by Humans and Machines , 1996 .
[30] Waibel. A novel objective function for improved phoneme recognition using time delay neural networks , 1989 .
[31] R. Hammarberg. The metaphysics of coarticulation , 1976 .
[32] Fabio Lavagetto. Speech Assisted Motion Compensation in Videophone Communications , 1996 .