Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface
Naoyuki Ichimura | Futoshi Asano | Jun Ogata | Takashi Yoshimura | Kiyoshi Yamamoto | Hideki Asoh | Isao Hara | Yoichi Motomura
[1] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[2] Finn V. Jensen,et al. Bayesian Networks and Decision Graphs , 2001, Statistics for Engineering and Information Science.
[3] Andrew D. Christian,et al. Digital smart kiosk project , 1998, CHI.
[4] Satoshi Nakamura,et al. Real time face detection for multimodal speech recognition , 2002, Proceedings. IEEE International Conference on Multimedia and Expo.
[5] R. O. Schmidt,et al. Multiple emitter location and signal parameter estimation , 1986 .
[6] Jae S. Lim,et al. Speech enhancement , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[7] Robert C. Bolles,et al. Background modeling for segmentation of video-rate stereo sequences , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).
[8] Thomas Kailath,et al. Detection of signals by information theoretic criteria , 1985, IEEE Trans. Acoust. Speech Signal Process..
[9] Nobuaki Minematsu,et al. IPA Japanese Dictation Free Software Project , 2000, LREC.
[10] Vladimir Pavlovic,et al. Boosted learning in dynamic Bayesian networks for multimodal detection , 2002, Proceedings of the Fifth International Conference on Information Fusion. FUSION 2002. (IEEE Cat.No.02EX5997).
[11] Futoshi Asano,et al. Fusion of audio and video information for detecting speech events , 2003, Proceedings of the Sixth International Conference of Information Fusion, 2003.
[12] Satoshi Nakamura,et al. Detection and separation of speech segment using audio and video information fusion , 2003, INTERSPEECH.
[13] Takeshi Yamada,et al. Estimation of the number of sound sources using support vector machines and its application to sound source separation , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[14] Christophe Beaugeant,et al. Methodology for the design of a robust voice activity detector for speech enhancement , 2003 .
[15] Satoshi Nakamura,et al. Detection of speech events in real environments through fusion of audio and video information using Bayesian networks , 2003 .
[16] Peter Beyerlein,et al. Speaker adaptation in the Philips system for large vocabulary continuous speech recognition , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[17] Yasuo Ariki,et al. Unsupervised acoustic model adaptation based on phoneme error minimization , 2002, INTERSPEECH.
[18] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..
[19] Satoshi Nakamura,et al. Speech enhancement based on the subspace method , 2000, IEEE Trans. Speech Audio Process..