A self-calibrating algorithm for speaker tracking based on audio-visual statistical models
[1] Yochai Konig, et al. "Eigenlips" for robust speech recognition, 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[2] Li Deng, et al. A new method for speech denoising and robust speech recognition using probabilistic models for clean speech and for noise, 2001, INTERSPEECH.
[3] Hong Wang, et al. Voice source localization for automatic camera pointing system in videoconferencing, 1997, Proceedings of 1997 Workshop on Applications of Signal Processing to Audio and Acoustics.
[4] M. S. Brandstein. Time-delay estimation of reverberated speech exploiting harmonic structure, 1999, The Journal of the Acoustical Society of America.
[5] Brendan J. Frey, et al. Learning flexible sprites in video layers, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.
[6] Brendan J. Frey, et al. Fast, Large-Scale Transformation-Invariant Clustering, 2001, NIPS.
[7] A. Blake, et al. Sequential Monte Carlo fusion of sound and vision for speaker tracking, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.
[8] Patrick Pérez, et al. Sequential Monte Carlo Fusion of Sound and Vision for Speaker Tracking, 2001, ICCV.
[9] Christoph E. Schreiner, et al. Blind source separation and deconvolution: the dynamic component analysis algorithm, 1998.
[10] Brendan J. Frey, et al. Estimating mixture models of images and inferring spatial transformations using the EM algorithm, 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).