Multi-speaker voice activity detection using a camera-assisted microphone array