Audiovisual Voice Activity Detection Based on Microphone Arrays and Color Information
Jacob Scharcanski | Carlos B. O. Lopes | Cláudio Rosito Jung | Bowon Lee | Vicente P. Minotto
[1] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[2] R. Boostani,et al. Lip segmentation in color images , 2008, 2008 International Conference on Innovations in Information Technology.
[3] Ton Kalker,et al. Voice activity detection and speaker localization using audiovisual cues , 2012, Pattern Recognit. Lett.
[4] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[5] Wen Gao,et al. Face detection and location based on skin chrominance and lip chrominance transformation from color images , 2001, Pattern Recognit.
[6] Yoav Freund,et al. The Alternating Decision Tree Learning Algorithm , 1999, ICML.
[7] Jing Xu,et al. Lip Detection and Tracking Using Variance Based Haar-Like Features and Kalman filter , 2010, 2010 Fifth International Conference on Frontier of Computer Science and Technology.
[8] Joseph H. DiBiase. A High-Accuracy, Low-Latency Technique for Talker Localization in Reverberant Environments Using Microphone Arrays , 2000 .
[9] Christian Jutten,et al. An Analysis of Visual Speech Information Applied to Voice Activity Detection , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[10] Christian Jutten,et al. Two novel visual voice activity detectors based on appearance models and retinal filtering , 2007, 2007 15th European Signal Processing Conference.
[11] Satoshi Tamura,et al. Voice activity detection based on fusion of audio and visual information , 2009, AVSP.
[12] Mark Hasegawa-Johnson,et al. Estimation of High-Variance Vehicular Noise , 2009 .
[13] Aiko M. Hormann,et al. Programs for Machine Learning. Part I , 1962, Inf. Control.
[14] Jean-Marc Odobez,et al. Audiovisual Probabilistic Tracking of Multiple Speakers in Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Michael S. Brandstein,et al. Robust automatic video-conferencing with multiple cameras and microphones , 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).
[16] Yoav Freund,et al. Large Margin Classification Using the Perceptron Algorithm , 1998, COLT.
[17] Soo Ngee Koh,et al. Improved noise suppression filter using self-adaptive estimator of probability of speech absence , 1999, Signal Process.
[18] John C. Platt,et al. Fast training of support vector machines using sequential minimal optimization , 1999, Advances in Kernel Methods.
[19] Amir Said,et al. Feature-Based Face Tracking for Videoconferencing Applications , 2009, 2009 11th IEEE International Symposium on Multimedia.
[20] João Gama,et al. Functional Trees , 2001, Machine Learning.
[21] Anthony G. Constantinides,et al. Audio–Visual Active Speaker Tracking in Cluttered Indoors Environments , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[22] Robert C. Holte,et al. Very Simple Classification Rules Perform Well on Most Commonly Used Datasets , 1993, Machine Learning.
[23] Gerald Schaefer,et al. Illuminant and device invariant colour using histogram equalisation , 2005, Pattern Recognit.
[24] Eibe Frank,et al. Combining Naive Bayes and Decision Tables , 2008, FLAIRS.
[25] Ian H. Witten,et al. The WEKA data mining software: an update , 2009, SIGKDD Explor.
[26] Ben P. Milner,et al. Using audio-visual features for robust voice activity detection in clean and noisy speech , 2008, 2008 16th European Signal Processing Conference.
[27] Ron Kohavi,et al. The Power of Decision Tables , 1995, ECML.
[28] Michael S. Brandstein,et al. A hybrid real-time face tracking system , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[29] Wonyong Sung,et al. A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.
[30] Andrew R. Webb,et al. Statistical Pattern Recognition , 1999 .
[31] N. Otsu. A threshold selection method from gray level histograms , 1979 .
[32] Mohan S. Kankanhalli,et al. Multimodal fusion for multimedia analysis: a survey , 2010, Multimedia Systems.
[33] Tetsuya Takiguchi,et al. Voice activity detection by lip shape tracking using EBGM , 2007, ACM Multimedia.
[34] S. Gökhun Tanyer,et al. Voice activity detection in nonstationary noise , 2000, IEEE Trans. Speech Audio Process.
[35] Wei Zhang,et al. A soft voice activity detector based on a Laplacian-Gaussian model , 2003, IEEE Trans. Speech Audio Process.
[36] Bowon Lee,et al. Spectral entropy-based voice activity detector for videoconferencing systems , 2010, INTERSPEECH.
[37] J. Wade Davis,et al. Statistical Pattern Recognition , 2003, Technometrics.
[38] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[39] Anil K. Jain,et al. Unsupervised Learning of Finite Mixture Models , 2002, IEEE Trans. Pattern Anal. Mach. Intell.
[40] Yoram Singer,et al. Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program.
[41] Jacob Scharcanski,et al. Color-based lips extraction applied to voice activity detection , 2011, 2011 18th IEEE International Conference on Image Processing.
[42] Christian Jutten,et al. A study of lip movements during spontaneous dialog and its application to voice activity detection. , 2009, The Journal of the Acoustical Society of America.
[43] Aristodemos Pnevmatikakis,et al. Voice activity detection using audio-visual information , 2009, 2009 16th International Conference on Digital Signal Processing.
[44] Narendra Ahuja,et al. Gaussian mixture model for human skin color and its applications in image and video databases , 1998, Electronic Imaging.
[45] Wonyong Sung,et al. A voice activity detector employing soft decision based noise spectrum adaptation , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).