Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition
Tomohiro Nakatani | Shoko Araki | Masahiro Okada | Atsunori Ogawa | Takuya Higuchi
[1] S. Furui, et al. A Japanese National Project on Spontaneous Speech Corpus and Processing Technology, 2003.
[2] Futoshi Asano, et al. Detection and Separation of Speech Events in Meeting Recordings Using a Microphone Array, 2007, EURASIP J. Audio Speech Music. Process.
[3] Gökhan Tür, et al. The CALO meeting speech recognition and understanding system, 2008, 2008 IEEE Spoken Language Technology Workshop.
[4] Chengzhu Yu, et al. The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices, 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[5] Alex Waibel, et al. Meeting Browser: Tracking and Summarizing Meetings, 2007.
[6] Takuya Yoshioka, et al. Robust MVDR beamforming using time-frequency masks for online/offline ASR in noise, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Andreas Stolcke, et al. The ICSI Meeting Project: Resources and Research, 2004.
[8] Tomohiro Nakatani, et al. Is speech enhancement pre-processing still relevant when using deep neural networks for acoustic modeling?, 2013, INTERSPEECH.
[9] Rémi Gribonval, et al. Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Thomas Hain, et al. Recognition and understanding of meetings: the AMI and AMIDA projects, 2007, 2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU).
[11] Mike Flynn, et al. Browsing Recorded Meetings with Ferret, 2004, MLMI.
[12] Takuya Yoshioka, et al. Relaxed disjointness based clustering for joint blind source separation and dereverberation, 2014, 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC).
[13] Masahito Togami, et al. Optimized Speech Dereverberation From Probabilistic Perspective for Time Varying Acoustic Transfer Function, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Hermann Ney, et al. Improved backing-off for M-gram language modeling, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[15] Tomohiro Nakatani, et al. Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Masakiyo Fujimoto, et al. Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Marijn Huijbregts, et al. The ICSI RT07s Speaker Diarization System, 2007, CLEAR.