暂无分享,去创建一个
Xiong Xiao | Hakan Erdogan | Takuya Yoshioka | Zhuo Chen | Fil Alleva | Hakan Erdogan | F. Alleva | Zhuo Chen | Xiong Xiao | Takuya Yoshioka
[1] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Geoffrey Zweig,et al. Achieving Human Parity in Conversational Speech Recognition , 2016, ArXiv.
[3] Mitch Weintraub,et al. Acoustic Modeling for Google Home , 2017, INTERSPEECH.
[4] Steve Renals,et al. Distant Speech Recognition Experiments Using the AMI Corpus , 2017, New Era for Robust Speech Recognition, Exploiting Deep Learning.
[5] Tomohiro Nakatani,et al. Learning speaker representation for neural network based multichannel speaker extraction , 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[6] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[7] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .
[8] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Jinyu Li,et al. Progressive Joint Modeling in Unsupervised Single-Channel Overlapped Speech Recognition , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Reinhold Häb-Umbach,et al. Beamnet: End-to-end training of a beamformer-supported multi-channel ASR system , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] David Suendermann-Oeft,et al. Medical Speech Recognition: Reaching Parity with Humans , 2017, SPECOM.
[12] Chengzhu Yu,et al. The NTT CHiME-3 system: Advances in speech enhancement and recognition for mobile multi-microphone devices , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[13] Geoffrey Zweig,et al. Deep bi-directional recurrent networks over spectral windows , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[14] Hakan Erdogan,et al. Multi-Microphone Neural Speech Separation for Far-Field Multi-Talker Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Jonathan G. Fiscus,et al. Multiple Dimension Levenshtein Edit Distance Calculations for Evaluating Automatic Speech Recognition Systems During Simultaneous Speech , 2006, LREC.
[16] Reinhold Häb-Umbach,et al. BLSTM supported GEV beamformer front-end for the 3RD CHiME challenge , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[17] Dong Yu,et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs , 2014, INTERSPEECH.
[18] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[19] Reinhold Häb-Umbach,et al. Tight Integration of Spatial and Spectral Features for BSS with Deep Clustering Embeddings , 2017, INTERSPEECH.
[20] Lukás Burget,et al. Transcribing Meetings With the AMIDA Systems , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Takuya Yoshioka,et al. Exploring Practical Aspects of Neural Mask-Based Beamforming for Far-Field Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Jonathan Le Roux,et al. Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks , 2016, INTERSPEECH.
[23] Elizabeth Shriberg,et al. Analysis of overlaps in meetings by dialog factors, hot spots, speakers, and collection site: insights for automatic speech recognition , 2006, INTERSPEECH.
[24] Jonathan G. Fiscus,et al. The Rich Transcription 2005 Spring Meeting Recognition Evaluation , 2005, MLMI.
[25] Dong Yu,et al. Multitalker Speech Separation With Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Navdeep Jaitly,et al. Speech recognition for medical conversations , 2017, INTERSPEECH.
[27] DeLiang Wang,et al. Binaural Reverberant Speech Separation Based on Deep Neural Networks , 2017, INTERSPEECH.
[28] Jacob Benesty,et al. On Optimal Frequency-Domain Multichannel Linear Filtering for Noise Reduction , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Masakiyo Fujimoto,et al. Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Tomohiro Nakatani,et al. Generalization of Multi-Channel Linear Prediction Methods for Blind MIMO Impulse Response Shortening , 2012, IEEE Transactions on Audio, Speech, and Language Processing.