Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home
Tara N. Sainath | Arun Narayanan | Michiel Bacchiani | Thad Hughes | Kean K. Chin | Chanwoo Kim | Ananya Misra
[1] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .
[2] Richard M. Stern,et al. Physiologically-motivated synchrony-based processing for robust automatic speech recognition , 2006, INTERSPEECH.
[3] Eric A. Lehmann,et al. Reverberation-Time Prediction Method for Room Impulse Responses Simulated with the Image-Source Model , 2007, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[4] John H. L. Hansen,et al. A new perceptually motivated MVDR-based acoustic front-end (PMVDR) for robust automatic speech recognition , 2008, Speech Commun..
[5] Hyung-Min Park,et al. Binaural and Multiple-Microphone Signal Processing Motivated by Auditory Perception , 2008, 2008 Hands-Free Speech Communication and Microphone Arrays.
[6] Richard M. Stern,et al. Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction , 2009, INTERSPEECH.
[7] Richard M. Stern,et al. Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain , 2009, INTERSPEECH.
[8] Richard M. Stern,et al. Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[9] Richard M. Stern,et al. Nonlinear enhancement of onset for robust speech recognition , 2010, INTERSPEECH.
[10] Richard M. Stern,et al. Binaural sound source separation motivated by auditory processing , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[11] Vincent Vanhoucke,et al. Improving the speed of neural networks on CPUs , 2011 .
[12] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition (DOI: 10.1109/MSP.2012.2205597) , 2012 .
[13] Richard M. Stern,et al. Two-microphone source separation algorithm based on statistical modeling of angle distributions , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[14] Yongqiang Wang,et al. An investigation of deep neural networks for noise robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] DeLiang Wang,et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Richard M. Stern,et al. Robust speech recognition using temporal masking and thresholding algorithm , 2014, INTERSPEECH.
[17] Chanwoo Kim,et al. Sound source separation algorithm using phase difference and angle distribution modeling near the target , 2015, INTERSPEECH.
[18] Tara N. Sainath,et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[19] Richard M. Stern,et al. Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[20] Tara N. Sainath,et al. Factored spatial and spectral multichannel raw waveform CLDNNs , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Tara N. Sainath,et al. Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Mitch Weintraub,et al. Acoustic Modeling for Google Home , 2017, INTERSPEECH.
[23] Richard M. Stern,et al. Robust Speech Recognition Based on Binaural Auditory Processing , 2017, INTERSPEECH.
[24] Tara N. Sainath,et al. Raw Multichannel Processing Using Deep Neural Networks , 2017, New Era for Robust Speech Recognition, Exploiting Deep Learning.