基於聽覺感知模型之類神經網路及其在語者識別上之應用 (Two-stage Attentional Auditory Model Inspired Neural Network and Its Application to Speaker Identification) [In Chinese]
Tai-Shih Chi | Yuan-Fu Liao | Yu-Wen Lo
[1] Zhong-Qiu Wang, et al. Robust speech recognition from ratio masks, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Mounya Elhilali, et al. Frequency-Modulation Encoding in the Primary Auditory Cortex of the Awake Owl Monkey, 2001.
[3] Mounya Elhilali, et al. A spectro-temporal modulation index (STMI) for assessment of speech intelligibility, 2003, Speech Commun.
[4] Powen Ru, et al. Multiresolution spectrotemporal analysis of complex sounds, 2005, The Journal of the Acoustical Society of America.
[5] R. Fay, et al. Auditory perception of sound sources, 2007.
[6] Frederick Z. Yen, et al. Singing Voice Separation Using Spectro-Temporal Modulation Features, 2014, ISMIR.
[7] Jagannath H. Nirmal, et al. A unique approach in text independent speaker recognition using MFCC feature sets and probabilistic neural network, 2015, 2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR).
[8] Tai-Shih Chi, et al. Multiband analysis and synthesis of spectro-temporal modulations of Fourier spectrogram, 2011, The Journal of the Acoustical Society of America.
[9] Jean-Luc Schwartz, et al. An information theoretical investigation into the distribution of phonetic information across the auditory spectrogram, 1993, Comput. Speech Lang.
[10] DeLiang Wang, et al. Complex Ratio Masking for Monaural Speech Separation, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[12] B C Moore, et al. Perceptual consequences of cochlear hearing loss and their implications for the design of hearing aids, 1996, Ear and hearing.
[13] Yi Wang, et al. Speaker recognition based on MFCC and BP neural networks, 2017, 2017 28th Irish Signals and Systems Conference (ISSC).
[14] Ying Zhang, et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks, 2016, INTERSPEECH.
[15] Johan Schalkwyk, et al. Learning acoustic frame labeling for speech recognition with recurrent neural networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Jürgen Schmidhuber, et al. Stacked Convolutional Auto-Encoders for Hierarchical Feature Extraction, 2011, ICANN.
[17] L. Humes, et al. Speech-recognition difficulties of the hearing-impaired elderly: the contributions of audibility, 1990, Journal of speech and hearing research.
[18] Yi-Cheng Chen, et al. Spectro-temporal modulation based singing detection combined with pitch-based grouping for singing voice separation, 2013, INTERSPEECH.
[19] DeLiang Wang, et al. Deep neural networks for cochannel speaker identification, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Wei Dai, et al. Very deep convolutional neural networks for raw waveforms, 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[21] Tara N. Sainath, et al. Learning the speech front-end with raw waveform CLDNNs, 2015, INTERSPEECH.
[22] Brian C J Moore, et al. Effect of enhancement of spectral changes on speech intelligibility and clarity preferences for the hearing impaired, 2012, The Journal of the Acoustical Society of America.
[23] Zhong-Qiu Wang, et al. A Joint Training Framework for Robust Automatic Speech Recognition, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] S. David, et al. Auditory attention: focusing the searchlight on sound, 2007.
[25] Ron J. Weiss, et al. Speech acoustic modeling from raw multichannel waveforms, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Gerald Penn, et al. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Tai-Shih Chi, et al. Spectro-temporal modulation energy based mask for robust speaker identification, 2012, The Journal of the Acoustical Society of America.
[28] Tai-Shih Chi, et al. Spectro-temporal modulations for robust speech emotion recognition, 2010, INTERSPEECH.
[29] Tara N. Sainath, et al. Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).