DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech
Chin-Hui Lee | Yong Xu | Zhen Huang | Kehuang Li
[1] Steve J. Young,et al. Partially observable Markov decision processes for spoken dialog systems , 2007, Comput. Speech Lang..
[2] Masashi Unoki,et al. Robust voice activity detection based on concept of modulation transfer function in noisy reverberant environments , 2014, ISCSLP.
[3] Qin Yan,et al. Speech Bandwidth Extension: Extrapolations of Spectral Envelop and Harmonicity Quality of Excitation , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[4] Rich Caruana,et al. Multitask Learning: A Knowledge-Based Source of Inductive Bias , 1993, ICML.
[5] Jaap C. Haartsen,et al. The Bluetooth radio system , 2000, IEEE Personal Communications.
[6] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[7] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[8] Schuyler Quackenbush,et al. Objective measures of speech quality , 1995 .
[9] Roberto Pieraccini,et al. Where do we go from here? Research and Commercial Spoken Dialog Systems , 2005, SIGDIAL.
[10] B. Shneiderman,et al. Designing the User Interface: Strategies for Effective Human-Computer Interaction , 1998 .
[11] Mark A. Clements,et al. Sparse probabilistic state mapping and its application to speech bandwidth expansion , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[12] Hyung Soon Kim,et al. Narrowband to wideband conversion of speech using GMM based transformation , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[13] S. Joy Mountford,et al. The Art of Human-Computer Interface Design , 1990 .
[14] Victor Zue,et al. JUPITER: a telephone-based conversational interface for weather information , 2000, IEEE Trans. Speech Audio Process..
[15] Gerhard Schmidt,et al. Neural networks versus codebooks in an application for bandwidth extension of speech signals , 2003, INTERSPEECH.
[16] Geun-Bae Song,et al. A study of HMM-based bandwidth extension of speech signals , 2009, Signal Process..
[17] C. Marvin. When Old Technologies Were New , 2010 .
[18] Gautham J. Mysore,et al. Language informed bandwidth expansion , 2012, 2012 IEEE International Workshop on Machine Learning for Signal Processing.
[19] Juan Manuel Górriz,et al. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness , 2007 .
[20] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Chin-Hui Lee,et al. A deep neural network approach to speech bandwidth expansion , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Mei-Yuh Hwang,et al. Shared-distribution hidden Markov models for speech recognition , 1993, IEEE Trans. Speech Audio Process..
[23] Jacob Benesty,et al. Spectral Enhancement Methods , 2009 .
[24] L. R. Rabiner,et al. An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition , 1983, The Bell System Technical Journal.
[25] Kuldip K. Paliwal,et al. Automatic Speech and Speaker Recognition: Advanced Topics , 1999 .
[26] Frank K. Soong,et al. A maximum a Posteriori-based reconstruction approach to speech bandwidth expansion in noise , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] Yoshihisa Nakatoh,et al. Generation of broadband speech from narrowband speech based on linear mapping , 2002 .
[28] Ohad Shamir,et al. Optimal Distributed Online Prediction , 2011, ICML.
[29] Biing-Hwang Juang,et al. Recurrent deep neural networks for robust speech recognition , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Olli Viikki,et al. Cepstral domain segmental feature vector normalization for noise robust speech recognition , 1998, Speech Commun..
[31] Julien Epps,et al. A new technique for wideband enhancement of coded narrowband speech , 1999, 1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351).
[32] Gerhard Schmidt,et al. Bandwidth Extension of Telephony Speech , 2008 .
[33] Geoffrey E. Hinton,et al. A Learning Algorithm for Boltzmann Machines , 1985, Cogn. Sci..
[34] Vladimir Pavlovic,et al. Toward multimodal human-computer interface , 1998, Proc. IEEE.
[35] Hermann Ney,et al. Computing Mel-frequency cepstral coefficients on the power spectrum , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[36] Geoffrey E. Hinton. A Practical Guide to Training Restricted Boltzmann Machines , 2012, Neural Networks: Tricks of the Trade.
[37] D. M. Allen. Mean Square Error of Prediction as a Criterion for Selecting Variables , 1971 .
[38] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.