Can we Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech?—A Dataset, Insights, and Challenges
[1] Pascal Scalart, et al. Speech enhancement based on a priori signal to noise estimation, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[2] Philipos C. Loizou, et al. Speech Enhancement: Theory and Practice, 2007.
[3] Jon Barker, et al. The second ‘CHiME’ speech separation and recognition challenge: Datasets, tasks and baselines, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[4] David Pearce, et al. The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions, 2000, INTERSPEECH.
[5] Emmanuel Vincent, et al. Subjective and Objective Quality Assessment of Audio Source Separation, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Alexander U. Case. Sound FX: Unlocking the Creative Potential of Recording Studio Effects, 2007.
[7] Yi Hu, et al. Evaluation of Objective Quality Measures for Speech Enhancement, 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Willem Bastiaan Kleijn, et al. Bandwidth expansion of speech based on vector quantization of the mel frequency cepstral coefficients, 1999, 1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351).
[9] Bob Katz, et al. Mastering Audio: The Art and the Science, 2002.
[10] Gautham J. Mysore, et al. Language informed bandwidth expansion, 2012, 2012 IEEE International Workshop on Machine Learning for Signal Processing.
[11] Carl E. Rasmussen, et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.
[12] Bobby Owsinski. The Recording Engineer's Handbook, 2004.
[13] Gautham J. Mysore, et al. Speaker and noise independent voice activity detection, 2013, INTERSPEECH.
[14] Bobby Owsinski. The Mixing Engineer's Handbook, 1999.
[15] Patrick A. Naylor, et al. Speech Dereverberation, 2010.
[16] Paris Smaragdis, et al. Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments, 2012, INTERSPEECH.
[17] Wonyong Sung, et al. A statistical model-based voice activity detection, 1999, IEEE Signal Processing Letters.
[18] Naveen Parihar, et al. Performance analysis of the Aurora large vocabulary baseline system, 2004, 2004 12th European Signal Processing Conference.
[19] Udo Zölzer, et al. Adaptive digital audio effects (a-DAFx): a new class of sound transformations, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[20] J. Scolapio, et al. The art and science, 2003, Nutrition in clinical practice: official publication of the American Society for Parenteral and Enteral Nutrition.
[21] Daniel P. W. Ellis, et al. Speech decoloration based on the product-of-filters model, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] David Malah, et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, 1985, IEEE Trans. Acoust. Speech Signal Process.
[23] Pascal Vincent, et al. Representation Learning: A Review and New Perspectives, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Yariv Ephraim, et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator, 1984, IEEE Trans. Acoust. Speech Signal Process.
[25] Joshua D. Reiss, et al. Digital Dynamic Range Compressor Design: A Tutorial and Analysis, 2012.
[26] Tomohiro Nakatani, et al. The REVERB challenge: A common evaluation framework for dereverberation and recognition of reverberant speech, 2013, 2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
[27] Joshua D. Reiss, et al. Parameter Automation in a Dynamic Range Compressor, 2013.
[28] Tamar Frankel. [The theory and the practice...], 2001, Tijdschrift voor diergeneeskunde.