Using Deep Speech Recognition to Evaluate Speech Enhancement Methods
Ravi P. Ramachandran | Ghulam Rasool | Nidhal C. Bouaynaya | Shamoon Siddiqui
[1] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[2] John J. Godfrey,et al. SWITCHBOARD: telephone speech corpus for research and development , 1992, ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[3] A.V. Oppenheim,et al. Enhancement and bandwidth compression of noisy speech , 1979, Proceedings of the IEEE.
[4] Stefan Winkler,et al. Mean opinion score (MOS) revisited: methods and applications, limitations and alternatives , 2016, Multimedia Systems.
[5] Jorge Herbert de Lira,et al. Two-Dimensional Signal and Image Processing , 1989 .
[7] Angel Manuel Gomez,et al. A Deep Learning Loss Function Based on the Perceptual Evaluation of the Speech Quality , 2018, IEEE Signal Processing Letters.
[8] Erich Elsen,et al. Deep Speech: Scaling up end-to-end speech recognition , 2014, ArXiv.
[9] Ravi P. Ramachandran,et al. Blind Signal-to-Noise Ratio Estimation of Speech Based on Vector Quantizer Classifiers and Decision Level Fusion , 2017, J. Signal Process. Syst..
[10] Timo Aila,et al. A Style-Based Generator Architecture for Generative Adversarial Networks , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Jyh-Shing Roger Jang,et al. SVSGAN: Singing Voice Separation Via Generative Adversarial Network , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] David Miller,et al. The Fisher Corpus: a Resource for the Next Generations of Speech-to-Text , 2004, LREC.
[13] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[14] Roberto Togneri,et al. A Primer on Deep Learning Architectures and Applications in Speech Processing , 2019, Circuits, Systems, and Signal Processing.
[15] Amos J. Storkey,et al. Data Augmentation Generative Adversarial Networks , 2017, ICLR 2018.
[16] Sanjeev Khudanpur,et al. Librispeech: An ASR corpus based on public domain audio books , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Weiping Zhu,et al. Recent Developments in Speech Enhancement in the Short-Time Fourier Transform Domain , 2016, IEEE Circuits and Systems Magazine.
[18] Chris Donahue,et al. Adversarial Audio Synthesis , 2018, ICLR.
[19] Paris Smaragdis,et al. Generative Adversarial Source Separation , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[20] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[21] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[22] Philipos C. Loizou,et al. Speech Enhancement: Theory and Practice , 2007 .
[23] Kumar Krishna Agrawal,et al. GANSynth: Adversarial Neural Audio Synthesis , 2019, ICLR.
[24] Yi Hu,et al. Subjective Comparison of Speech Enhancement Algorithms , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[25] Harris Drucker. Speech processing in a high ambient noise environment , 1967 .
[26] W. B. Kleijn,et al. Speech Enhancement with Variance Constrained Autoencoders , 2019, INTERSPEECH.
[27] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[28] Chong Wang,et al. Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin , 2015, ICML.
[29] Vladimir I. Levenshtein,et al. Binary codes capable of correcting deletions, insertions, and reversals , 1965 .
[30] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[31] Yi Hu,et al. Evaluation of Objective Quality Measures for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[32] M. A. A. El-Fattah,et al. Speech Enhancement Using an Adaptive Wiener Filtering Approach , 2008 .
[33] H.G. De Meer,et al. Utility curves: mean opinion scores considered biased , 1999, 1999 Seventh International Workshop on Quality of Service (IWQoS'99).
[34] Herman J. M. Steeneken,et al. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun..
[35] Radu Horaud,et al. Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Xavier Serra,et al. A Wavenet for Speech Denoising , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Björn W. Schuller,et al. Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement , 2009, EURASIP J. Audio Speech Music. Process..
[38] Norbert Wiener,et al. Extrapolation, Interpolation, and Smoothing of Stationary Time Series , 1964 .
[39] L.L. Beranek,et al. The Design of Speech Communication Systems , 1947, Proceedings of the IRE.