Speech Enhancement via Attention Masking Network (SEAMNET): An End-to-End System for Joint Suppression of Noise and Reverberation
暂无分享,去创建一个
[1] Abeer Alwan,et al. Log-spectral amplitude estimation with Generalized Gamma distributions for speech enhancement , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Tao Zhang,et al. Learning Spectral Mapping for Speech Dereverberation and Denoising , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Israel Cohen,et al. Speech enhancement using a noncausal a priori SNR estimator , 2004, IEEE Signal Processing Letters.
[4] Björn W. Schuller,et al. Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR , 2015, LVA/ICA.
[5] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[6] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[7] Jürgen Schmidhuber,et al. Deep learning in neural networks: An overview , 2014, Neural Networks.
[8] Nima Mesgarani,et al. Real-time Single-channel Dereverberation and Separation with Time-domain Audio Separation Network , 2018, INTERSPEECH.
[9] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[10] Rainer Martin,et al. SPEECH ENHANCEMENT IN THE DFT DOMAIN USING LAPLACIAN SPEECH PRIORS , 2003 .
[11] A. Alwan,et al. A Unified Framework for Designing Optimal STSA Estimators Assuming Maximum Likelihood Phase Equivalence of Speech and Noise , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[12] N. Jayant. Digital coding of speech waveforms: PCM, DPCM, and DM quantizers , 1974 .
[13] Daniel Povey,et al. MUSAN: A Music, Speech, and Noise Corpus , 2015, ArXiv.
[14] Jae S. Lim,et al. The unimportance of phase in speech enhancement , 1982 .
[15] Tillman Weyde,et al. Improved Speech Enhancement with the Wave-U-Net , 2018, ArXiv.
[16] Yonghong Yan,et al. Improving generative adversarial networks for speech enhancement through regularization of latent representations , 2020, Speech Commun..
[17] I. Cohen. Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator , 2002, IEEE Signal Processing Letters.
[18] Jun Du,et al. An Experimental Study on Speech Enhancement Based on Deep Neural Networks , 2014, IEEE Signal Processing Letters.
[19] Israel Cohen,et al. Simultaneous Detection and Estimation Approach for Speech Enhancement , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[20] Eric Plourde,et al. Auditory-Based Spectral Amplitude Estimators for Speech Enhancement , 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[22] Yu Tsao,et al. Learning With Learned Loss Function: Speech Enhancement With Quality-Net to Improve Perceptual Evaluation of Speech Quality , 2019, IEEE Signal Processing Letters.
[23] Jesper Jensen,et al. On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement , 2020, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Yi Hu,et al. Subjective comparison and evaluation of speech enhancement algorithms , 2007, Speech Commun..
[25] R. McAulay,et al. Speech enhancement using a soft-decision noise suppression filter , 1980 .
[26] Emanuel A. P. Habets,et al. Late Reverberant Spectral Variance Estimation Based on a Statistical Model , 2009, IEEE Signal Processing Letters.
[27] John H. L. Hansen,et al. Blind Spectral Weighting for Robust Speaker Identification under Reverberation Mismatch , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Yoshua Bengio,et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches , 2014, SSST@EMNLP.
[29] S. Ciochina,et al. Speech enhancement using spectral over-subtraction and residual noise reduction , 2003, Signals, Circuits and Systems, 2003. SCS 2003. International Symposium on.
[30] Martin Vetterli,et al. Wavelets and filter banks: theory and design , 1992, IEEE Trans. Signal Process..
[31] Tran Huy Dat,et al. Generalized gamma modeling of speech and its online estimation for speech enhancement , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[32] Olivier Cappé,et al. Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor , 1994, IEEE Trans. Speech Audio Process..
[33] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[34] Tomohiro Nakatani,et al. Suppression of Late Reverberation Effect on Speech Signal Using Long-Term Multiple-step Linear Prediction , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[35] Jesper Jensen,et al. Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[36] Junichi Yamagishi,et al. Speech Enhancement of Noisy and Reverberant Speech for Text-to-Speech , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[37] Philipos C. Loizou,et al. Speech enhancement based on perceptually motivated bayesian estimators of the magnitude spectrum , 2005, IEEE Transactions on Speech and Audio Processing.
[38] Vladlen Koltun,et al. Speech Denoising with Deep Feature Losses , 2018, INTERSPEECH.
[39] Ulpu Remes,et al. Techniques for Noise Robustness in Automatic Speech Recognition , 2012 .
[40] Enhua Wu,et al. Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[41] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Emmanuel Vincent,et al. A French Corpus for Distant-Microphone Speech Processing in Real Homes , 2016, INTERSPEECH.
[43] Bin Chen,et al. A Laplacian-based MMSE estimator for speech enhancement , 2007, Speech Commun..
[44] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[45] Paris Smaragdis,et al. End-To-End Source Separation With Adaptive Front-Ends , 2017, 2018 52nd Asilomar Conference on Signals, Systems, and Computers.
[46] Jesper Jensen,et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[47] David A. van Leeuwen,et al. The effect of noise on modern automatic speaker recognition systems , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] Ephraim. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .
[49] Pascal Scalart,et al. Speech enhancement based on a priori signal to noise estimation , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[50] Rainer Martin,et al. Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..
[51] A. Zekveld,et al. Cognitive Load During Speech Perception in Noise: The Influence of Age, Hearing Loss, and Cognition on the Pupil Response , 2011, Ear and hearing.
[52] Bengt J. Borgstrom,et al. The linear prediction inverse modulation transfer function (LP-IMTF) filter for spectral enhancement, with applications to speaker recognition , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[53] Olli Viikki,et al. Cepstral domain segmental feature vector normalization for noise robust speech recognition , 1998, Speech Commun..
[54] Yu Tsao,et al. End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[55] Björn W. Schuller,et al. Discriminatively trained recurrent neural networks for single-channel speech separation , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[56] Rainer Martin,et al. Cepstral Smoothing of Spectral Filter Gains for Speech Enhancement Without Musical Noise , 2007, IEEE Signal Processing Letters.
[57] Philipos C. Loizou,et al. Reasons why Current Speech-Enhancement Algorithms do not Improve Speech Intelligibility and Suggested Solutions , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[58] DeLiang Wang,et al. A New Framework for Supervised Speech Enhancement in the Time Domain , 2018, INTERSPEECH.
[59] Umut Isik,et al. Attention Wave-U-Net for Speech Enhancement , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[60] Sanjeev Khudanpur,et al. X-Vectors: Robust DNN Embeddings for Speaker Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[61] Tao Zhang,et al. Perceptually Guided Speech Enhancement Using Deep Neural Networks , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[62] Junichi Yamagishi,et al. Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech , 2016, SSW.
[63] Pejman Mowlaee,et al. Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[64] Tim Fingscheidt,et al. A Perceptual Weighting Filter Loss for DNN Training In Speech Enhancement , 2019, 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[65] Robert B. Dunn,et al. Improving Statistical Model-Based Speech Enhancement with Deep Neural Networks , 2018, 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC).
[66] David Malah,et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator , 1984, IEEE Trans. Acoust. Speech Signal Process..
[67] Nima Mesgarani,et al. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[68] Israel Cohen,et al. Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging , 2003, IEEE Trans. Speech Audio Process..
[69] Li-Rong Dai,et al. A Regression Approach to Speech Enhancement Based on Deep Neural Networks , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.