Dilated convolutional recurrent neural network for monaural speech enhancement
暂无分享,去创建一个
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Herman J. M. Steeneken,et al. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems , 1993, Speech Commun..
[3] Shadi Pirhosseinloo,et al. A new feature set for masking-based monaural speech separation , 2018, 2018 52nd Asilomar Conference on Signals, Systems, and Computers.
[4] Kostas Kokkinakis,et al. An Interaural Magnification Algorithm for Enhancement of Naturally-Occurring Level Differences , 2016, INTERSPEECH.
[5] DeLiang Wang,et al. Long short-term memory for speaker generalization in supervised speech separation. , 2017, The Journal of the Acoustical Society of America.
[6] DeLiang Wang,et al. A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement , 2018, INTERSPEECH.
[7] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[8] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[9] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[10] Sh Pirhosseinloo. A Combination of Maximum Likelihood Bayesian Framework and Discriminative Linear Transforms for Speaker Adaptation , 2012 .
[11] Tim Brookes,et al. Dynamic Precedence Effect Modeling for Source Separation in Reverberant Environments , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Kostas Kokkinakis,et al. Time-Frequency Masking for Blind Source Separation with Preserved Spatial Cues , 2017, INTERSPEECH.
[13] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[14] DeLiang Wang,et al. On Training Targets for Supervised Speech Separation , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[15] DeLiang Wang,et al. On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis , 2005, Speech Separation by Humans and Machines.
[16] Carla Teixeira Lopes,et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus , 2012 .
[17] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[18] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[19] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[20] DeLiang Wang,et al. Gated Residual Networks With Dilated Convolutions for Monaural Speech Enhancement , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Nobutaka Ito,et al. The Diverse Environments Multi-channel Acoustic Noise Database (DEMAND): A database of multichannel environmental noise recordings , 2013 .
[22] Shadi Pirhosseinloo,et al. Discriminative speaker adaptation in Persian continuous speech recognition systems , 2012 .
[23] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[24] Shadi Pirhosseinloo,et al. Monaural Speech Enhancement with Dilated Convolutions , 2019, INTERSPEECH.
[25] Jont B. Allen,et al. Image method for efficiently simulating small‐room acoustics , 1976 .
[26] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[27] DeLiang Wang,et al. Learning spectral mapping for speech dereverberation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).