Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior
暂无分享,去创建一个
Tatsuya Kawahara | Yoshiaki Bando | Kazuyoshi Yoshii | Kouhei Sekiguchi | Aditya Arie Nugraha | Tatsuya Kawahara | Kazuyoshi Yoshii | Yoshiaki Bando | Kouhei Sekiguchi
[1] M. Elad,et al. $rm K$-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation , 2006, IEEE Transactions on Signal Processing.
[2] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Chi-Kwong Li. Geometric Means , 2003 .
[4] Nobutaka Ono,et al. Stable and fast update rules for independent vector analysis based on auxiliary function technique , 2011, 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA).
[5] Alexey Ozerov,et al. Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[6] Radu Horaud,et al. Semi-supervised Multichannel Speech Enhancement with Variational Autoencoders and Non-negative Matrix Factorization , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[7] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[8] Reinhold Häb-Umbach,et al. Neural network based spectral mask estimation for acoustic beamforming , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[9] Antonio Bonafonte,et al. SEGAN: Speech Enhancement Generative Adversarial Network , 2017, INTERSPEECH.
[10] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[11] Takuya Yoshioka,et al. Relaxed disjointness based clustering for joint blind source separation and dereverberation , 2014, 2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC).
[12] Andreas Ziehe,et al. An approach to blind source separation based on temporal structure of speech signals , 2001, Neurocomputing.
[13] Hoon Kim,et al. Monte Carlo Statistical Methods , 2000, Technometrics.
[14] Zhong-Qiu Wang,et al. Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[16] Tatsuya Kawahara,et al. Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Nancy Bertin,et al. Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis , 2009, Neural Computation.
[18] Daniel P. W. Ellis,et al. MIR_EVAL: A Transparent Implementation of Common MIR Metrics , 2014, ISMIR.
[19] Jonathan Le Roux,et al. Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks , 2016, INTERSPEECH.
[20] A. Bruckstein,et al. K-SVD : An Algorithm for Designing of Overcomplete Dictionaries for Sparse Representation , 2005 .
[21] Rémi Gribonval,et al. Performance measurement in blind audio source separation , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Kazuyoshi Yoshii,et al. Correlated Tensor Factorization for Audio Source Separation , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Yu Tsao,et al. Speech enhancement based on deep denoising autoencoder , 2013, INTERSPEECH.
[24] Wen-Haw Chen. A Review of Geometric Mean of Positive Definite Matrices , 2015 .
[25] Tatsuya Kawahara,et al. Bayesian Multichannel Speech Enhancement with a Deep Speech Prior , 2018, 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[26] Tatsuya Kawahara,et al. Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[27] Jon Barker,et al. The third 'CHiME' speech separation and recognition challenge: Analysis and outcomes , 2017, Comput. Speech Lang..
[28] Shinnosuke Takamichi,et al. Independent Deeply Learned Matrix Analysis for Multichannel Audio Source Separation , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[29] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[30] H. Sebastian Seung,et al. Learning the parts of objects by non-negative matrix factorization , 1999, Nature.
[31] Bhiksha Raj,et al. Speech denoising using nonnegative matrix factorization with priors , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[32] DeLiang Wang,et al. Ideal ratio mask estimation using deep neural networks for robust speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[33] Xavier Anguera Miró,et al. Acoustic Beamforming for Speaker Diarization of Meetings , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[34] Pierre Vandergheynst,et al. Nonnegative matrix factorization and spatial covariance model for under-determined reverberant audio source separation , 2010, 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010).
[35] Emmanuel Vincent,et al. Multichannel Audio Source Separation With Deep Neural Networks , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[36] Tatsuya Kawahara,et al. Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[37] Hirokazu Kameoka,et al. Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[38] Yonghong Yan,et al. Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions , 2017, INTERSPEECH.
[39] Tatsuya Kawahara,et al. Independent Low-Rank Tensor Analysis for Audio Source Separation , 2018, 2018 26th European Signal Processing Conference (EUSIPCO).
[40] Hirokazu Kameoka,et al. Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[41] Chris Donahue,et al. Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Radu Horaud,et al. A VARIANCE MODELING FRAMEWORK BASED ON VARIATIONAL AUTOENCODERS FOR SPEECH ENHANCEMENT , 2018, 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP).
[43] Tuomas Virtanen,et al. Multichannel audio separation by direction of arrival based spatial covariance model and non-negative matrix factorization , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Zheng-Hua Tan,et al. Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification , 2017, INTERSPEECH.
[45] B.D. Van Veen,et al. Beamforming: a versatile approach to spatial filtering , 1988, IEEE ASSP Magazine.