Paper information

Single-Channel Multispeaker Separation with Variational Autoencoder Spectrogram Model

Hirokazu Kameoka | Shoji Makino | Li Li | Shogo Seki | Naoya Murashima

[1] Hirokazu Kameoka,et al. Supervised Determined Source Separation with Multichannel Variational Autoencoder , 2019, Neural Computation.

[2] Jonathan Le Roux,et al. Discriminative NMF and its application to single-channel source separation , 2014, INTERSPEECH.

[3] Shinnosuke Takamichi,et al. Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[4] Hirokazu Kameoka,et al. Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation , 2010, LVA/ICA.

[5] Rémi Gribonval,et al. Underdetermined Instantaneous Audio Source Separation via Local Gaussian Modeling , 2009, ICA.

[6] Li Li,et al. Fast MVAE: Joint Separation and Classification of Mixed Sources Based on Multichannel Variational Autoencoder with Auxiliary Classifier , 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[7] Alan W. Black,et al. The CMU Arctic speech databases , 2004, SSW.

[8] J. Cardoso,et al. Maximum likelihood approach for blind audio source separation using time-frequency Gaussian source models , 2005, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2005.

[9] Zhuo Chen,et al. Deep clustering: Discriminative embeddings for segmentation and separation , 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[11] Radu Horaud,et al. Semi-supervised Multichannel Speech Enhancement with Variational Autoencoders and Non-negative Matrix Factorization , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[12] H. Sebastian Seung,et al. Algorithms for Non-negative Matrix Factorization , 2000, NIPS.

[13] Li Li,et al. Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation , 2019, 2019 27th European Signal Processing Conference (EUSIPCO).

[14] Emmanuel Vincent,et al. Multichannel Audio Source Separation With Deep Neural Networks , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[15] Jonathan Le Roux,et al. SDR – Half-baked or Well Done? , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[16] DeLiang Wang,et al. Divide and Conquer: A Deep CASA Approach to Talker-Independent Monaural Speaker Separation , 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[17] John R. Hershey,et al. Phasebook and Friends: Leveraging Discrete Representations for Source Separation , 2018, IEEE Journal of Selected Topics in Signal Processing.

[18] Radu Horaud,et al. A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement , 2018, 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP).

[19] Bhiksha Raj,et al. Supervised and Semi-supervised Separation of Sounds from Single-Channel Mixtures , 2007, ICA.

[20] Tatsuya Kawahara,et al. Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[21] Hirokazu Kameoka,et al. Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[22] Max Welling,et al. Semi-supervised Learning with Deep Generative Models , 2014, NIPS.