Joon Son Chung | Andrew Zisserman | Triantafyllos Afouras
[1] Jae Lim,et al. Signal estimation from modified short-time Fourier transform, 1984.
[2] J L Schwartz,et al. Audio-visual enhancement of speech in noise, 2001, The Journal of the Acoustical Society of America.
[3] John R. Hershey,et al. Audio-Visual Sound Separation Via Hidden Markov Models , 2001, NIPS.
[4] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ) - a new method for speech quality assessment of telephone networks and codecs, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.
[5] Chalapathy Neti,et al. Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization) , 2002, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2002.
[6] Chalapathy Neti,et al. Noisy audio feature enhancement using audio-visual speech data , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[7] Nebojsa Jojic,et al. Audio-visual graphical models for speech processing , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[8] Saeid Sanei,et al. Video assisted speech source separation, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[9] Richard M. Dansereau,et al. Single-Channel Speech Separation Using Soft Mask Filtering , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[10] Bhiksha Raj,et al. Soft Mask Methods for Single-Channel Speaker Separation , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Te-Won Lee,et al. Blind Speech Separation , 2007, Blind Speech Separation.
[12] Ben P. Milner,et al. Using audio-visual features for robust voice activity detection in clean and noisy speech , 2008, 2008 16th European Signal Processing Conference.
[13] DeLiang Wang,et al. A Supervised Learning Approach to Monaural Segregation of Reverberant Speech , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Ben P. Milner,et al. Effective visually-derived Wiener filtering for audio-visual speech processing , 2009, AVSP.
[15] M. A. Anusuya,et al. Speech Recognition by Machine, A Review , 2010, ArXiv.
[16] Emmanuel Vincent,et al. Subjective and Objective Quality Assessment of Audio Source Separation , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[17] Jesper Jensen,et al. An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[18] Josef Kittler,et al. Source Separation of Convolutive and Noisy Mixtures Using Audio-Visual Dictionary Learning and Probabilistic Time-Frequency Masking , 2013, IEEE Transactions on Signal Processing.
[19] Faheem Khan,et al. Speaker separation using visually-derived binary masks , 2013, AVSP.
[20] Jonathon A. Chambers,et al. Audiovisual Speech Source Separation: An overview of key methodologies , 2014, IEEE Signal Processing Magazine.
[21] Pejman Mowlaee,et al. Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[22] Andreas Gaich,et al. On speech intelligibility estimation of phase-aware single-channel speech enhancement , 2015, INTERSPEECH.
[23] Shimon Whiteson,et al. LipNet: End-to-End Sentence-level Lipreading, 2016, arXiv:1611.01599.
[24] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] DeLiang Wang,et al. Complex ratio masking for joint enhancement of magnitude and phase , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Joon Son Chung,et al. Lip Reading in the Wild , 2016, ACCV.
[27] Jian Sun,et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[28] Franz Pernkopf,et al. Phase-Aware Signal Processing for Automatic Speech Recognition , 2016, INTERSPEECH.
[29] Joon Son Chung,et al. Out of Time: Automated Lip Sync in the Wild , 2016, ACCV Workshops.
[30] Yannis Stylianou,et al. Advances in phase-aware signal processing in speech communication, 2016, Speech Commun.
[31] Shimon Whiteson,et al. LipNet: Sentence-level Lipreading , 2016, ArXiv.
[32] Yu Tsao,et al. Complex spectrogram enhancement by convolutional neural network with multi-metrics learning , 2017, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP).
[33] Shmuel Peleg,et al. Improved Speech Reconstruction from Silent Video , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[34] Themos Stafylakis,et al. Combining Residual Networks with LSTMs for Lipreading , 2017, INTERSPEECH.
[35] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Ira Kemelmacher-Shlizerman,et al. Synthesizing Obama, 2017, ACM Trans. Graph.
[37] Michael Gref,et al. On the Influence of Modifying Magnitude and Phase Spectrum to Enhance Noisy Speech Signals , 2017, INTERSPEECH.
[38] Joon Son Chung,et al. Lip Reading in Profile , 2017, BMVC.
[39] Yu Tsao,et al. Multi-Metrics Learning for Speech Enhancement , 2017, ArXiv.
[40] Yu Tsao,et al. Audio-Visual Speech Enhancement based on Multimodal Deep Convolutional Neural Network , 2017, ArXiv.
[41] Garrett T. Kenyon,et al. Does Phase Matter For Monaural Source Separation? , 2017, ArXiv.
[42] Joon Son Chung,et al. You said that? , 2017, BMVC.
[43] Joon Son Chung,et al. Lip Reading Sentences in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[44] Shmuel Peleg,et al. Visual Speech Enhancement using Noise-Invariant Training , 2017, ArXiv.
[45] Andrew Owens,et al. Audio-Visual Scene Analysis with Self-Supervised Multisensory Features , 2018, ECCV.
[46] Kevin Wilson,et al. Looking to listen at the cocktail party, 2018, ACM Trans. Graph.
[47] Maja Pantic,et al. End-to-End Audiovisual Speech Recognition , 2018, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[48] Joon Son Chung,et al. VoxCeleb2: Deep Speaker Recognition , 2018, INTERSPEECH.
[49] DeLiang Wang,et al. Supervised Speech Separation Based on Deep Learning: An Overview , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[50] Yu Tsao,et al. Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks , 2017, IEEE Transactions on Emerging Topics in Computational Intelligence.
[51] Shmuel Peleg,et al. Seeing Through Noise: Visually Driven Speaker Separation And Enhancement , 2017, ICASSP.
[52] Shmuel Peleg,et al. Visual Speech Enhancement , 2017, INTERSPEECH.